Google Book-Scanning Debate Heats Up -- Redmondmag.com

By The Associated Press
12/20/2006

Already facing a legal challenge for alleged copyright infringement, Google Inc.'s crusade to build a digital library has triggered a philosophical debate with an alternative project promising better online access to the world's books, art and historical documents.

The latest tensions revolve around Google's insistence on chaining the digital content to its Internet-leading search engine and the nine major libraries that have aligned themselves with the Mountain View-based company.

A splinter group called the Open Content Alliance favors a less restrictive approach to prevent mankind's accumulated knowledge from being controlled by a commercial entity, even if it's a company like Google that has embraced "Don't Be Evil" as its creed.

"You are talking about the fruits of our civilization and culture. You want to keep it open and certainly don't want any company to enclose it," said Doron Weber, program director of public understanding of science and technology for the Alfred P. Sloan Foundation.

The New York-based foundation on Wednesday will announce a $1 million grant to the Internet Archive, a leader in the Open Content Alliance, to help pay for digital copies of collections owned by the Boston Public Library, the Getty Research Institute and the Metropolitan Museum of Art.

The works to be scanned include the personal library of John Adams, the nation's second president, and thousands of images from the Metropolitan Museum.

The Sloan grant also will be used to scan a collection of anti-slavery material provided by the Johns Hopkins University Libraries and documents about the Gold Rush from a library at the University of California at Berkeley.

The deal represents a coup for Internet Archive founder Brewster Kahle, a strident critic of the controls that Google has imposed on its book-scanning initiative.

"They don't want the books to appear in anyone else's search engine but their own, which is a little peculiar for a company that says its mission is to make information universally accessible," Kahle said.

Google's restrictions on its digital book copies stem in part from the company's decision to scan copyrighted material without explicit permission. Google wants to ensure only small excerpts from the copyrighted material appear online -- snippets that the company believes fall under "fair use" protections of U.S. law.

A group of authors and publishers nevertheless has sued Google for copyright infringement in a year-old case that is slowly wending its way through federal court.

In contrast, the Open Content Alliance won't scan copyrighted content unless it receives the permission of the copyright owner. Most of the roughly 100,000 books that the alliance has scanned so far are works whose copyrights have expired.

Google hasn't said how many digital copies it has made since announcing its ambitious project two years ago. The company will only acknowledge that it is scanning more than 3,000 books per day -- a rate that translates into more than 1 million annually. Google also is footing a bill expected to exceed $100 million to make the digital copies -- a commitment that appeals to many libraries.

The non-copyrighted material in Google's search engine can be downloaded and printed out -- a feature that the company believes mirrors the goals of the Open Content Alliance.

Although the Open Content Alliance depends on the Internet Archive to host its digital copies, other search engines are being encouraged to index the material too.

Both Yahoo Inc. and Microsoft Corp., which run the two largest search engines behind Google, belong to the alliance. The group has more than 60 members, consisting mostly of libraries and universities.

None of Google's contracts prevent participating libraries from making separate scanning arrangements with other organizations, said company spokeswoman Megan Lamb.

"We encourage the digitization of more books by more organizations," Lamb said. "It's good for readers, publishers, authors and libraries."

The motives behind Google's own book-scanning initiative aren't entirely altruistic. The company wants to stock its search engine with unique material to give people more reasons to visit its Web site, the hub of an advertising network that generated most of its $2 billion profit through the first nine months of this year.

Despite its ongoing support for the Open Content Alliance, Microsoft earlier this month launched a book-scanning project to compete with Google. Like Google, Microsoft won't allow its digital copies to be indexed by other search engines.

While Kahle says he was disappointed by Microsoft's recent move, he remains more worried about Google's book-scanning initiative because it has gathered so much attention and support.

All but one of the libraries contributing content to Google so far are part of universities. They are: Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, and Complutense of Madrid. The New York Public Library also is relying on Google to scan some of its books.

The University of California, which also belongs to the Open Content Alliance, has no regrets about allowing Google to scan at least 2.5 million of the books in its libraries. "We felt like we could get more from being a partner with Google than by not being a partner," said university spokeswoman Jennifer Colvin.

But some of the participating libraries may have second thoughts if Google's system isn't set up to recognize some of their digital copies, said Gregory Crane, a Tufts University professor who is currently studying the difficulty of accessing some digital content.

For instance, Tufts worries Google's optical reader won't recognize some books written in classical Greek. If the same problem were to crop up with a digital book in the Open Content Alliance, Crane thinks it would be more easily addressed because the group is allowing outside access to the material.

Google "may end up aiming for the lowest common denominator and not be able to do anything really deep" with the digital books, Crane said.
