Google Books, A Library of Babel


At the front of each book it has scanned, Google proudly proclaims, “This is a digital copy of a book…carefully scanned by Google as part of a project to make the world’s books discoverable online.” Despite the controversy over the settlement, most people will agree that Google’s scanning of public domain works provides an undeniable bounty and increases the store of accessible knowledge. But it’s also undeniable that Google’s scans are full of little errors, the product of the software that has to interpret the page images and turn them into digital text.

At some level, we’ve got to allow for a margin of error, given how many hundreds of thousands of books Google has scanned. Still, if this is the library we’re left with when all the books dissolve to dust, we might be in trouble. It’s a little like the books in Borges’ story ‘The Library of Babel,’ in which “every copy is unique, irreplaceable, but (since the Library is total) there are always several hundred thousand imperfect facsimiles: works which differ only in a letter or a comma.” Google’s library is likewise full of little errors. At best, they’re really annoying; at worst, they misrepresent and distort the originals.

For instance, it seems doubtful that Zarathustra spake thus: “Full^is_£arth of superfluous ones….” There was no caret, underscore, or pound sign in the translation that was scanned.

Never mind that all the line breaks are terribly screwed up; Keats certainly didn’t mean to end “Ode on a Grecian Urn” with a random reference to “five degrees”: “Ye know on earth, and all ye need to know. 5°” In the book Google scanned, there was an annotation marking line fifty of the poem, but the scanning software turned the zero into a degree symbol.

Similarly, in a scan of ‘Grimm’s Fairy Tales,’ you’ll find this line and many, many like it: “Grandmother will be glad to have a 10 nosegay,” she thought. Again, a line number from the margin winds up in the middle of a sentence thanks to the scanning software.

So what do you think? Have you noticed these errors in the Google Books you’ve read? Do you think it’s a real problem or something that might as well be ignored?