Dan Cohen’s Digital Humanities Blog » Blog Archive

Is Google Good for History?

[These are my prepared remarks for a talk I gave at the American Historical Association Annual Meeting, on January 7, 2010, in San Diego. The panel was entitled "Is Google Good for History?" and also featured talks by Paul Duguid of the University of California, Berkeley and Brandon Badger of Google Books. Given my propensity to go rogue, what I actually said likely differed from this text, but it represents my fullest, and, I hope, most evenhanded analysis of Google.]

Is Google good for history? Of course it is. We historians are searchers and sifters of evidence. Google is probably the most powerful tool in human history for doing just that. It has constructed a deceptively simple way to scan billions of documents instantaneously, and it has spent hundreds of millions of dollars of its own money to allow us to read millions of books in our pajamas. Good? How about Great?

But then we historians, like other humanities scholars, are natural-born critics. We can find fault with virtually anything. And this disposition is unsurprisingly exacerbated when a large company, consisting mostly of better-paid graduates from the other side of campus, muscles into our turf. Had Google spent hundreds of millions of dollars to build the Widener Library at Harvard, surely we would have complained about all those steps up to the front entrance.

Partly out of fear and partly out of envy, it’s easy to take shots at Google. While it seems that an obsessive book about Google comes out every other week, where are the volumes of criticism of ProQuest or Elsevier or other large information companies that serve the academic market in troubling ways? These companies, which also provide search services and digital scans, charge universities exorbitant amounts for the privilege of access. They leech money out of library budgets every year that could be going to other, more productive uses.

Google, on the other hand, has given us Google Scholar, Google Books, newspaper archives, and more, often besting commercial offerings while being freely accessible. In this bigger picture, away from the myopic obsession with the Biggest Tech Company of the Moment (remember similar diatribes against IBM, Microsoft?), Google has been very good for history and historians, and one can only hope that they continue to exert pressure on those who provide costly alternatives.

Of course, like many others who feel a special bond with books and our cultural heritage, I wish that the Google Books project was not under the control of a private entity. For years I have called for a public project, or at least a university consortium, to scan books on the scale Google is attempting. I’m envious of France’s recent announcement to spend a billion dollars on public scanning. In addition, the Center for History and New Media has a strong relationship with the Internet Archive to put content in a non-profit environment that will maximize its utility and distribution and make that content truly free, in all senses of the word. I would much rather see Google’s books at the Internet Archive or the Library of Congress. There is some hope that HathiTrust will be this non-Google champion, but they are still relying mostly on Google’s scans. The likelihood of a publicly funded scanning project in the age of Tea Party reactionaries is slim.

* * *

Long-time readers of my blog know that I have not pulled punches when it comes to Google. To this day the biggest spike in readership on my blog was when, very early in Google’s book scanning project, I casually posted a scan of a human hand I found while looking at an edition of Plato. The post ended up on Digg, and since then it has been one of the many examples used by Google’s detractors to show a lack of quality in their library project.

Let’s discuss the quality issues for a moment, since they are a point of obsession within the academy, an obsession I feel is slightly misplaced. Of course Google has some poor scans (as the saying goes, haste makes waste), but I’ve yet to see a scientific survey of the overall percentage of pages that are unreadable or missing (surely a minuscule fraction, judging from my viewing of scores of Victorian books). Regarding metadata errors, as Jon Orwant of Google Books has noted, when you are dealing with a trillion pieces of metadata, you are likely to have millions of errors in need of correction. Let us also not pretend the bibliographical world beyond Google is perfect. Many of the metadata problems with Google Books come from library partners and others outside of Google.

Moreover, Google likely has remedies for many of these inadequacies. Google is constantly improving its OCR and metadata correction capabilities, often in clever ways. For instance, it recently acquired the reCAPTCHA system from Carnegie Mellon, which uses unwitting humans who are logging into online services to transcribe particularly hard or smudged words from old books. They have added a feedback mechanism for users to report poor scans. Truly bad books can be rescanned or replaced by other libraries’ versions. I find myself nonplussed by quality complaints about Google Books that have engineering solutions. That’s what Google does; it solves engineering problems very well.

Indeed, we should recognize (and not without criticism, as I will note momentarily) that at its heart, Google Books is the outcome, like so many things at Google, of an engineering challenge and a series of mathematical problems: How can you scan tens of millions of books in a decade? It’s easy to say they should do a better job and get all the details right, but if you do the calculations with those key variables, as I assume Brandon and his team have done, you’ll probably see that getting a nearly perfect library scanning project would take a hundred years rather than ten. (That might be a perfectly fine trade-off, but that’s a different argument or a different project.) As in OCR, getting from 99% to 99.9% accuracy would probably take an order of magnitude longer and be an order of magnitude more expensive. That’s the trade-off they have decided to make, and as a company interested in search, where near-100% accuracy is unnecessary, and considering the possibilities for iterating toward perfection from an imperfect first version, it must have been an easy decision to make.

* * *

Google Books is incredibly useful, even with the flaws. Although I was trained at places with large research libraries of Google Books scale, I’m now at an institution that is far more typical of higher ed, with a mere million volumes and few rare works. At places like Mason, Google Books is a savior, enabling research that could once only be done if you got into the right places. I regularly have students discover new topics to study and write about through searches on Google Books. You can only imagine how historical researchers and all students and scholars feel in even less privileged places. Despite its flaws, it will be the source of much historical scholarship, from around the globe, over the coming decades. It is a tremendous leveler of access to historical resources.

Google is also good for history in that it challenges age-old assumptions about the way we have done history. Before the dawn of massive digitization projects and their equally important indices, we necessarily had to pick and choose from a sea of analog documents. All of that searching and sifting we did, and the particular documents and evidence we chose to write on, were—let’s admit it—prone to many errors. Read it all, we were told in graduate school. But who ever does? We sift through large archives based on intuition; occasionally we even find important evidence by sheer luck. We have sometimes made mountains out of molehills because, well, we only have time to sift through molehills, not mountains. Regardless of our technique, we always leave something out; in an analog world we have rarely been comprehensive.

This widespread problem of anecdotal history, as I have called it, will only get worse. As more documents are scanned and go online, many works of historical scholarship will be exposed as flimsy and haphazard. The existence of modern search technology should push us to improve historical research. It should tell us that our analog, necessarily partial methods have hidden from us the potential of taking a more comprehensive view, aided by less capricious retrieval mechanisms which, despite what detractors might say, are often more objective than leafing rapidly through paper folios on a time-delimited jaunt to an archive.

In addition, listening to Google may open up new avenues of exploring the past. In my book Equations from God I argued that mathematics was generally considered a divine language in 1800 but was “secularized” in the nineteenth century. Part of my evidence was that mathematical treatises, which often contained religious language in the early nineteenth century, lost such language by the end of the century. By necessity, researching in the pre-Google Books era, my textual evidence was limited—I could only read a certain number of treatises and chose to focus (I’m sure this will sound familiar) on the writings of high-profile mathematicians. The vastness of Google Books for the first time presents the opportunity to do a more comprehensive scan of Victorian mathematical writing for evidence of religious language. This holds true for many historical research projects.
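
As a rough illustration of what such a scan might look like, here is a minimal Python sketch. It assumes you already have plain-text versions of the treatises paired with their publication years. The term list, the function name `religious_rate_by_decade`, and the sample sentences are all hypothetical, made up for illustration; none of them come from Equations from God or from Google Books.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary; a real study would need a carefully justified list.
RELIGIOUS_TERMS = {"god", "divine", "creator", "providence", "almighty"}


def religious_rate_by_decade(texts):
    """Take (year, full_text) pairs and return, for each decade,
    the number of religious terms per 1,000 words."""
    hits = defaultdict(int)
    words = defaultdict(int)
    for year, text in texts:
        decade = (year // 10) * 10
        # Crude tokenizer: lowercase runs of ASCII letters only.
        tokens = re.findall(r"[a-z]+", text.lower())
        words[decade] += len(tokens)
        hits[decade] += sum(1 for t in tokens if t in RELIGIOUS_TERMS)
    return {d: 1000 * hits[d] / words[d] for d in sorted(words) if words[d]}


# Invented sample sentences standing in for full treatises.
sample = [
    (1805, "Geometry reveals the divine order established by the Creator."),
    (1808, "The laws of number are the thoughts of God."),
    (1895, "Let x be a real number and consider the function f."),
]
rates = religious_rate_by_decade(sample)
for decade, rate in rates.items():
    print(decade, round(rate, 1))
```

A falling rate across decades would be consistent with the secularization argument, though simple word counts of this kind cannot distinguish devout usage from ironic or historical usage, and they inherit any OCR errors in the underlying text.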

So Google has provided us not only with free research riches but also with a helpful direct challenge to our research methods, for which we should be grateful. Is Google good for history? Of course it is.

* * *

But does that mean that we cannot provide constructive criticism of Google, to make it the best it can be, especially for historians? Of course not. I would like to focus on one serious issue that ripples through many parts of Google Books.

For a company that is a champion of openness, Google remains strangely closed when it comes to Google Books. Google Books seems to operate in ways that are very different from other Google properties, where Google aims to give it all away. For instance, I cannot understand why Google doesn’t make it easier for historians such as myself, who want to do technical analyses of historical books, to download them en masse. If it wanted to, Google could make a portal to download all public domain books tomorrow. I’ve heard the excuses from Googlers: But we’ve spent millions to digitize these books! We’re not going to just give them away! Well, Google has also spent millions on software projects such as Android, Wave, Chrome OS, and the Chrome browser, and they are giving those away. Google’s hesitance with regard to its books project shows that openness goes only so far at Google. I suppose we should understand that; Google is a company, not a public library. But that’s not the philanthropic aura they cast around Google Books at its inception or even today, in dramatic op-eds touting the social benefit of Google Books.

In short, complaining about the quality of Google’s scans distracts us from a much larger problem with Google Books. The real problem—especially for those in the digital humanities but increasingly for many others—is that Google Books is only open in the read-a-book-in-my-pajamas way. To be sure, you can download PDFs of many public domain books. But they make it difficult to download the OCRed text from multiple public domain books–what you would need for more sophisticated historical research. And when we move beyond the public domain, Google has pushed for a troubling, restrictive regime for millions of so-called “orphan” books.

I would like to see a settlement that offers greater, not lesser, access to those works, in addition to greater availability of what Cliff Lynch has called “computational access” to Google Books, a higher level of access that is less about reading a page image on your computer than applying digital tools to many pages or books at one time to create new knowledge and understanding. This is partially promised in the Google Books settlement, in the form of text-mining research centers, but those centers will be behind a velvet rope and I suspect the casual historian will be unlikely to ever use them. Google has elaborate APIs, or application programming interfaces, for most of its services, yet offers only the most superficial access to Google Books.

For a company that thrives on openness and the empowerment of users and software developers, Google Books is a puzzlement. With much fanfare, Google has recently launched—evidently out of internal agitation—what it calls a “Data Liberation Front,” to ensure portability of data and openness throughout Google. On dataliberation.org, the website for the front, these Googlers list 25 Google projects and how to maximize their portability and openness—virtually all of the main services at Google. Sadly, Google Books is nowhere to be seen, even though it also includes user-created data, such as the My Library feature, not to mention all of the data—that is, books—that we have all paid for with our tax dollars and tuition. So while the Che Guevaras put up their revolutionary fist on one side of the Googleplex, their colleagues on the other side are working with a circumscribed group of authors and publishers to place messy restrictions onto large swaths of our cultural heritage through a settlement that few in the academy support.

Jon Orwant and Dan Clancy and Brandon Badger have done an admirable job explaining much of the internal process of Google Books. But it still feels removed and alien in a way that other Google efforts are not. That is partly because they are lawyered up, and thus hamstrung from responding to some questions academics have, or from instituting more liberal policies and features. The same chutzpah that would lead a company to digitize entire libraries also led it to go too far with in-copyright books, leading to a breakdown with authors and publishers and the flawed settlement we have in front of us today.

We should remember that the reason we are in a settlement now is that Google didn’t have enough chutzpah to take the higher, tougher road—a direct challenge in the courts, the court of public opinion, or the Congress to the intellectual property regime that governs many books and makes them difficult to bring online, even though their authors and publishers are long gone. While Google regularly uses its power to alter markets radically, it has been uncharacteristically meek in attacking head-on this intellectual property tower and its powerful corporate defenders. Had Google taken a stronger stance, historians would have likely been fully behind their efforts, since we too face the annoyances that unbalanced copyright law places on our pedagogical and scholarly use of textual, visual, audio, and video evidence.

I would much rather have historians and Google work together. While Google as a research tool challenges our traditional historical methods, historians may very well have the ability to challenge and make better what Google does. Historical and humanistic questions are often at the high end of complexity among the engineering challenges Google faces, similar to and even beyond, for instance, machine translation, and Google engineers might learn a great deal from our scholarly practice. Google’s algorithms have been optimized over the last decade to search through the hyperlinked documents of the Web. But those same algorithms falter when faced with the odd challenges of change over centuries and the alienness of the past and old books and documents that historians examine daily.

Because Google Books is the product of engineers, with tremendous talent in computer science but less sense of the history of the book or the book as an object rather than bits, it founders in many respects. Google still has no decent sense of how to rank search results in humanities corpora. Bibliometrics and text mining work poorly on these sources (as opposed to, say, the highly structured scientific papers Google Scholar specializes in). Studying how professional historians rank and sort primary and secondary sources might tell Google a lot, which it could use in turn to help scholars.

Ultimately, the interesting question might not be, Is Google good for history? It might be: Is history good for Google? To both questions, my answer is: Yes.

This entry was posted on Thursday, January 7th, 2010 at 6:00 pm and is filed under Google.

47 Responses to “Is Google Good for History?”

Brian Croxall said on January 7th, 2010 at 6:37 pm
Good piece, Dan. Every time I consider what Franco Moretti has called “distant reading”–the ability to read the whole mountain rather than the molehill–I find myself wondering if the practice will divide historians (or literature types) into two camps. Will there be those who only close read (and enjoy the molehill-sized anecdote) and those who only distant read? Will they require such different skill sets–even if Google makes its access more open–that you choose one path or the other?
Dan Cohen: Is Google Good for History? « Keep Calm and Carry On and other Second World War Posters said on January 7th, 2010 at 7:15 pm
[...] full story Posted in Digitisation, History. Tags: Dan Cohen, Digital History, Google, History. Leave a [...]
James Smithies said on January 7th, 2010 at 11:07 pm
Excellent points Dan, and thanks for posting this before the AHA session – it made the Twitter conversation far easier to digest for those of us following online. I particularly agree with your point about Google needing to offer easier access to their OCRed text. This would offer real value to historians, but may well represent the keys to Google’s (book) kingdom? Or, possibly too, empirical evidence they ‘could do better’?
Brian Fitzpatrick said on January 7th, 2010 at 11:48 pm
Hi Dan. Honestly, “My Library” is one of those things that completely slipped under our radar. I’ll poke around and see what we can do. The content of the books themselves however would fall outside of our mission, which is directed at data that you “import into or create inside of” Google Products. Thanks for bringing this to our attention.

-Fitz
The Data Liberation Front
Johan Oomen said on January 8th, 2010 at 11:39 am
Hi Dan, good piece! In addition, I would like to point to the activities of the Open Content Alliance, administered by the Internet Archive. http://www.opencontentalliance.org/faq/ “Generally, textual material will be free to read, and in most cases, available for saving or printing using formats such as PDF.”
AHA remarks by Dan Cohen on “Is Google Good for History?” | British Studies: Past, Present, and Future said on January 8th, 2010 at 12:11 pm
[...] 1Readers should take a look at Dan Cohen's 2010 AHA paper, which he has posted to his blog at http://www.dancohen.org/2010/01/07/is-google-good-for-history/. In it, he discusses the strengths of the Google Books project as well as a few areas for [...]
Dan Cohen said on January 8th, 2010 at 12:12 pm
@Fitz: Thanks for the quick response. Being able to import/export libraries to bibliographical research tools like Zotero (which we run at the Center for History and New Media) would be great. It doesn’t address the larger problem of openness I detail here, but would be a very welcome step in the right direction.

@Johan: Yes, as I note in the piece (all too shortly) we work closely with the Internet Archive and much prefer their open model to Google’s.

@Brian: I think close and distant reading can work together. Sometimes you need the distant view to get a sense of what you might want to read in a close way.
AHA 2010: Papers on Digital Humanities/History | British Studies: Past, Present, and Future said on January 8th, 2010 at 1:56 pm
[...] Dan Cohen, Comments "Is Google Good for History?" [...]
Brandon Badger said on January 8th, 2010 at 3:17 pm
Dan,

Nice to meet you yesterday.

We actually already allow 3rd parties to extract My Library data sets and edit and create them via our open APIs. Here’s the documentation on our site:

http://code.google.com/apis/books/docs/gdata/developers_guide_protocol.html

Here are some other APIs that we offer, including the ability for third parties to embed preview books, query for books and metadata, etc.

http://code.google.com/apis/books/

Also, there are free download links for all out of copyright books. Just click on the “Download” button in the toolbar and you can download either a PDF or EPUB version of the book.

-Brandon
Dan Cohen said on January 8th, 2010 at 3:44 pm
@Brandon: Many thanks for joining us historians and for putting up with the criticism. I still feel that these APIs are far more superficial than other Google APIs (for instance, there is no way to get the OCRed text of a book), but thanks also for the links to GB APIs, which have been a bit buried.
Is Google Good for History? « ResourceShelf said on January 8th, 2010 at 4:04 pm
[...] full text of Dan Cohen’s Speech, “Is Google Good for History?” is available [...]
E. Stewart Saunders said on January 8th, 2010 at 5:26 pm
Dan – I had the same experience a year ago when I tried to download the OCRed version of several Google books. Thanks for putting this problem in the public view. Fortunately, I was able to download some digitized 17th century French texts from the BN website. For those interested, the “Europeana” website offers many digitized texts. (Of course you will need to search google since I cannot remember its URL.)
Geek Media Round-Up: January 8, 2010 – Grasping for the Wind said on January 8th, 2010 at 10:02 pm
[...] Dan Cohen makes some very reasoned arguments in his pondering of the question Is Google Good for History? [...]
The Great Geek Manual » Geek Media Round-Up: January 8, 2010 said on January 8th, 2010 at 10:55 pm
[...] Dan Cohen makes some very reasoned arguments in his pondering of the question Is Google Good for History? [...]
brandon badger said on January 8th, 2010 at 11:41 pm
The EPUB versions that we allow you to download do actually have the OCR’d text in them. That’s what allows the EPUB files to reflow the text on smaller screens. It’s our PDF versions that have the page images (also free for you to download).
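
For readers who want to try this, here is a minimal Python sketch of pulling the text out of a downloaded EPUB. It relies only on the fact that an EPUB file is a ZIP archive containing XHTML documents. The function name `epub_to_text` is hypothetical, and two shortcuts are assumptions of this sketch rather than anything Google documents: files are read in alphabetical order instead of the reading order declared in the package manifest, and all text nodes are kept, including any titles in the document head.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the non-empty text nodes of one (X)HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def epub_to_text(path):
    """Return the text of every (X)HTML file inside the EPUB at `path`."""
    parts = []
    with zipfile.ZipFile(path) as book:
        # Assumption: alphabetical order approximates reading order.
        for name in sorted(book.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                parser.feed(book.read(name).decode("utf-8", errors="replace"))
                parser.close()
                parts.append(" ".join(parser.chunks))
    return "\n".join(parts)
```

Calling `epub_to_text("some_book.epub")` on a downloaded file (the file name here is a placeholder) should give a single string that can be fed to whatever text analysis you have in mind. It does this one book at a time, so it does not solve the bulk-download problem raised in the post.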
Allen Riddell said on January 9th, 2010 at 10:57 am
@brandon Getting text out from the EPUB files is a great boon. Is there any way to filter your searches for only those files that have the OCR’d text available?

Also, looking at the rich data you get from books scanned/hosted by the Internet Archive (creation date, upload date, scanning source, scanning software!) it seems like there’s lots of room for improvement in the Google Books API if Google’s as serious as the IA is about making information available to researchers.
Airminded · To-day and to-morrow said on January 9th, 2010 at 11:47 am
[...] with links to online sources for the texts and some sort of author biography, where available. Google Books has many of them, but only snippets or previews, so I’ve linked to other sources where [...]
Luke Fernandez said on January 9th, 2010 at 5:13 pm
I enjoyed this essay as well as the AHA session itself. Two questions: First, which panelist made reference to a so-called “Alexandria Complex?” and is this a widely used term? Second, if there are epistemological dangers in relying too much on Google for research (as I think some of the audience suggested) should we be petitioning Google to post messages like the following on their Web sites:

Warning: Quitting Google Now Greatly Reduces Serious Risks to Your Cognitive/Scholarly Health.

The above, of course, is meant tongue-in-cheek. But maybe Mr. Badger could consider putting somewhere on Google landing pages messages that encourage users to reflect on the benefits and limitations of relying on Google search algorithms.
Google and the historian :: in propria persona said on January 9th, 2010 at 11:28 pm
[...] via Dan Cohen’s Digital Humanities Blog » Blog Archive » Is Google Good for History?. [...]
Jim C. said on January 10th, 2010 at 8:13 pm
I would suspect that one of the reasons why Google hasn’t been more forceful about confronting the issue of orphan works is because if orphans become liberated from the conventional restrictions, then Google won’t be able to make any money from them!

As it stands now in their Agreement, all orphans will be charged for, and the monies will go into the general fund, with Google still getting their cut. So why in the world would they want to alter this lucrative deal?

Also– one of the stipulations in their original library deals– such as with UC– is to make all PD works available for viewing. They’ve done that with pre-1923 works, but what about 1923-1964 works that were never renewed, and thus out of American copyright?

There are tons of those works out there, especially journals, yet Google is still locking them up. Why is that?

Granted, copyright research is a hassle for them, but their UC agreement requires their release. There’s also nothing in any of their agreements that requires downloading PD works– just viewing– which makes me wonder if the day will come when Google will start charging for PD downloads. I suspect they will.
Google goed voor de historici? « De Digitale lezer: Kindleblog said on January 11th, 2010 at 9:46 am
[...] good for historians? By mathuizing According to Dan Cohen, ultimately [...]
Tweeting #AHA2010 « Knitting Clio said on January 11th, 2010 at 2:35 pm
[...] posts involved comments and retweets of comments on Dan Cohen’s well-received talk, “Is Google Good for Historians?” [short answer: "yes"] It was also nice to hear who won the Cliopatra awards and where all [...]
links for 2010-01-11 « Rumblegumption said on January 11th, 2010 at 8:41 pm
[...] Dan Cohen’s Digital Humanities Blog » Blog Archive » Is Google Good for History? [...]
Digital Humanities at AHA « Early Modern Online Bibliography said on January 12th, 2010 at 4:27 am
[...] Good for History? Chair: Shawn Martin, University of Pennsylvania (Panel discussion) Participants: Daniel J. Cohen, Center for History and New Media, George Mason University; Paul Duguid, University of California, [...]
Google ja historiantutkimus – Digitaalinen kirjasto said on January 12th, 2010 at 5:22 am
[...] Google Good for History? (Dan Cohen’s Digital Humanities Blog, January 7, 2010) http://www.dancohen.org/2010/01/07/is-google-good-for-history/ Cohen’s answer to the question posed in the title is “yes, of course”. He appreciates [...]
mw said on January 12th, 2010 at 4:25 pm
This link shows another problem with Google and history:

http://blog.historians.org/news/771/paper-of-record-disappears-leaving-%20historians-in-the-lurch

http://www.google.com/support/forum/p/news/thread?tid=1c47e6d29331dc2c&hl=en
As the professor snips the richest bud for his lapel, his scalpel of reason lies on the tray: some weekly wanderings « P e r ∙ C r u c e m ∙ a d ∙ L u c e m said on January 15th, 2010 at 7:04 am
[...] Is Google good for studying history? See also Dan Cohen’s piece. [...]
Seeing the picture » Blog Archive » Google, History & Twitter at AHA San Diego (#AHA2010) said on January 15th, 2010 at 11:13 am
[...] Cohen gave an excellent talk in a panel (Is Google Good for History) at the recent annual meeting of the American Historical [...]
Digital History at Yale » Blog Archive » Whither Google Books? said on January 16th, 2010 at 4:35 pm
[...] Diego this month tackles the question:”Is Google Good for History?” The talk, which is posted in full on Cohen’s blog, answers with a qualified but overwhelmingly positive [...]
What I’m reading ed. 100116 « The Hermitage 3.0 (Beta) said on January 16th, 2010 at 10:11 pm
[...] Is Google good for historians [...]
Susan Ferber’s Publishing Tips « History Compass Exchanges said on January 17th, 2010 at 8:54 pm
[...] continued debates about the future of print publications, Google’s interventions, the centrality of monographic work to communicating ideas in history, and the prevalence of the [...]
from google books to a digital humanities/digital history divide? « parezco y digo said on January 18th, 2010 at 11:56 am
[...] combination of Dan Cohen’s comments (posted here on his blog– read the comment section too) and those of Paul Duguid, and the evasions of [...]
The Foreign Correspondents » The goods that Google brings said on January 18th, 2010 at 4:01 pm
[...] out of fear and partly out of envy, it’s easy to take shots at Google,” Cohen wrote about the onslaught. If Google makes academicians lazy, “where are the volumes of criticism [...]
Да ли је Гугл добар за историју? | Центар за дигиталне хуманистичке науке said on January 21st, 2010 at 11:15 am
[...] translated from English by: Irena [...]
Heuristique et sérendipité : un exemple en images | Owni.fr said on January 21st, 2010 at 11:29 am
[...] Dan, Is Google Good for History?, [...]
Frog in a Well - The Japan History Group Blog said on February 2nd, 2010 at 12:16 am
[...] The AHA’s own roundup covers a lot of ground, including Dan Cohen’s provocative Is Google Good For History? The Historical Society had it’s own roundup of AHA news items, especially the job market [...]
research AND writing | method AND narrative « History Compass Exchanges said on February 4th, 2010 at 6:20 pm
[...] access to sources. Available technology poses challenges to previously presumed constraints on the anecdotal discovery, collection, and evaluation of sources. Blogs, websites, open-access, print-on-demand, and new journals have all contributed to new ways [...]
Comparative archives, digitization, and a project inspired by lunch at Panda Express « Digital History — Spring 2010 said on February 11th, 2010 at 6:01 pm
[...] of sources. I feel privileged to have university access to these databases, but I must agree with Dan Cohen that some similar public job needs to be undertaken in order to create similarly rich public [...]
¿Es bueno Google para los historiadores? « Clionauta: Blog de Historia said on February 17th, 2010 at 6:45 am
[...] And he asked why so many academics get so angry with Google. “While it seems that an obsessive book about Google comes out every other week,” he asked where the volumes were about other “large information companies that serve the academic market in troubling ways,” arguing that “these companies, which also provide search services and digital scans, charge universities exorbitant amounts for the privilege of access. They leech money out of library budgets every year that could be going to other, more productive uses” (Cohen’s full talk can be read on his blog). [...]
The Future is really good for the Past « Feral Librarian said on April 5th, 2010 at 2:38 pm
[...] The Future is really good for the Past April 5, 2010 tags: digital libraries, future of libraries, google, music, research, scholarship by Chris It has been a few months since Dan Cohen offered a convincing YES to the question Is Google Good for History? [...]
Dan Cohen said on April 6th, 2010 at 8:54 pm
An Italian translation of this article is now available.
L Kemp said on April 7th, 2010 at 5:58 pm
Hi Dan,

Thank you for your insightful article. Could you elaborate on:

“For instance, it recently acquired the reCAPTCHA system from Carnegie Mellon, which uses unwitting humans who are logging into online services to transcribe particularly hard or smudged words from old books.”

How exactly would this work? Doesn’t the ‘system’ need to know what the word is to allow access to locked online services?

Cheers,
Liz
Dan Cohen said on April 8th, 2010 at 9:41 am
@Liz: No, the system uses a statistical comparison of several independent people “solving” the word to figure out (at least to a high degree of probability) what the word is. Also, there are two words, one of which the system already knows (so it can let successful solvers in).
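
The two-word scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, not reCAPTCHA’s actual implementation: the function names, the minimum vote count, and the agreement threshold are all hypothetical choices made for this sketch.

```python
from collections import Counter


def check_and_record(control_answer, control_truth, unknown_answer, votes):
    """Admit the solver only if the known (control) word is typed correctly.
    A successful solver's guess at the unknown word is recorded as a vote."""
    passed = control_answer.strip().lower() == control_truth.lower()
    if passed:
        votes.append(unknown_answer.strip().lower())
    return passed


def consensus(votes, min_votes=3, min_share=0.75):
    """Return the agreed reading of the unknown word, or None if there are
    too few votes or too little agreement (thresholds are assumptions)."""
    if len(votes) < min_votes:
        return None
    word, count = Counter(votes).most_common(1)[0]
    return word if count / len(votes) >= min_share else None


votes = []
check_and_record("morning", "morning", "treatise", votes)   # admitted, vote kept
check_and_record("mornnig", "morning", "treatise", votes)   # rejected, vote dropped
check_and_record("Morning", "morning", "Treatise", votes)   # admitted
check_and_record("morning", "morning", "treatlse", votes)   # admitted, dissenting vote
check_and_record("morning", "morning", "treatise", votes)   # admitted
print(consensus(votes))
```

The solver never learns which of the two words is the control, which is why a correct answer on the known word is treated as evidence that the guess at the unknown word was made in good faith.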
Is Google Good for Scholarship? | FSU Digital Scholars said on September 17th, 2010 at 1:39 pm
[...] Cohen, Dan. “Is Google Good for History?” Paper presented to the American Historical Association, Jan 7 2010. http://www.dancohen.org/2010/01/07/is-google-good-for-history/ [...]
History Tech Tips #3: Top 5 Indispensible Digital History Tools « Sean Kheraj, Canadian History and Environment said on November 10th, 2010 at 12:37 am
[...] digital historian Dan Cohen remarked last year at the annual meeting of the American Historical Association, “[w]e historians are searchers [...]
Ist Google gut für die (Geschichts-) Wissenschaft? ‹ dreitehabee said on December 5th, 2010 at 6:42 pm
[...] Historian Dan Cohen argues, without sparing criticism, for the positive effects of Google on the discipline of history. I think that, in Google, most academic disciplines, if not openly then secretly, [...]
Using digitised library collections to help locate literary sources « Opusculum said on December 7th, 2010 at 1:54 pm
[...] 4. Dan Cohen, “Is Google good for history?” American Historical Association Annual Meeting, San Diego, CA., 7 January 2010. Retrieved December 7, 2010 from: http://www.dancohen.org/2010/01/07/is-google-good-for-history/ [...]
