Google is in the news today: an article in the WSJ covers its new database. Linguists say that about 129 million books have been published since the invention of the printing press. Researchers tracked which words show up the most, and how often terms such as Che Guevara and Marilyn Monroe appear in print. Now there’s a database with two billion words in it, drawn from 5.2 million digitized books published over the past 200 years.
The word God, for instance, has been falling steadily in use since its peak in the 1840s. ‘Sushi’ picked up steam in the ’80s, while ‘sausage’ has seen its use plummet since the 1940s.
Google’s effort to digitize these books has covered 15 million so far, in more than 400 languages. With a large number of people seeking to sue Google over this project, the Internet giant held out an olive branch: anyone who goes to http://ngrams.googlelabs.com can see how the use of various word combinations has changed over time.