They call it culturomics: the obvious play on the word “genomics” looks at trends in human thought and culture. But scientists say culturomics has been ____1____ by a lack of quantitative data. So researchers at Harvard, along with Google, Encyclopedia Britannica, and the American Heritage Dictionary, have come up with a new tool.
It’s a database of 5.2 million books, published since the year 1500. That’s four percent of all the books ever published, with a total of 500 billion words. The focus is on English language culture, so ____2____ of the books are in English.
Among the first findings of the research, published in the journal Science: about 8,500 new words enter the English language ____3____. But many of them don’t end up in dictionaries. And about ____4____: actors become famous around age 30, writers around 40, and politicians around 50. But the fame of politicians can eventually exceed that of actors.
A Google tool called the Books Ngram Viewer is ____5____ based on this data—users can track the usage and frequency of a word or phrase over the past few centuries. Thus, we can watch the fall and rise of Melville. And soon the rise and fall of Snooki.
[Presented by the Audio-Visual Science Group]
Answers: 1. hampered 2. three quarters 3. annually 4. fame 5. available
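People call it "culturomics": an obvious borrowing from the word "genomics," it takes a similar approach to exploring trends in human thought and culture. But scientists say that a lack of data has hampered work in culturomics. So researchers at Harvard, together with Google, Encyclopedia Britannica, and the American Heritage Dictionary, have provided a new tool.
The tool is a database of 5.2 million books published since the year 1500. These books make up 4% of all books ever published and contain 500 billion words. Because the focus is on English-language culture, three quarters of the books are written in English.
One of the first findings of the research is that about 8,500 new words enter the English language every year, though many of them never appear in dictionaries. The findings were published in the journal Science. As for when people become famous: actors at around age 30, writers at around 40, and politicians at around 50. But the fame of politicians eventually surpasses that of actors.
A Google tool called the Books Ngram Viewer was built on this database; users can track the usage and frequency of a word or phrase over the past few centuries. That way we can watch the name Melville fall and then rise, and the name Snooki rise and then fall.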