If you're ever reading a book or watching a movie and get the distinct feeling you've come across the story before – or even better, can predict exactly what's going to happen next – there could be a good reason for that.

Computer scientists have sifted through the language of more than 1,700 works of fiction and discovered that just six kinds of emotional arc underlie nearly all of the best-known stories in English literature.

While literary theorists have for centuries characterised and counted the basic plots and structures that writers use in stories, it's unlikely there has ever been such a rigorous scientific analysis of English fiction before.

Researchers from the Computational Story Laboratory at the University of Vermont mined the complete text of 1,737 works of fiction available on Project Gutenberg, an online collection of more than 50,000 digital books in the public domain. By analysing the sentiment of the language used in 10,000-word chunks of each text, the researchers were able to register the emotional ups and downs of the stories as a whole. Negative words like "poverty", "dead", and "punishment" dragged the sentiment down, while positive terms like "love", "peace", and "friend" brought it up.
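To make the idea concrete, here is a minimal sketch of scoring a text chunk by chunk. It is an illustration only, not the researchers' code: the function names, the numeric word scores, and the choice of non-overlapping chunks are all assumptions, and only the six example words come from the description above.

```python
import re
from typing import Dict, List, Optional

# Hypothetical toy lexicon: the words are the article's examples,
# but the +1/-1 scores are invented for this illustration.
TOY_LEXICON: Dict[str, float] = {
    "poverty": -1.0, "dead": -1.0, "punishment": -1.0,
    "love": 1.0, "peace": 1.0, "friend": 1.0,
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotional_arc(
    words: List[str],
    window: int = 10_000,
    step: Optional[int] = None,
    lexicon: Dict[str, float] = TOY_LEXICON,
) -> List[float]:
    """Return the mean lexicon score of each chunk of `window` words.

    `step` controls how far the chunk moves each time; by default it
    equals `window`, so chunks do not overlap (an assumption).
    Chunks containing no lexicon words score 0.0.
    """
    if step is None:
        step = window
    arc: List[float] = []
    last_start = max(len(words) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = words[start:start + window]
        scores = [lexicon[w] for w in chunk if w in lexicon]
        arc.append(sum(scores) / len(scores) if scores else 0.0)
    return arc


# A six-word "story" with a three-word chunk: happy start, grim ending.
print(emotional_arc(tokenize("Love friend peace. Dead poverty punishment!"), window=3))
# prints [1.0, -1.0]
```

Plotting the returned list against chunk position would give the kind of rise-and-fall curve the article describes.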

Doing this for over 1,700 books and charting the dynamics of each text, the team discovered that all stories basically boil down to one of a set number of emotional patterns. "We find a set of six core trajectories which form the building blocks of complex narratives," the authors write in their study.

According to the researchers, those six core emotional arcs are:

· "Rags to riches" (An ongoing emotional rise, e.g. Alice's Adventures Under Ground)

· "Tragedy, or riches to rags" (An ongoing emotional fall, e.g. Romeo and Juliet)

· "Man in a hole" (A fall followed by a rise)

· "Icarus" (A rise followed by a fall)

· "Cinderella" (Rise–fall–rise)

· "Oedipus" (Fall–rise–fall)
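The six shapes in the list above can be expressed as sequences of rises and falls. The sketch below labels a series of sentiment scores by that sequence. It is a deliberately naive illustration, not the method used in the study: the names `ARC_NAMES` and `label_arc` are hypothetical, and real sentiment curves are noisy and would need smoothing before a rule like this could work.

```python
from typing import List, Sequence, Tuple

# Map each rise/fall pattern to the arc name used in the article.
ARC_NAMES = {
    ("rise",): "Rags to riches",
    ("fall",): "Tragedy, or riches to rags",
    ("fall", "rise"): "Man in a hole",
    ("rise", "fall"): "Icarus",
    ("rise", "fall", "rise"): "Cinderella",
    ("fall", "rise", "fall"): "Oedipus",
}


def label_arc(arc: Sequence[float]) -> str:
    """Name the arc by its sequence of rises and falls.

    Consecutive moves in the same direction are merged, and flat
    steps are ignored. Any pattern outside the six is "other".
    """
    moves: List[str] = []
    for prev, cur in zip(arc, arc[1:]):
        if cur == prev:
            continue  # flat step: no change in direction
        move = "rise" if cur > prev else "fall"
        if not moves or moves[-1] != move:
            moves.append(move)
    pattern: Tuple[str, ...] = tuple(moves)
    return ARC_NAMES.get(pattern, "other")


print(label_arc([0.2, -0.5, 0.6]))       # prints Man in a hole
print(label_arc([0.0, 0.5, -0.3, 0.8]))  # prints Cinderella
```

A pattern with more than two turning points, such as two "Man in a hole" arcs in a row, falls through to "other" here.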

Interestingly, based on download statistics from Project Gutenberg, the researchers say the most popular stories are those that use more complex emotional arcs, with the Cinderella and Oedipus arcs registering the most downloads. Also popular are works that combine these core arcs in new ways within one story, such as two sequential "Man in a hole" arcs stuck together, or the "Cinderella" arc coupled with a tragic ending.