Understanding the ways in which information achieves widespread public awareness is a research question of significant interest. We consider whether, and how, the way in which the information is phrased --- the choice of words and sentence structure --- can affect this process. To this end, we develop an analysis framework and build a corpus of movie quotes, annotated with memorability information, in which we are able to control for both the speaker and the setting of the quotes. We find that there are significant differences between memorable and non-memorable quotes in several key dimensions, even after controlling for situational and contextual factors. One is lexical distinctiveness: in aggregate, memorable quotes use less common word choices, but at the same time are built upon a scaffolding of common syntactic patterns. Another is that memorable quotes tend to be more general in ways that make them easy to apply in new contexts --- that is, more portable. We also show how the concept of "memorable language" can be extended across domains.
- Pub Date: March 2012
- Computer Science - Computation and Language
- Computer Science - Social and Information Networks
- Physics - Physics and Society
- Final version of paper to appear at ACL 2012. 10pp, 1 fig. Data, demo memorability test and other info available at http://www.cs.cornell.edu/~cristian/memorability.html