The human text

Last week in The New York Times, Shelley Podolny considered the growing amount of computer-generated text that appears online. Under the dystopian title "If an Algorithm Wrote This, How Would You Even Know?" Podolny describes a study by media scholar Christer Clerwall that suggests readers may not be able to distinguish between computer- and human-generated text. Such a phenomenon speaks to the growing sophistication of natural language processing software and finely tuned algorithms that can produce humanoid content. Podolny's article was accompanied by the quiz "Did a Human or a Computer Write This?" that invites readers to test Clerwall's results. (In the interests of disclosure, the author scored 2/8.)

The challenge of distinguishing computer-generated texts from human ones is a Rorschach test of our perceptions of humanity. Take, for example, two excerpts from the quiz:

1) "A shallow magnitude 4.7 earthquake was reported Monday morning five miles from Westwood, California, according to the U.S. Geological Survey. The temblor occurred at 6:25 a.m. Pacific time at a depth of 5.0 miles."

2) "When I in dreams behold thy fairest shade/ Whose shade in dreams doth wake the sleeping morn/ The daytime shadow of my love betray'd/ Lends hideous night to dreaming's faded form"

The first example seems informational, offering tonal neutrality, spare language, and a just-the-facts-ma'am approach to the earthquake. Indeed, the sample text was produced by an algorithm. The second example, however, is decidedly literary: iambic pentameter verse, an abab rhyme scheme, and enough references to dreams, sleep, and shadows that one might reasonably guess this is an excerpt from one of William Shakespeare's sonnets. Yet these lines were generated with SwiftKey, a machine-learning Android app. MIT student J. Nathan Matias trained the engine on Shakespeare's sonnets and developed a dataset of Shakespeare's words for the app to use. Matias generated these lines word by word, using only the suggested next words offered by the app, producing a sonnet eerily reminiscent of Shakespeare's own. Virtual monkeys, it seems, get ever closer to randomly generating Shakespeare.

But if the easy codes for "human" text are literary, we must ask which aesthetics, styles, and genres can "pass" as human. We would do well to remember Thomas Macaulay's insistence on the relationship between English literature, taste, and intellect in colonial India, one of many instances in the history of the British Empire when humanity and the literary were pegged to one another, and to Englishness. Hallmarks of this history in the digital sphere may be among the afterlives of colonialism. They linger in the assumptions that subtend the production and consumption of text online in the Anglophone Global North, from the shape of the datasets developed to the interpretations readers bring to them. While Podolny's piece speaks to the anxieties provoked by big data, algorithms that may be smarter than we are, or neoliberalism encroaching on journalism, we must attend to its unspoken question: what forms of "human" are authorized by the algorithm?