In early March, the online retailer Solid Gold Bomb provoked outrage when customers discovered that its Amazon store, which featured apparel bearing dozens of variants on the “Keep Calm [and Carry On]” slogan, included a t-shirt that read “Keep Calm and Rape A Lot.” Solid Gold Bomb generated the shirts, and Amazon offered them for sale in its marketplace. To complicate matters, it appears that Amazon doesn’t review the stores in its marketplace the way a mall owner might review physical storefronts, and, more unusually, Solid Gold Bomb didn’t review the shirts it offered for sale: the designs were computer generated. How far, then, should blame extend? When unsupervised automation produces results that everyone regrets, how do we decide whom to hold responsible, and when do we decide to hold anyone responsible in the first place?
Solid Gold Bomb’s official apology explained that its Amazon store featured millions of hypothetical shirts to be produced on demand, should anyone order one. The “Keep Calm” debacle resulted from an automated script that generated words to approximately fit the design’s syntax and layout. The resulting list, says SGB owner Michael Fowler, “was culled from 202k words to around 1100 and ultimately slightly more than 700 were used due to character length and the fact that I wanted to closely reflect the appearance of the original slogan graphically.” Clearly, the vendor is at fault for failing to eliminate phrases like “rape a lot” and “choke her” from a 700-word list of possible endings to the Keep Calm slogan. However, similarly automated practices regularly take place on a much larger scale across the internet. Determining accountability for these widespread and fundamental operations can be much less straightforward.
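The process Fowler describes, generating candidate endings and then filtering them by character length to match the original slogan’s layout, might be sketched roughly like this. (The word lists and length limit below are hypothetical stand-ins, not SGB’s actual data or code.)

```python
def generate_slogans(verbs, objects, max_len=12):
    """Build 'KEEP CALM AND <VERB> <OBJECT>' variants, keeping only
    endings short enough to mimic the original slogan's layout."""
    slogans = []
    for verb in verbs:
        for obj in objects:
            ending = f"{verb} {obj}".upper()
            if len(ending) <= max_len:
                slogans.append(f"KEEP CALM AND {ending}")
    return slogans

# A purely mechanical filter like this checks length, not meaning:
# nothing here would flag an offensive combination.
print(generate_slogans(["CARRY", "DANCE"], ["ON", "A LOT"]))
```

The point of the sketch is that the only criterion applied to each candidate is typographic fit; screening for meaning would have required a separate, human or semantic, review step that never happened.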
In some ways, Solid Gold Bomb’s generation of the offensive shirts can be seen merely as A/B testing gone awry. Offering thousands of options and printing shirts to order is a way of using user behavior to cull successful products. Presumably, if one of the quasi-randomly-generated shirts began to outstrip the others in sales, Solid Gold Bomb would have adjusted its inventory and marketing accordingly.
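“Using user behavior to cull successful products” can itself be stated in a few lines: list many variants, record which ones actually sell, and keep only the top performers. The sketch below is a hypothetical illustration of that logic, not anything SGB or Amazon actually runs.

```python
from collections import Counter

def cull_by_sales(variants, sales, keep=2):
    """Return the `keep` best-selling variants, ranked by observed sales.
    Variants with no sales at all simply drop out."""
    counts = Counter(s for s in sales if s in variants)
    return [v for v, _ in counts.most_common(keep)]

# Hypothetical catalog and order log: the sellers survive the cull,
# and no one ever has to look at the variants that didn't sell.
variants = ["shirt-A", "shirt-B", "shirt-C", "shirt-D"]
sales = ["shirt-B", "shirt-B", "shirt-A", "shirt-B", "shirt-A", "shirt-C"]
print(cull_by_sales(variants, sales))  # prints ['shirt-B', 'shirt-A']
```

Note that the unsold variants, including any offensive ones, are never inspected by this loop; they just sit in the catalog until someone notices them.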
With A/B testing, the line between savvy capitalism and unethical business practice can get fairly nebulous. Zynga, for example, relies on a practice that CEO Mark Pincus calls “ghetto testing.” One of Zynga’s approaches to game development is to advertise games that do not yet exist, in order to test consumer response to a basic premise. Says Pincus,
“We’ll put up a link for five minutes saying, ‘Hey! Do you ever fantasize about running your own hospital?’…We’ll put that up for five minutes, and the link will maybe take you to a survey, where you give us your email and we say when this comes out we’ll contact you. If you’re really doing ghetto, it says ‘404 not found’. That’s bad. So first you try to get the heat around it, you see how much do people like it, then…”
This isn’t all that dissimilar to Solid Gold Bomb’s approach. Like Zynga’s “ghetto-tested” games, the “Rape a Lot” shirts didn’t actually exist, and would only have been produced in accordance with user demand. In fact, Solid Gold Bomb didn’t misdirect potential buyers as deliberately as Zynga’s “ghetto testing” approach does.
In large, computer-conducted A/B testing campaigns, it becomes impossible to demand human supervision of every output. Solid Gold Bomb’s 700-word list for generating t-shirts should have been thoroughly scrutinized, of course, but operators running tests with vastly more permutations of A’s and B’s seem less accountable for each potential outcome. For example, it’s hard to believe it would be within a webmaster’s responsibility, or even her ability, to make sure that every possible banner ad on every single page of a site doesn’t combine unfortunately with the page’s content.
A/B testing is practically ubiquitous online, and most of its applications are unequivocally benign. Wikipedia, for one, famously self-published the test results of its 2010 fundraising push. Moreover, unsupervised, computer-conducted A/B testing can produce serendipitous results that no human could ever have engineered or anticipated. The popular Twitter account @horse_ebooks, for example, began as a poorly functioning spam account intended to drive traffic to an e-book site. But its garbled messages are so striking—and occasionally poignant (cf. a recent example)—that the bot currently has over 170,000 followers.
The problem, then, is that our expectations for internet commerce haven’t quite caught up with the techniques that drive internet commerce. If a store offers things for sale that we find offensive, our typical reaction is to get mad at the store—after all, being willing to profit off an item seems to imply some kind of endorsement of that item. Today, however, these assumptions about endorsement are challenged by the ubiquity of A/B testing and other automated content generators. A “ghetto test” by Zynga might not mean that the company fully endorses a game that simulates running a hospital. Similarly, the presence of an item in the Amazon Marketplace might not be enough to presume Amazon’s approval of that item.
[Parts 2-4 will be published over the next week]
- Ben Sobel, Kendra Albert, and JZ