Facebook ran an A/B test. Get out the torches and pitchforks!

Posted on June 29, 2014 12:22 pm | Facebook | Research

The Pitchfork and Torches Crowd Comes After Facebook Again

One of the things the Internet is best at is righteous indignation. The crowd can take up its pitchforks and torches in a heartbeat. Virtual lynchings are becoming the norm and, just as when lynchings happened IRL, they often target innocent victims. The media, quick to jump on a meme gaining velocity, spreads the stories with equal disregard for critical thinking or fact checking.

Consider fast-food chain KFC, vilified and pilloried when the crowd became outraged—outraged, I tell you—that an employee in a Mississippi restaurant asked the grandmother of a young girl disfigured in a pit bull attack to leave because her appearance was upsetting other diners. The public bashing of KFC led the chain to donate $30,000 to the girl’s care, find her a specialist, apologize, and promise action. It took a couple days to figure out that the incident never happened. In fact, the restaurant regularly serves patients at a nearby hospital whose appearance is far worse. (To its credit, and in a great PR move, KFC will stand by its donation.)

Today’s torches-and-pitchforks uprising is aimed at a popular target, Facebook. And after reading a dozen articles and analyses—most expressing horror and disgust—I can only conclude that the mob mentality in this case has infected the Net for one simple reason: It’s Facebook. But the mob has definitely formed: Search the keywords “Facebook,” “psychological,” and “experiment” in Google News and you’ll find more than 2,500 results.

News Feeds were manipulated to test user response to sentiment

You’ve probably already heard the story, and the characterizations have perhaps set your blood boiling. If not, here’s the recap:

Facebook tweaked the algorithm that dictates what appears in the News Feeds of almost 700,000 of its users to determine whether the sentiment of the content they saw affected the sentiment of their own posts. Not surprisingly, an increase in negative tone led users to write more negative posts, while a bump in positive content resulted in more positive updates.
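
To make the mechanics concrete, here is a rough sketch of what that kind of test looks like in code: posts automatically flagged as emotionally positive or negative are withheld from a treated user's feed at some rate, and the emotional tone of the user's own subsequent posts is measured. Everything here (the word lists, the omission rate, the function names) is an assumption for illustration, not Facebook's actual implementation.

```python
import random

# Illustrative word lists; the real study used automated emotion detection.
POSITIVE_WORDS = {"happy", "great", "love", "wonderful"}
NEGATIVE_WORDS = {"sad", "awful", "hate", "terrible"}

def contains_any(post: str, words: set) -> bool:
    return any(word in post.lower().split() for word in words)

def filtered_feed(posts, condition, omit_rate=0.3, rng=random):
    """Return the posts a treated user would see, with some emotional posts withheld."""
    shown = []
    for post in posts:
        emotional = (
            (condition == "reduce_positive" and contains_any(post, POSITIVE_WORDS))
            or (condition == "reduce_negative" and contains_any(post, NEGATIVE_WORDS))
        )
        if emotional and rng.random() < omit_rate:
            continue  # withhold this post from the treated user's News Feed
        shown.append(post)
    return shown

# The outcome measure: how emotional are the user's own subsequent posts?
def emotion_rate(user_posts, words):
    return sum(contains_any(p, words) for p in user_posts) / max(len(user_posts), 1)
```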

The experiment was published in the Proceedings of the National Academy of Sciences. The groundwork for the study comes from three earlier studies of how users’ emotions are affected on Facebook, none of which produced the reaction this one did (though, to be fair, those studies analyzed data without manipulating the content users saw).

The outraged and the media have labeled the study a “psychological experiment,” though the Facebook representative who led the study is a data scientist, not a psychologist. Working with the data scientist, Adam Kramer, were a Cornell University professor and a UCSF post-doctoral fellow. It’s worth pointing out that neither of Kramer’s colleagues, nor Cornell, nor UCSF, has been targeted for any abusive or inappropriate behavior. Apparently, neither colleague saw any ethical or moral problem with the research methodology, despite their affiliation with upstanding research institutions. In fact, according to an Atlantic report, their university review boards also approved it.

But because the experiment was conducted in secret—permitted, Facebook says, by its terms of service, which give the company the right to conduct such experiments—the pitchfork-and-torches crowd has swarmed into the streets.

A regular practice everywhere

Meanwhile, an article posted to RepCap points out that the team at Visual Website Optimizer has been using A/B testing software for years, discovering which case studies inspire people to spread the word, produce leads, and drive sales. The article notes that this A/B testing produced one post that generated more than $13,000 in hard dollars and 92 assisted sales conversions.

Let's be clear about A/B testing. It's a psychological experiment to see how people respond emotionally to one approach to content versus another. A/B testing is popular among email marketers, who use it to find the subject line most likely to get someone to open the email. A/B testing, according to Wikipedia, "is a simple randomized experiment with two variants, A and B, which are the control and treatment in the controlled experiment."
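
For readers who haven't run one, here is a minimal sketch of that kind of randomized test in Python. The subject lines, open rates, and sample size are invented for illustration; the point is simply that users are randomly split into a control (A) and a treatment (B), and the outcome of interest is compared.

```python
import random
from statistics import mean

def assign_variant(rng=random):
    """Randomly assign a user to the control (A) or treatment (B) group."""
    return "A" if rng.random() < 0.5 else "B"

def open_rates(results):
    """results: list of (variant, opened) tuples; returns the open rate per variant."""
    rates = {}
    for variant in ("A", "B"):
        outcomes = [opened for v, opened in results if v == variant]
        rates[variant] = mean(outcomes) if outcomes else 0.0
    return rates

# Simulated campaign: subject line B is assumed to perform slightly better.
results = []
for _ in range(10_000):
    variant = assign_variant()
    true_rate = 0.18 if variant == "A" else 0.21
    results.append((variant, random.random() < true_rate))

print(open_rates(results))  # e.g. {'A': 0.179, 'B': 0.211}
```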

If psychological experiments by bloggers and email marketers aren't enough, then look to Upworthy, whose writers produce up to 25 headlines for every post, then employ testing tools to determine which will provoke the desired emotional response. That's right: Upworthy engages in a psychological experiment with every single post it writes.

In each of these cases—and thousands upon thousands more—nobody asks the audience if it’s okay to run the experiment on them. Outrage over this standard practice is reserved for Facebook. Why? Because it’s Facebook, the social network everybody loves to hate.

When you stop to consider that the Visual Website Optimizer blog post case study convinced people to spend money, you might conclude that it's far more nefarious than the Facebook experiment, which just wanted to figure out how people react to what they see. But, of course, that's not how the knee-jerk crowd reacted. David Holmes, writing for Pando Daily, labeled the experiment “unethical.” He fretted over Facebook's ability and willingness to wield its power to shift users' emotional states.

You know, like advertising tries to do with every single TV spot ever produced.

Of course, he's also vexed that Facebook conducted the experiment without telling users. Again, though, no A/B tester anywhere, ever, tells the people being tested.

Did the test rise to the level of “psychological experimentation” as we usually think of it?

In fact, labeling the algorithm tweak that produced the results a “psychological experiment”—a label never used in the National Academy of Sciences paper—is inflammatory. When we talk of psychological experiments, we’re usually talking about things like the Stanford Prison Experiment, in which Philip Zimbardo cast students as prisoners and guards, resulting in guards abusing the prisoners, who suffered anxiety and stress after just a few days. Or the Milgram Obedience Experiment, in which participants were asked to shock a subject who gave the wrong answer to a query; most were happy to deliver the highest voltage despite the pain the subject appeared to be suffering.

A subtle and short-term adjustment to the sentiment of the posts you see on Facebook hardly rises to the level of these studies. The Facebook test is far closer to a headline A/B test that determines which version produces the desired behavior from the audience. Yet labeling the Facebook test a “psychological experiment” automatically elevates it to the level of the Asch Conformity Experiment.

If Upworthy had run a similar experiment to see which headlines you clicked and which you passed over, and published the results in the same place, the collective response of the Internet probably would have been, “Huh; how about that?”

Tempest in a teacup

The arguments I’ve heard against Facebook’s behavior fall into a few categories:

  • They didn’t get permission—As I noted, nobody gets permission in A/B testing. Ever. And there’s a simple reason for that. If I tell you that I’m going to see if slightly more positive or negative posts produce a particular response, your behavior will change. The purity of the sample is polluted and the results invalidated.

  • It’s an abuse of Facebook’s enormous power—What’s the goal of the research? Is it, as VentureBeat’s Mark Sullivan asks, “to mess with people’s feeds and moods on a regular basis?” Again, it’s head-shakingly amazing to me that I even need to point out that this is what every online publisher does every single minute of every single day. Even my blog posts, tweets and Facebook updates are designed to get you to react. Those institutions with the resources conduct, cumulatively, billions of dollars’ worth of research to figure out how to do it best.

  • Facebook shouldn’t mess with your News Feed—You don’t see every post from every Facebook friend or from every page you’ve liked. (Among some Facebook marketers, there is a remarkable belief that someone who likes your brand page has signaled that they want every single update from that page.) What you get is an algorithmically curated collection based on what you’re interested in; a toy sketch of what that kind of ranking looks like follows this list. So Facebook manipulates your News Feed and always has. The algorithm is routinely tweaked. (That’s why some marketers believe their organic reach has fallen.) I’m just not bothered that they would run a test to see if positive updates in the Feed result in positive updates from the user. (Other, that is, than the obvious, “Well, duh” reaction I had to the study.)

  • Facebook is large and it’s not a mailing list, which alters the rules—What rules? Where’s the size cutoff? Where’s the guideline that says it’s okay for this company to run A/B tests but, based on some clearly delineated criterion, not okay for that company to do it?

  • What chutzpah Facebook had to publish the results in a scientific journal!—Keep in mind, Cornell and UCSF researchers were involved, too, and they all thought the results were interesting. (More interesting than I did, frankly.) But clearly nobody thought it was an ethical or moral breach that would lead to a virtual lynching. (Of Facebook. Again, neither UCSF nor Cornell has been called out for its participation.)
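
As promised in the third point above, here is a toy sketch of what algorithmic curation can look like: every candidate post gets a relevance score and only the top-scoring posts are shown. The fields and weights are hypothetical, chosen only to illustrate the idea; they are not Facebook's ranking formula.

```python
from dataclasses import dataclass

@dataclass
class Post:
    author_affinity: float  # how often the user interacts with this friend or page
    engagement: int         # likes, comments, and shares the post has received
    age_hours: float        # how long ago the post was published

def relevance(post: Post) -> float:
    """Toy score favoring close connections, popular posts, and recency."""
    return 2.0 * post.author_affinity + 0.1 * post.engagement - 0.05 * post.age_hours

def curated_feed(candidates: list[Post], limit: int = 20) -> list[Post]:
    """Show only the highest-scoring posts, not everything a user's connections publish."""
    return sorted(candidates, key=relevance, reverse=True)[:limit]
```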

Facebook is as matter-of-fact about the experiment as I am. Here’s what the company said in a statement to Forbes:

We do research to improve our services and to make the content people see on Facebook as relevant and engaging as possible. A big part of this is understanding how people respond to different types of content, whether it’s positive or negative in tone, news from friends, or information from pages they follow. We carefully consider what research we do and have a strong internal review process. There is no unnecessary collection of people’s data in connection with these research initiatives and all data is stored securely.

Ultimately, I’m untroubled by the experiment: it is hardly uncommon among marketing organizations, it did no harm, it was not used to get people to spend money or otherwise behave differently than they otherwise might have, and it did not rise to what most people consider “psychological experimentation.” It may be a tempest in a teacup, but that won’t stop the vitriol from the pitchfork-and-torch mob. After all, if you’re not hating on Facebook, you’re not one of the cool kids.

 

Comments

  • 1.Do you conduct a marketing "A/B test" with federal funding?

    These aren't exactly the same thing, Shel.

    http://egoebelbecker.me/2014/06/29/facebook-trust-and-the-clueless-engineer/

    and especially this... http://skepchick.org/2014/06/facebooks-unethical-research-project/

    And for the bigger implications... https://medium.com/message/engineering-the-public-289c91390225

    You already know, as a marketing professional whom I look up to, what you can actually accomplish with an A/B test.

    Is that really "the same" as what you could possibly accomplish by influencing mood via a social discussion platform like Facebook with a billion users (680,000 tested in this experiment), via omission of words or otherwise?

    I have yet to see the impact of how the content and context of the influenced posts changed as a result of this experiment.

    Then, should Facebook, a social platform with a billion users (even though they only "tested" 680,000), be trying to influence anything at all in the public sphere without some measure of informed consent (mood or otherwise)?

    I don't think so.

    Is this problem limited to Facebook? Probably not.

    Joseph Ratliff | June 2014

  • 2.But . . . but . . . but . . . didn't this experiment result in thousands of people committing suicide? ;)

    Rick Ladd | June 2014 | United States

  • 3.Joseph, thanks for your comment.

    As Facebook noted in its own response, what it's trying to influence is the quality of the News Feed to create a better experience for users. It's why they test all the time. I tried to emphasize that, unlike other A/B tests, this one wasn't trying to get you to buy or vote or otherwise act -- beyond the impact the items in the feed had on your mood. Advertisers try to influence your mood with every ad they present.

    Informed consent would have altered behaviors. You can't say, "Is it okay if we subtly alter your News Feed to see if more upbeat updates lead you to create more upbeat updates of your own?" It's like asking someone not to think of a pink elephant. (Go ahead. Try.)

    The size of Facebook's sample shouldn't be a criterion for whether to conduct a test. (And if it is, I ask again: What's the magic cutoff number for audience size?)

    Again, there was nothing going on here that bloggers, email marketers, other social networks and content providers don't do on a routine basis. If it's wrong, it has to be wrong for everybody. If it's not, then it's not. For everybody. (And the review boards at UCSF and Cornell didn't think so.)

    I'll stand by this. It's a yawn that has led to the unjustified, knee-jerk outrage demonstrated by the posts in your links.

    Shel Holtz | June 2014

  • 4.Every. single. time Facebook comes under fire, I tell people that when you're using a free platform, the owners are paying for it with your data. And every. single. time Facebook comes under fire, few (if any) users change their habits. I will say that Facebook's gotten much better about transparency over the years and has attempted to be more proactive. Regardless, this is a non-event. Oh, and the outrage-and-user-entitlement scenario happens on most any free platform when most any change is made. The news picks it up and we all move on with our lives.

    Kevin Dugan | June 2014 | United States

  • 5.You must realize that a test to actively change the psychological disposition of customers was in no way a normal A/B test; it does not follow the ethical guidelines of market research, and I have serious questions about the legal coverage they have under their terms and conditions, which say they can "use customer data," not change it.

    In the end, it doesn't really matter what you or I think. The bottom line is that the user base is pissed and trust has been jeopardized (rightly or wrongly) which is bad business. So Shel, that's what it is -- bad business.

    Mark W. Schaefer | June 2014 | United States

  • 6.Mark, if I test one headline versus another to see which one provokes a more emotional response, I fail to see how that's any different. I'm manipulating my audience based on research.

    The user base is pissed weekly at Facebook. If this causes 1,000 people to cancel their accounts or use it differently, I'll be shocked.

    Legal authorities have said Facebook doesn't face any legal risk, and it's important to note -- as I did in the post -- that both Cornell's and UCSF's review boards authorized the test, which means they didn't see a problem with it either.

    Yawn.

    Shel Holtz | June 2014

  • 7.I'll duplicate a comment I left on Chris Penn's post here for posterity:

    This specific instance was completely unethical--no gray line. This was a psychological experiment on humans, not A/B testing of a headline. The APA code of ethics calls for informed consent. My research code of ethics calls for informed consent. There was no informed consent here. There was also no independent review of the primary data collection method--ANY academic or psychological study requires that. And this was submitted as academic research. I hold it to those standards. It's not an ethical gray area. Period.

    Tom Webster | June 2014 | United States

  • 8.Forget A/B testing, Tom.

    Let's say I'm an advertiser. I create three different versions of a commercial. My goal is to get people all weepy. (You've seen these -- starving children for World Vision, injured animals for the ASPCA, polar bears on small pieces of ice for the World Wildlife Fund.) I run a different commercial in each of three geographic areas to see which one results in more donations, then choose that one for national distribution.

    It is clearly an experiment to see if I can manipulate people's emotions.

    Sorry, I just don't see what Facebook did as any different from this at all. It happens all the time. The only difference here was that in this case, in addition to figuring out what makes for a better News Feed (as Facebook does daily with hundreds and hundreds of similar tests), a couple researchers were invited to participate so they could write a paper about it.

    It was a data experiment conducted by a data scientist, not a psychologist (which is why, among other reasons, the APA code of ethics is a non-starter), and not even the review boards at Cornell or UCSF saw a problem with it. How do you suppose that happened -- two esteemed university review boards, meeting separately, reaching the same conclusion?

    It's worth noting, as I did in my post, that nowhere is this called a "psychological experiment" in the research paper. That's what social media called it, and what everyone has reacted to. But if this is wrong for Facebook, it's also wrong for thousands of advertisers and marketers who conduct very similar studies every day.

    Shel Holtz | July 2014

  • 9.I'm shocked you don't see a difference between a brand A/B testing its OWN content and a communication platform A/B testing YOUR personal communications. They are nowhere near the same. Yes, Facebook manipulates our news feed, but we trust it (or want to) to do so in order to surface the most relevant content, not to manipulate our emotions.

    I think you're wrong here. While Facebook may be riding high, it is long past time the company start worrying more about earning trust. It may look invincible and feel invincible, but no company can operate indefinitely with such low levels of Customer satisfaction and trust.

    Augie Ray | July 2014 | New York, NY

  • 10.Augie, there's a lot of hysteria around the notion that Facebook "manipulated" emotions. A review of the analyses by people who really understand the research leads to the conclusion that this is an inaccurate characterization of the experiment. (I summarized some of these posts here:
    http://linkd.in/1qt1Qa8)

    As one of these posts, from a post-doctoral research fellow, put it, "The hysteria surrounding a 'Facebook manipulates your emotions' or 'is transmitting anger' story got well ahead of any sober reading of the research reported by the authors in the paper."

    A research associate in the psychology department at the University of Texas at Austin concurs:
    "...the suggestion that Facebook 'manipulated users' emotions' is quite misleading. Framing it that way tacitly implies that Facebook must have done something specifically designed to induce a different emotional experience in its users. In reality, for users assigned to the experimental condition, Facebook simply removed a variable proportion of status messages that were automatically detected as containing positive or negative emotional words."

    You may want to read this summary, and perhaps even some of the excellent posts from which these excerpts are drawn. You may change your mind. Most people are reacting to media and social media reports, not the research itself.

    Shel Holtz | July 2014

  • 11.Shel,

    The study found that by removing "a variable portion of status messages that were automatically detected as containing positive or negative messages," users' emotions were affected. That means that, BY THE VERY DEFINITION OF THE STUDY FINDINGS, researchers manipulated peer-to-peer communications and altered people's emotion. Facebook (and you) cannot have it both ways--either the altering of people's newsfeeds resulted in no impact to users' attitudes (in which case, I agree there is no manipulation) or people's attitudes were impacted by the changes made to their newsfeeds (which absolutely means manipulation.)

    Even aside from the reality of the study findings:

    - Consumers did not volunteer for this study, as people would if this were an actual study. The fact Facebook is relying on legal language in the Terms and Conditions as a way to combat the backlash demonstrates an amazing lack of understanding (or perhaps profound lack of care) for consumer needs and expectations on the part of Facebook.

    - By the way, I'm pretty sure some wise man once said "the customer’s perception is most definitely reality," acknowledging that consumer perception trumps logical answers in the minds of consumers. That person was you: http://holtz.com/blog/crisis-communication/your-cold-sterile-corporate-statement-doesnt-work-in-social-networks/3987/0. Facebook users feel manipulated as a result of this study, and rather than acknowledging their perception is reality, in this instance you're taking them to task. PR and communication professionals like you and me constantly remind clients and peers that consumer perception is neither right nor wrong--it simply is.

    - Lastly, if I have not convinced you, then please send me your email password. I'd like to spend a few weeks deleting some select messages that your friends and family intended for you to see. I mean, that won't be manipulation, right? Just consider it, um, A/B testing!

    Sorry, my friend, but I think you are wrong on this one. Research has shown Facebook earns little to no trust from users, and this study simply reinforces that perception. Facebook is acting cocky and invincible. I say this as a fan of Facebook: It is not invincible, and it is long past time for Facebook to start behaving in a manner that creates affinity and trust instead of daring customers to decrease their reliance on Facebook as a communication medium.

    Augie Ray | July 2014 | New York, NY

  • 12.We'll agree to disagree on this one, and a lot of people who know a lot about this space agree with me. I've reviewed a dozen articles by professional researchers who all seem to concur that this is a mountain being constructed from a molehill. To your points (and quoting from pieces by research fellows and associates):

    Manipulation -- I addressed the fact that this doesn't rise to a scientific definition of "manipulation" in my earlier comment. As for the actual impact, "The largest effect size reported had a Cohen’s d of 0.02–meaning that eliminating a substantial proportion of emotional content from a user’s feed had the monumental effect of shifting that user’s own emotional word use by two hundredths of a standard deviation. In other words, the manipulation had a negligible real-world impact on users’ behavior. To put it in intuitive terms, the effect of condition in the Facebook study is roughly comparable to a hypothetical treatment that increased the average height of the male population in the United States by about one twentieth of an inch (given a standard deviation of ~2.8 inches). Theoretically interesting, perhaps, but not very meaningful in practice."

    Consumers didn't volunteer -- "Facebook users are, in the normal course of things, not considered participants in a research study, no matter how or how much their emotions are manipulated. That’s because the HHS’s definition of research includes, as a necessary component, that there be an active intention to contribute to generalizable new knowledge...there's really no ambiguity over whether Facebook’s normal operations–which include constant randomized, controlled experimentation on its users–constitute research in this sense. They clearly don't...it’s only the fact that Kramer et al wanted to publish their results in a scientific journal that opened them up to criticism of research misconduct in the first place...Your interactions with Facebook–no matter how your user experience, data, or emotions are manipulated–are not considered research unless Facebook manipulates your experience with the express intent of disseminating new knowledge to the world." And "...the IRB might also plausibly have found that participating in the study without Common Rule-type informed consent would not 'adversely affect the rights and welfare of the subjects,' since Facebook has limited users’ rights by requiring them to agree that their information may be used 'for internal operations, including troubleshooting, data analysis, testing, research and service improvement.'"

    However, Augie, I agree entirely that Facebook did a terrible job communicating this.

    As for my email? First, I haven't agreed to terms of service allowing any manipulation of my email, and second, I get every email anybody sends me (minus the spam filtered out algorithmically), not a curated collection picking and choosing email messages based on a formula designed to give me a desirable sampling.

    In any case, this isn't just me against the world. A significant number of academics who work in research have weighed in and found the controversy to be based on misunderstandings of what actually was done.

    Shel Holtz | July 2014

  • 13.I'd just like to point out that calling something a "psychological experiment" does NOT elevate it to the level of anything. It simply classifies it as what it is - a study of behaviors and thought processes.

    You make it seem in your article that all psychological experiments are extreme and can cause damage to the participants. This is blatantly false, and in fact one of the requirements of studies is that participants sign an informed consent agreement covering what the study might cause, such as side effects.

    Brittany T | July 2014 | United States

  • 14.Thanks for your comment, Brittany. It was not my intent to suggest all psychological experiments are extreme, only to point out that the characterization by the media was intended to make this one SOUND extreme. Nothing more!

    However, if you follow the link to the LinkedIn post referenced above, you'll find that informed consent is not a requirement in every circumstance and that this test -- whatever you call it -- doesn't rise to the level of a psychological experiment because (among other reasons) "Facebook users are, in the normal course of things, not considered participants in a research study, no matter how or how much their emotions are manipulated. That’s because the HHS’s definition of research includes, as a necessary component, that there be an active intention to contribute to generalizable new knowledge...there's really no ambiguity over whether Facebook’s normal operations–which include constant randomized, controlled experimentation on its users–constitute research in this sense. They clearly don't...it’s only the fact that Kramer et al wanted to publish their results in a scientific journal that opened them up to criticism of research misconduct in the first place...Your interactions with Facebook–no matter how your user experience, data, or emotions are manipulated–are not considered research unless Facebook manipulates your experience with the express intent of disseminating new knowledge to the world." And "...the IRB might also plausibly have found that participating in the study without Common Rule-type informed consent would not 'adversely affect the rights and welfare of the subjects,' since Facebook has limited users’ rights by requiring them to agree that their information may be used 'for internal operations, including troubleshooting, data analysis, testing, research and service improvement.'"

    (Sorry to be repeating a quote from the LinkedIn post I had already used in a comment, but I wanted to make sure it was available in the context of your comment.)

    Shel Holtz | July 2014

  • 15.Looks like it wasn't a simple A/B test after all, Shel.

    http://laboratorium.net/archive/2014/09/23/facebook_and_okcupids_experiments_were_illegal

    Illegal and unethical. That, or marketers need to rethink what defines their "simple A/B tests."

    Joseph Ratliff | September 2014

  • 16.Thanks for the comment, Joseph. I read the piece you cited, and the claim that the test was illegal is only that -- a claim. The author of the post has submitted a letter. No court or government entity has ruled that it was, in fact, illegal. Facebook maintains informed consent was obtained via its terms of service, and some legal sources (not all) have agreed. But simply proclaiming something illegal is not the same as a legal entity finding that it was, in fact, a violation of the law.

    Shel Holtz | September 2014 | Concord, CA
