Allan outlined how Facebook defines hate speech:
Our current definition of hate speech is anything that directly attacks people based on what are known as their “protected characteristics”—race, ethnicity, national origin, religious affiliation, sexual orientation, sex, gender, gender identity or serious disability or disease.
There is no universally accepted answer for when something crosses the line. Although a number of countries have laws against hate speech, their definitions of it vary significantly.
There is very important academic work in this area that we follow closely. Timothy Garton Ash, for example, has created the Free Speech Debate to look at these issues on a cross-cultural basis. Susan Benesch established the Dangerous Speech Project, which investigates the connection between speech and violence. These projects show how much work is left to be done in defining the boundaries of speech online, which is why we’ll keep participating in this work to help inform our policies at Facebook.
Allan said that over the past two months, Facebook has deleted some 66,000 posts that were reported as hate speech each week, or about 288,000 per month, and he detailed why “too often we get it wrong”:
Sometimes, it’s obvious that something is hate speech and should be removed because it includes the direct incitement of violence against protected characteristics or degrades or dehumanizes people. If we identify credible threats of imminent violence against anyone, including threats based on a protected characteristic, we also escalate that to local law enforcement.
But sometimes, there isn’t a clear consensus because the words themselves are ambiguous, the intent behind them is unknown or the context around them is unclear. Language also continues to evolve, and a word that was not a slur yesterday may become one today.
Context and intent mark the most difficult things for Facebook’s content moderators to determine, and Allan offered examples of both:
What does the statement, “Burn flags not fags,” mean? While this is clearly a provocative statement on its face, should it be considered hate speech? For example, is it an attack on gay people, or an attempt to “reclaim” the slur? Is it an incitement of political protest through flag burning? Or, if the speaker or audience is British, is it an effort to discourage people from smoking cigarettes (fag being a common British term for cigarette)? To know whether it’s a hate speech violation, more context is needed.
Often a policy debate becomes a debate over hate speech, as two sides adopt inflammatory language. This is often the case with the immigration debate, whether it’s about the Rohingya in Southeast Asia, the refugee influx in Europe or immigration in the U.S. This presents a unique dilemma: On the one hand, we don’t want to stifle important policy conversations about how countries decide who can and can’t cross their borders. At the same time, we know that the discussion is often hurtful and insulting.
When the influx of migrants arriving in Germany increased in recent years, we received feedback that some posts on Facebook were directly threatening refugees or migrants. We investigated how this material appeared globally and decided to develop new guidelines to remove calls for violence against migrants or dehumanizing references to them—such as comparisons to animals, to filth or to trash. But we have left in place the ability for people to express their views on immigration itself. And we are deeply committed to making sure Facebook remains a place for legitimate debate.
People’s posts on Facebook exist in the larger context of their social relationships with friends. When a post is flagged for violating our policies on hate speech, we don’t have that context, so we can only judge it based on the specific text or images shared. But the context can indicate a person’s intent, which can come into play when something is reported as hate speech.
There are times someone might share something that would otherwise be considered hate speech but for non-hateful reasons, such as making a self-deprecating joke or quoting lyrics from a song. People often use satire and comedy to make a point about hate speech.
Or they speak out against hatred by condemning someone else’s use of offensive language, which requires repeating the original offense. This is something we allow, even though it might seem questionable since it means some people may encounter material disturbing to them. But it also gives our community the chance to speak out against hateful ideas. We revised our community standards to encourage people to make it clear when they’re sharing something to condemn it, but sometimes their intent isn’t clear, and anti-hatred posts get removed in error.
He also discussed the impact of mistakes Facebook has made in its efforts to eliminate hate speech, and provided one example:
Our mistakes have caused a great deal of concern in a number of communities, including among groups who feel we act—or fail to act—out of bias. We are deeply committed to addressing and confronting bias anywhere it may exist. At the same time, we work to fix our mistakes quickly when they happen.
Last year, Shaun King, a prominent African-American activist, posted hate mail he had received that included vulgar slurs. We took down Mr. King’s post in error—not recognizing at first that it was shared to condemn the attack. When we were alerted to the mistake, we restored the post and apologized. Still, we know that these kinds of mistakes are deeply upsetting for the people involved and cut against the grain of everything we are trying to achieve at Facebook.
What are Facebook’s next moves? Allan wrote:
People often ask: can’t artificial intelligence solve this? Technology will continue to be an important part of how we try to improve. We are, for example, experimenting with ways to filter the most obviously toxic language in comments so that they are hidden from posts. But while we’re continuing to invest in these promising advances, we’re a long way from being able to rely on machine learning and AI to handle the complexity involved in assessing hate speech.
That’s why we rely so heavily on our community to identify and report potential hate speech. With billions of posts on our platform—and with the need for context in order to assess the meaning and intent of reported posts—there’s not yet a perfect tool or system that can reliably find and distinguish posts that cross the line from expressive opinion into unacceptable hate speech. Our model builds on the eyes and ears of everyone on platform—the people who vigilantly report millions of posts to us each week for all sorts of potential violations. We then have our teams of reviewers, who have broad language expertise and work 24 hours a day across time zones, to apply our hate speech policies.
We’re building up these teams that deal with reported content: over the next year, we’ll add 3,000 people to our community operations team around the world, on top of the 4,500 we have today. We’ll keep learning more about local context and changing language. And because measurement and reporting are an important part of our response to hate speech, we’re working on better ways to capture and share meaningful data with the public.
Image courtesy of grinvalds/iStock.