Facebook: Hate Speech Accounts for 10 or 11 of Every 10,000 Pieces of Content Viewed

Facebook began measuring the prevalence of hate speech on its platform globally in the November 2020 edition of its Community Standards Enforcement Report, pegging it at 0.10% to 0.11%, or 10 to 11 views of hate speech for every 10,000 views of content.

Vice president of integrity Guy Rosen said in a Newsroom post Thursday that Facebook’s investments in artificial intelligence have helped it to proactively detect hate speech content on its platform before it is reported by users. Rosen said during a press call Thursday, “Think of prevalence like an air quality test to determine the percentage of pollutants.”

Facebook product manager for integrity Arcadiy Kantor went into further detail in a separate Newsroom post, explaining that the social network calculates prevalence by selecting a sample of content that was seen on Facebook and labeling how much of that content violates its hate speech policies. In order to account for language and cultural context, these samples are sent to content reviewers across different languages and regions.

Kantor pointed out that the number of times that content is seen is not evenly distributed, writing, “One piece of content could go viral and be seen by lots of people in a very short amount of time, whereas other content could be on the internet for a long time and only be seen by a handful of people.”

He also detailed the challenges of determining what constitutes hate speech, writing, “We define hate speech as anything that directly attacks people based on protected characteristics including race, ethnicity, national origin, religious affiliation, sexual orientation, sex, gender, gender identity or serious disability or disease,” but adding, “Language continues to evolve, and a word that was not a slur yesterday may become one tomorrow.
This means content enforcement is a delicate balance between making sure we don’t miss hate speech while not removing other forms of permissible speech.”

Facebook continues to use a combination of user reports and AI to detect hate speech on Facebook and Instagram, and Kantor addressed the challenges the company faces with the human part of that equation, such as people from areas with lower digital literacy not being aware that they can report content, or people reporting content that they don’t like, but that doesn’t violate Facebook’s policies, such as TV show spoilers or posts about rival sports teams.

As for AI, he wrote, “When we first began reporting our metrics for hate speech, in the fourth quarter of 2017, our proactive detection rate was 23.6%. This means that of the hate speech we removed, 23.6% of it was found before a user reported it to us. The remaining majority of it was removed after a user reported it. Today, we proactively detect about 95% of hate speech content we remove. Whether content is proactively detected or reported by users, we often use AI to take action on the straightforward cases and prioritize the more nuanced cases, where context needs to be considered, for our reviewers.”

Those content moderators may beg to differ, however, as an open letter sent Wednesday by over 200 of them blasted Facebook’s AI systems, saying, “Management told moderators that we should no longer see certain varieties of toxic content coming up in the review tool from which we work—such as graphic violence or child abuse, for example. The AI wasn’t up to the job. Important speech got swept into the maw of the Facebook filter—and risky content, like self-harm, stayed up. The lesson is clear. Facebook’s algorithms are years away from achieving the necessary level of sophistication to moderate content automatically.
They may never get there.”

Chief technology officer Mike Schroepfer said during the press call that when Facebook’s AI systems get better, they catch things faster, instead of waiting for someone to report them, and those AI systems can amplify the work of human reviewers by catching content in all examples and all variations so that reviewers can “spend time and energy working on the more subtle things that are hard for even our most advanced AI to detect.”

Rosen said during the press call that from March 1 through Election Day, over 265,000 pieces of content were removed from Facebook and Instagram in the U.S. for violating the social network’s voter interference policies, and warning labels were displayed on more than 180 million pieces of content on Facebook. In addition, 140 million people visited the Voting Information Center across Facebook’s platform, including over 33 million on Election Day alone.

As for Covid-19 misinformation, more than 12 million pieces of content were removed from Facebook and Instagram from March through October, while warning labels were displayed on over 167 million pieces of content on Facebook.

Rosen said some enforcement metrics on Facebook and Instagram have returned to pre-pandemic levels, despite the disruption Covid-19 has caused to the company’s workforce.

During the third quarter of 2020, Facebook took action on:

- 22.1 million pieces of hate speech content, about 95% of which were proactively identified.
- 19.2 million pieces of violent and graphic content, up from 15 million in the second quarter of 2020.
- 12.4 million pieces of child nudity and sexual exploitation content, up from 9.5 million in the second quarter.
- 3.5 million pieces of bullying and harassment content, up from 2.4 million in the second quarter.
As for Instagram, action was taken on:

- 6.5 million pieces of hate speech content, up from 3.2 million in the second quarter of 2020, about 95% of which was proactively identified (up from about 85% in the second quarter).
- 4.1 million pieces of violent and graphic content, up from 3.1 million in the second quarter.
- 1 million pieces of child nudity and sexual exploitation content, up from 481,000 in the second quarter.
- 2.6 million pieces of bullying and harassment content, up from 2.3 million in the second quarter.
- 1.3 million pieces of suicide and self-injury content, up from 277,400 in the second quarter.

Schroepfer penned a Newsroom post of his own to explain how Facebook’s AI tools work. The social network built and deployed a new reinforcement learning framework, Reinforced Integrity Optimizer, which uses real-world data from Facebook’s production systems to optimize its models so that rather than remaining static, they account for variables such as sarcasm, slang, text/image combinations or images that are perceived differently in different cultures, regions and languages.

A mechanism called Transformers then teaches Facebook’s AI systems which parts of text are the most important to focus on. The efficiency of these Transformers is boosted by a new architecture developed by the company, Linformer, which Schroepfer said “provides a vastly more efficient way to use massive, cutting-edge models to understand content.”

And for cases such as combinations of images and text that may make posts offensive, Facebook developed Whole Post Integrity Embedding, a pretrained universal representation of content for integrity problems. Finally, the social network deployed XLM-R, a model that leverages its RoBERTa AI architecture, to improve its hate speech classifiers in multiple languages across Facebook and Instagram.
Rosen detailed updates to Facebook’s community standards, many of which have been previously revealed, writing, “For example, our policy that prohibits posting misinformation and unverifiable rumors that contribute to the risk of imminent violence or physical harm, and our policy to add a warning label to sensitive content such as imagery posted by a news agency that depicts child nudity in the context of famine, genocide, war crimes or crimes against humanity. While these policies are not new, we are sharing more details today to be even more transparent about our enforcement practices.”

Vice president of content policy Monika Bickert said during the press call that the company’s policies are crafted by a team of 200 people in 11 offices around the world, working with hundreds of organizations and experts globally.

Schroepfer concluded, “While we are constantly improving our AI tools, they are far from perfect. We know that there’s more work to be done, and we are building and testing new systems to help us do more to protect the people who use our platform … Detecting hate speech is not only a difficult challenge: It’s also a constantly evolving one … With more flexible and adaptive AI, we’re confident that we can continue to make real progress.”
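
For readers who want to see the arithmetic behind the two headline metrics, here is a minimal Python sketch of prevalence and proactive detection rate as Kantor describes them. It is not Facebook’s implementation: the function names, the `view_log` structure and the `is_violating` labeler (a stand-in for human reviewers) are all hypothetical, and simple random sampling of views is an assumption. The point it illustrates is that prevalence is measured over views rather than posts, so content that is seen often counts more.

```python
import random


def estimate_prevalence(view_log, is_violating, sample_size, seed=0):
    """Estimate the share of content views that landed on violating content.

    view_log: list of content IDs, one entry per view, so widely seen
        content appears many times (hypothetical structure).
    is_violating: callable standing in for reviewer labels (hypothetical).
    """
    rng = random.Random(seed)
    # Sample views, not posts, so each view has an equal chance of selection.
    sample = rng.sample(view_log, sample_size)
    violating_views = sum(1 for content_id in sample if is_violating(content_id))
    return violating_views / sample_size


def proactive_rate(found_before_report, total_actioned):
    """Share of actioned content that was detected before any user report."""
    return found_before_report / total_actioned


# Toy data: 10,000 views, 11 of which are views of one violating post.
view_log = ["viral_post"] * 9_000 + ["quiet_post"] * 989 + ["hateful_post"] * 11
labels = {"hateful_post"}

# Sampling every view reproduces the exact share for this toy log.
prevalence = estimate_prevalence(view_log, lambda cid: cid in labels, len(view_log))
views_per_10k = prevalence * 10_000
```

With the toy log above, `views_per_10k` works out to 11, matching the upper end of the range Facebook reported. A real measurement would sample a small fraction of views and carry a margin of error, which this sketch leaves out.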