How One Agency Is Targeting Online Hate Speech Using Artificial Intelligence

Hate speech isn't a new problem, but in the social media age, the rate at which it can be spread has exponentially increased. But can artificial intelligence help stop it?
Agency Possible and its longtime partners at social media marketing software company Spredfast wanted to do something to slow the rate at which hate can spread online. That's why they launched WeCounterHate, a campaign which "counters" such messages on Twitter with a donation to a nonprofit organization fighting hate for every retweet. The campaign focuses on influencers with the largest number of retweets.
Creative technologist Shawn Herron inspired the idea with a text to executive creative director Ray Page about how residents of Bavarian town Wunsiedel organized a group to turn an annual neo-Nazi march into a fundraising effort for anti-extremist organization EXIT Deutschland. He brought up the question of how Possible could apply the approach on a wide scale, and the agency settled on Twitter as a target platform.
Spredfast CMO Jim Rudden described Twitter as "the tool of the moment to reach broad audiences and get messages out there. Any digital platform built in that context can be used for good or ill."
"In addition to the rules that are being set up by the platforms themselves, there are also social norms … it's up to the people who use the platforms to set those up and we want to be at the forefront of doing that," added Possible global chief data officer Jason Carmel.
"That's when we reached out to Life After Hate, because of the way they approach reforming and trying to remove people from these extremist groups," Page explained. "It felt like their sensibilities matched with ours and what we wanted to do with the platform."
Life After Hate is a nonprofit founded in 2011 that offers community education, helps individuals exit hate groups and provides support for family and friends. The group was awarded a $400,000 grant during the last days of the Obama administration; the grant was rescinded by the Trump administration, and Life After Hate was not on the list of organizations receiving funding to fight extremism under the new administration's review.
Life After Hate executive director Sammy Rangel told Adweek that since the August white nationalist rally in Charlottesville that ended in violence, demand for Life After Hate's services has "increased tenfold" and community support has escalated. Rangel added that WeCounterHate has emerged "in the shadow" of a rise in hate speech following Trump's election.
"Hate speech has been much more outspoken … emboldened, visible and brazen," he said, adding that's something he attributes to "the way in which the messaging coming from this administration is being interpreted by the far right."
One of the biggest challenges for Possible was programming the ability to identify hate speech. The team used Spredfast to find different kinds of hate speech, fed those examples into an AI engine to teach it what hate speech is, and had it classify tweets based on how hateful they were. Initial attempts proved unsuccessful, mostly because of how the team defined such speech, focusing on "the worst words [someone could] call somebody who belongs to a different group," Carmel said.
"What we learned when we searched for that is: Hate speech has evolved," he added. "People are still saying radically hateful things, but they're using different language for it."
Possible realized its approach to identifying hate speech wasn't sophisticated enough. Life After Hate offered to help, and the two groups worked with four former far-right extremists to program the WeCounterHate engine, expanding the scope of what they were teaching the program.
"We knew when we had that first conversation with Sammy [Rangel] that we needed each other," Page explained. "We had the technology and the capability, but we needed some insight to help us build a stronger engine around it … The machine has been getting smarter ever since."
The response has exceeded their expectations. WeCounterHate estimates that around 20 percent of hate influencers end up deleting targeted tweets, with the number of retweets shrinking by more than 50 percent.
"I think what we have figured out is a way to insert a narrative on a platform that has typically been ungoverned," Rangel said. "At the very least, we're causing a moment to pause, a moment to reflect. People are going to have to second-guess whether they want to push that button."