Facebook has open-sourced two tools that it developed to detect identical and nearly identical photos and videos, and that it uses in its efforts to keep harmful content off its platform.
Global head of safety Antigone Davis and vice president of integrity Guy Rosen said in a Newsroom post that the two technologies being open-sourced, PDQ and TMK+PDQF, are among the suite of tools Facebook uses to detect content such as child exploitation, terrorist propaganda and graphic violence.
PDQ and TMK+PDQF will be made available via GitHub for use by industry partners, smaller developers and nonprofits, enabling them to more easily identify harmful content and share its digital fingerprints, which are known as hashes, to help detect and remove duplicates of that content.
Davis and Rosen said PDQ was inspired by the pHash algorithm, although Facebook built it from the ground up, while video-matching technology TMK+PDQF was developed by the social network’s artificial intelligence research team and academics from the University of Modena and Reggio Emilia in Italy.
They wrote, “These technologies create an efficient way to store files as short digital hashes that can determine whether two files are the same or similar, even without the original image or video. Hashes can also be more easily shared with other companies and nonprofits. For example, when we identify terrorist propaganda on our platforms, we remove it and hash it using a variety of techniques, including the algorithms we’re sharing today. Then we share the hashes with industry partners, including smaller companies, through GIFCT (the Global Internet Forum to Counter Terrorism) so they can also take down the same content if it appears on their services.”
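To make the idea of comparing hashes rather than files concrete, here is a minimal sketch in Python. It is not PDQ or TMK+PDQF: it uses a toy "average hash" over a tiny grayscale grid, and the function names, the sample pixel values and the `max_distance` threshold are all assumptions chosen for illustration. The point is only that two short hashes can be compared by counting the bits that differ, with no access to the original images.

```python
def average_hash(pixels):
    """Toy perceptual hash (NOT PDQ): one bit per pixel,
    set to 1 when the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_match(hash_a, hash_b, max_distance):
    """Treat two hashes as the same content when few bits differ.
    max_distance is a hypothetical tuning parameter."""
    return hamming_distance(hash_a, hash_b) <= max_distance


# Hypothetical 4x4 grayscale images (values 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]

# A near-duplicate: same image with one pixel altered.
near_copy = [row[:] for row in original]
near_copy[0][0] = 120

# An unrelated image: dark and bright halves swapped.
different = [row[2:] + row[:2] for row in original]

h_original = average_hash(original)
h_near = average_hash(near_copy)
h_different = average_hash(different)

print(hamming_distance(h_original, h_near))
print(hamming_distance(h_original, h_different))
print(is_match(h_original, h_near, max_distance=2))
print(is_match(h_original, h_different, max_distance=2))
```

In this sketch only the integer hashes are compared, which is why a company receiving a shared hash could check its own uploads against it without ever holding the original file.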
National Center for Missing and Exploited Children president and CEO John Clark added in the Newsroom post, “In just one year, we witnessed a 541% increase in the number of child sexual abuse videos reported by the tech industry to the CyberTipline. We’re confident that Facebook’s generous contribution of this open-source technology will ultimately lead to the identification and rescue of more child sexual abuse victims.”