Microsoft Study Argues That Language AI Researchers Must Do Better at Addressing Racism

The authors analyzed 146 papers and found that they often failed to account for structural problems

Microsoft researchers found that academic efforts to address AI bias are often incomplete. Getty Images

Despite a growing push in the artificial intelligence community to root out the human biases baked into many algorithms, a recent study from Microsoft researchers found that efforts to address these problems have often failed to account for the true nature and scope of structural racism.

The authors analyzed 146 recent papers on bias in natural language processing AI models and found that they often failed to sufficiently define bias or to explain the relationship between language and entrenched societal hierarchies. They also said that most of the quantitative techniques proposed to measure bias in AI do not account for existing social research outside of technical fields.

“Although these papers have laid vital groundwork by illustrating some of the ways that [natural language processing] systems can be harmful, the majority of them fail to engage critically with what constitutes ‘bias’ in the first place,” the authors wrote. “Papers on ‘bias’ in NLP systems are rife with unstated assumptions about what kinds of system behaviors are harmful, in what ways, to whom and why.”

The authors split the papers they examined by their stated goal, including categories like stereotyping or allocation of resources, such as a system more likely to hire a person of a certain race. But nearly half the time, they found that the motivations of the research were too vague and the correlations they outlined were questionable.

“These examples leave unstated what it might mean for an NLP system to ‘discriminate,’ what constitutes ‘systematic biases’ or how NLP systems contribute to ‘social injustice’ (itself undefined),” they wrote.

The study comes after research group OpenAI, in which Microsoft has invested $1 billion, recently released a paper on the latest version of its ultra-powerful language generation AI, GPT-3, that included a section on the system’s racial and gender bias shortcomings. The researchers found that the tool, trained on nearly 1 trillion words scraped from around the internet, is more likely to associate mentions of Black people with negative sentiment than mentions of other racial groups, and more likely to use male identifiers, especially in occupational contexts.

A growing movement of academics, technologists and activists has long pushed for a more cross-disciplinary approach to studying AI that not only identifies biases built into the systems but also taps research from social science and other fields to understand them. Top universities like Stanford, MIT and New York University have all established research centers dedicated to a multidisciplinary study of AI.

As for addressing the problems they outline, the researchers recommend that those taking on bias “articulate their conceptualizations of ‘bias’ … and center work around the lived experiences of members of communities affected by NLP systems, while interrogating and reimagining the power relations between technologists and such communities.”

@patrickkulp Patrick Kulp is an emerging tech reporter at Adweek.