An Image Depixelator That Whitewashes People of Color Has Opened Rifts in the AI Community

The debate is between engineers who blame datasets and others who say the problems are much deeper

An AI tool that depixelates faces has become a flashpoint in a debate about AI bias. @Chicken3gg/Twitter


The experimental tool seemed innocuous enough at first glance: Plug in a pixelated image of a face and it would generate a higher-resolution mock-up using machine learning.

But it wasn’t long after an independent programmer posted it to Twitter last week that other researchers started to notice a glaring flaw. When prompted with Barack Obama’s blurry likeness, it returned a white man’s face with little resemblance to the former president. Ditto for other prominent people of color such as U.S. Rep. Alexandria Ocasio-Cortez and actress Lucy Liu.

The unfortunate tendency has fueled a heated debate in the artificial intelligence field over racial biases in the industry and how they manifest in the software it produces. Some of the industry’s biggest names have weighed in.

It comes as organizations and industries everywhere grapple with issues of discrimination and diversity amid worldwide Black Lives Matter protests against racial injustice.

But even before this particular moment of reckoning, a growing movement of activists, academics and technologists has been highlighting the biases and lack of accountability pervasive in the black-box algorithms that have come to govern an increasing share of our daily lives.

Their position is largely that incidents like the whitewashing depixelator are symptoms of a larger problem: a lack of diversity in the industry, a failure to consult the social sciences for a nuanced understanding of racist societal systems, and development processes that are unaccountable to the people affected by the finished products.

Not everyone sees the issue this way. Earlier this week, Facebook Chief AI Scientist Yann LeCun argued in a Twitter thread that the problems inherent to the model can be traced back to biases in its training dataset. In this case, the model is an algorithm called PULSE, which builds on a widely used generative model from Nvidia called StyleGAN.
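Tools like PULSE work by searching for a high-resolution face that, when downscaled, reproduces the pixelated input; because the candidates come from a generative model trained on a particular dataset, the dataset's demographics shape which faces the search can find. The toy sketch below illustrates only the downscaling-consistency objective, optimizing raw pixels with plain gradient descent in NumPy rather than searching StyleGAN's latent space; the function names and parameters are illustrative, not the actual PULSE code.

```python
import numpy as np

def downscale(img, factor):
    """Average-pool a 2D grayscale image by an integer factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample_search(low_res, factor, steps=500, lr=0.5, seed=0):
    """Find a high-res image whose downscaled version matches `low_res`.

    Mirrors PULSE's core idea (optimize for downscaling consistency),
    but searches pixel space directly instead of a GAN's latent space,
    so it is a toy illustration only.
    """
    rng = np.random.default_rng(seed)
    h, w = low_res.shape
    high = rng.random((h * factor, w * factor))  # random starting guess
    for _ in range(steps):
        residual = downscale(high, factor) - low_res  # consistency error
        # Gradient of 0.5 * ||downscale(high) - low_res||^2 w.r.t. `high`:
        # each residual entry spreads evenly over its pooling window.
        grad = np.kron(residual, np.ones((factor, factor))) / factor**2
        high -= lr * grad
    return high
```

The key point for the bias debate: the downscaling constraint is badly underdetermined (many high-resolution faces collapse to the same pixelated image), so the prior that picks among them does the real work. In PULSE that prior is StyleGAN, trained largely on photos of white faces, which is why the search tends to land on white reconstructions.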

While LeCun conceded that bias can be introduced at other points in the algorithm development process, he has pushed the dataset-first view before, and it is a common rebuttal to those in the AI industry who argue that more comprehensive changes are needed.

Many other prominent researchers pushed back against LeCun’s argument on Twitter. David Ha, a research scientist at Google Brain, pointed out that even if the dataset were vetted for this particular instance, the benchmarks that engineers use to gauge progress in the field are rooted in their own biased datasets.

Timnit Gebru, co-lead of the Ethical AI Team at Google, pointed LeCun to a talk by Google Brain research scientist Emily Denton that breaks down the various ways bias can infect facial reconstruction tools specifically, including an examination of what researchers consider to be normative and "objective" in how data is framed.

But a chorus of agreement with LeCun from engineers, developers and anonymous Twitter handles shows these ideas aren’t universally accepted within the AI industry.

The reckoning within the field of AI has taken other forms in the past few weeks as well. Amazon, IBM and Microsoft all recently committed to ending or pausing their sales of facial recognition tech after widespread reports over the past few years of such algorithms being racially biased and in some cases leading to wrongful accusations.

Tools like the facial depixelator can serve as a reminder of how commonplace bias can be in AI models and why measures like these are only the start of fixing the problem.

@patrickkulp Patrick Kulp is an emerging tech reporter at Adweek.