How We Build Human Bias Into Artificial Intelligence

-OpEd-

PARIS — When Amazon realized that its AI recruiting tool favored men, the company quickly shelved it. Back in 2016, a chatbot released by Microsoft turned into a sex-obsessed neo-Nazi machine in only 24 hours. These incidents, along with others, played right into the hands of all those who say there is too much AI in our daily lives.

But some researchers are looking at things from a different perspective: if AI makes such mistakes, it’s because it has been taught that way. That means we can also teach it to avoid such mistakes. How? By tracking down all the biases contained in the data the AI is fed when it learns.

This data simply reflects our own biases, which have been at play for a long time. They are easy to find: compare how often “woman” appears in the same text as “nurse” with how often it appears near “doctor.” It’s easy to imagine that computer scientists will often be referred to in masculine terms or close to the word “man.” But there is more: Does anybody know, for example, that the error rate for facial recognition can reach 35% for dark-skinned women, compared to 0.8% for fair-skinned men?
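
For readers who want to see what such a count looks like in practice, here is a minimal sketch in Python. The five sentences are invented for illustration and the function name `cooccurrences` is hypothetical; a real audit would run over the actual training corpus and would likely use a sliding window of words rather than whole sentences.

```python
import re
from collections import Counter

# Toy sentences invented for illustration only (an assumption, not real data).
corpus = [
    "The woman worked as a nurse at the hospital.",
    "The nurse said she would call the doctor.",
    "The man is a doctor and he runs the clinic.",
    "A woman asked the doctor a question.",
    "The man trained as a computer scientist.",
]


def cooccurrences(sentences, targets, contexts):
    """Count the sentences in which a target word and a context word both appear."""
    counts = Counter()
    for sentence in sentences:
        # Split into whole lowercase words so that "woman" is not mistaken for "man".
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for target in targets:
            for context in contexts:
                if target in words and context in words:
                    counts[(target, context)] += 1
    return counts


counts = cooccurrences(corpus, ["woman", "man"], ["nurse", "doctor"])
for target in ("woman", "man"):
    for context in ("nurse", "doctor"):
        print(target, context, counts[(target, context)])
```

On a corpus of real size, a large gap between the “woman”/“nurse” count and the “man”/“nurse” count is exactly the kind of imbalance a model trained on that text would absorb.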

All this is due to the body of initial data available to train AI algorithms. The data available is AI’s Achilles heel. For example, social networks provide an abundant and cheap source of data. But the presence of fake news, hate speech and general contempt towards minorities and women that can be found there doesn’t bode well for AIs.

An experiment conducted on Twitter by a researcher at Swinburne University revealed that negative feelings were expressed more often toward female leaders than toward male leaders.

Here’s another experiment we can all try. Just enter the keyword “president” or “prime minister” on Google Images: men are over-represented, accounting for about 95% of the images. But it’s not Google’s fault.

Biases can also be found elsewhere: Has anybody ever wondered why voice assistants, or call-center lines handled by robots, have reassuring female voices?

It’s true: studies show that both men and women prefer to be spoken to by a female voice. It’s more reassuring. It’s maternal. When you look at it closely, this preference becomes more refined, though not in the right direction: We prefer a male voice to talk to us about computers or cars, and a female voice for all things interpersonal.

AI learns its racial bias from its creators — Photo: Abyssus

Recently, manufacturers of intelligent personal assistants and connected speakers have adapted their algorithms to be less tolerant of rude and harassing users, who sometimes vent their frustrations on machines. The thinking is to keep people from venting at women on the street after their experience with machines has left them too disinhibited.

Amazon has reprogrammed Alexa to respond curtly to sexually explicit questions. Google Home, meanwhile, has introduced the “Pretty Please” function, which adapts to the kind or unkind tone with which a user addresses it.

But Google Home has neither conscience nor personality: It doesn’t actually care whether it’s talked to politely or not. Are you, at the end of the day, polite with your washing machine? Probably not always, especially when it’s broken. But rudeness could affect the machine’s learning process, and users would hardly appreciate a smart speaker that only spoke to them harshly.

Apple also offers its Siri assistant in several versions: female or male voice, with different English accents. The voice remains female by default, but we note that the default can be male for Arabic, French, Dutch, and English (one wonders why).

It’s not enough to talk with different accents. Personal assistants and smart speakers need to understand everyone.

To do this, the companies that design them rely on corpora of audio clips, speeches, and more. It’s easy to imagine that some groups in society are under-represented, such as low-income, rural, or lower-class people who use the Internet less. Obviously, you’re not going to find them in the corpus.

One of these corpora, the Fisher corpus, contains speech from people whose mother tongue is not English, but we immediately see that they are under-represented. More amusingly, maybe, Spanish and Indian accents are already a little better represented than the various accents within Great Britain.

Artificial intelligence has no biases of its own, unlike us. We are the ones who teach the machines to be biased. The World Economic Forum believes that it will take until the next century to achieve true gender equality. Chances are that with AI, we might have to wait even longer.

*Charles Cuvelliez is a lecturer at the Université Libre de Bruxelles and director of the Brussels School of Engineering.