Welcome to part 3 of Planet Ghibli! This time we’re getting political.

As we saw in Part 1 and Part 2, machine learning (or “AI” in mainstream media) is time- and resource-intensive. First of all you need data, which may or may not take up much of your time, depending on whether you can use an existing dataset. Then you need to pick an algorithm, and it may not be obvious which one is most suitable for your use case. Finally, depending on your intent, an unknown amount of time goes into training your model (running all the data through the algorithm), evaluating the result, and perhaps trying again a bunch of times before you’re happy with the output. As we saw in the Part 2 discussion, this last phase can be tricky because the output can vary enormously depending on the methods used.
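
To make that loop a bit more concrete, here is a minimal sketch in Python. It uses scikit-learn and a small built-in toy dataset purely for illustration; the dataset, the candidate models and their parameters are placeholder choices on my part, not the setup from the earlier parts of this series.

```python
# A minimal sketch of the get-data / pick-algorithm / train / evaluate loop.
# The dataset, candidate models and parameters are illustrative placeholders.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Step 1: get data (here an existing toy dataset, which saves a lot of time).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps 2 and 3: pick an algorithm, train it, evaluate it, repeat.
candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "random forest": RandomForestClassifier(n_estimators=200),
}
for name, model in candidates.items():
    model.fit(X_train, y_train)          # training: run the data through the algorithm
    score = model.score(X_test, y_test)  # evaluation: are we happy with the output?
    print(f"{name}: accuracy {score:.3f}")
# If no score is good enough, we go back: more data, another algorithm,
# other parameters -- the loop described above.
```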

With complicated use cases, like playing video games or generating text, we need huge amounts of data and huge architectures (algorithms). This quickly becomes so expensive that only huge companies can afford to train such systems. Take for example the text-generation system GPT-3, developed by OpenAI. It can do some amazing things, like generating code from text prompts and producing surprisingly novel solutions. Without going into mathematical details, it’s one of the largest machine learning systems ever released, and it was trained on what one article describes as the fifth most powerful supercomputer in the world, with hundreds of thousands of processors and a cost of at least $1 billion. I’m not sure how much trial and error was needed; probably some, though much less than I personally would need, since these researchers are experts in their field after all.

Consider the amount of work we put into getting nice-looking pictures for our Planet Ghibli style transfer, and into understanding the interaction between input and output; that was fairly time-consuming, right? And it was time-consuming despite us having access to resources in the form of a graphics card. So, with time and money becoming prohibitively expensive for individuals or even small organisations, what do you think the effect on society will be if we as regular citizens cannot investigate these systems ourselves? Is it a good thing that we have to trust huge companies to have made these systems fair?

In their paper on GPT-3, the authors do discuss how their language model reflects biases in society, how it can be misused, and the energy cost of training it versus the cost of using it once training is done. I want to say they should be lauded for including this, but at the same time such discussions should be the default. Nevertheless, it’s great that they have that discussion: it shows they’ve thought about these issues and are (hopefully) open to further discussion about them. But what about when the company is not transparent?

If we have no possibility of investigating for ourselves the interaction between data, algorithms and their combined judgments, how can we trust automated systems that decide who gets jobs, loans, medical treatments and court sentences? The short answer is: we absolutely cannot.

Why shouldn’t we trust automated decision making?

“deep learning: computational depth without historical or sociological depth is superficial learning”
– Ruha Benjamin

Because these systems are trained on existing data, they can only reinforce existing stereotypes. Take, for example, a system trained to predict future crime. Such a system is futile: it can only rely on past data, and that past data is usually biased, because police are already more aggressive and make more arrests in poor areas and against minorities. So a predictive system will keep predicting that rich white people do not commit crime. And I hope you agree with me that black lives matter.
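
To illustrate the mechanism (and only the mechanism, not any real policing system), here is a deliberately oversimplified simulation with made-up numbers: two neighbourhoods with identical true crime rates, where past arrests decide where patrols go, and patrols generate the next round of arrest data.

```python
# Deliberately simplified, made-up-numbers simulation of a predictive
# policing feedback loop: past arrests drive predictions, predictions
# drive patrols, and patrols produce the next round of arrest data.
import random

random.seed(0)

true_crime_rate = {"A": 0.10, "B": 0.10}   # identical by construction
arrests = {"A": 30, "B": 10}               # historical bias: A is over-policed

for year in range(5):
    total_arrests = sum(arrests.values())
    for area, rate in true_crime_rate.items():
        predicted_share = arrests[area] / total_arrests  # the "prediction" is just the past arrest share
        patrols = round(100 * predicted_share)           # patrols are allocated according to that prediction
        # More patrols find more incidents, even though the underlying
        # crime rate is the same in both neighbourhoods.
        arrests[area] += sum(random.random() < rate for _ in range(patrols))
    print(f"year {year}: {arrests}")
# The initial imbalance never corrects itself: the system keeps
# "rediscovering" the bias that was baked into its training data.
```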

Compare this crime-prediction example to the different outputs we got when doing style transfer in the previous parts of this blog series. We can only change the output by changing the data we use as input, and in some cases entire groups of elephants disappeared from the output. As consumers of a pre-made system, we only see the final output, so without insight we will never know what data is being considered and how it’s used.

Traditionally, if you applied for a bank loan, there was a person you could ask why you couldn’t get one. Whatever we may think of an economic system built on debt, knowing why you can’t get that loan means you know what to do before trying again. With an opaque black box, there is no why. And such systems are increasingly being used for job applications, bank loans and insurance.

The danger here is that if we do not stop and think, we run the risk of these systems further reinforcing existing biases and stereotypes. Just a few examples: facial recognition systems misidentify people of color far more often than white people, Facebook shows job ads in a discriminatory fashion, any system that uses binary gender data is transphobic, and prediction of child abuse punishes poor families. A huge issue here, and incidentally the same reason why self-driving cars still don’t work, is that these systems most often cannot reliably catch outliers or handle situations they have never seen before. In other words: minority groups run the risk of disappearing as the data is averaged.
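
To see what “averaged away” can look like in practice, here is a toy sketch with synthetic data: a small group whose pattern differs from the majority’s contributes so little to the average loss that the model simply ignores it. The numbers, the single feature and the scikit-learn model are all illustrative assumptions, not a real-world system.

```python
# Toy illustration with synthetic, made-up data: a classifier that
# optimises average accuracy can look fine overall while getting the
# minority group almost entirely wrong.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# 950 samples from a majority group, 50 from a minority group.
# For the majority the label follows the feature; for the minority the
# relationship is reversed, so one averaged rule cannot fit both.
x = rng.normal(size=1000).reshape(-1, 1)
group = np.array([0] * 950 + [1] * 50)
y = np.where(group == 0, x.ravel() > 0, x.ravel() <= 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(x, y)
pred = model.predict(x)

print("overall accuracy: ", accuracy_score(y, pred))                           # looks fine
print("majority accuracy:", accuracy_score(y[group == 0], pred[group == 0]))   # near perfect
print("minority accuracy:", accuracy_score(y[group == 1], pred[group == 1]))   # close to zero
# The 50 minority samples barely move the average loss, so the model
# learns the majority's pattern and averages the minority away.
```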

Shifting the power

Pratyusha Kalluri gives us this insightful thought:

“It is not uncommon now for AI experts to ask whether an AI is ‘fair’ and ‘for good’. But ‘fair’ and ‘good’ are infinitely spacious words that any AI system can be squeezed into. The question to pose is a deeper one: how is AI shifting power?”

In most of the cases discussed, I would like to believe the consequences were not intended by the researchers and engineers. Rather, they are the result of software engineering teams that are not diverse, existing datasets that are not diverse, and software engineers who are trained in maths and programming but not in ethics, diversity and societal impact. So this discussion needs to be taken seriously. What’s more concerning is that highly influential companies and people in the machine learning world seem not to fully understand this discussion, or to grasp their own role in it.

Take, for example, when it was found in 2015 that a Google algorithm tagged photos of black people as gorillas. Two and a half years later, their solution was to remove gorillas from the possible tags. With all the data Google/Alphabet has available and the resources to gather more, together with their hundreds or even thousands of brilliant employees with PhDs in computer science, was this really the best solution they could find? Or is it a matter of not caring enough to allocate resources to increase equality?

Another, more recent example is how a system trained to depixelate faces turned Barack Obama into a white man. Yann LeCun, recipient of the Turing Award for his work on deep learning and currently Vice President and Chief AI Scientist at Facebook, replied that this was a consequence of the dataset used: use another dataset and the problem goes away. Timnit Gebru criticised this statement as overly simplified, because the problem is bigger than that; it’s about why this sort of thing keeps happening, and why the authors did not even think of trying their system on a diverse set of skin colours. LeCun refused to see the bigger picture and kept defending his position from a purely technical point of view. He later quit Twitter and posted a long statement on Facebook about his personal values of freedom and justice.

To connect this drama to our experimentation with style transfer: it’s a little as if we had stopped testing with the Princess Mononoke input picture and simply concluded that we would get a different result with a different input picture. That is technically correct, but it fails to consider what our purpose is and how our methods of data collection and algorithm selection feed into that purpose.

Furthermore, unless you’re a researcher it may be difficult to evaluate whether an algorithm works for the dataset you’re using. Perhaps the data you have needs algorithmic tweaks that the researchers never thought of testing for. But developers use off-the-shelf algorithms all the time; they don’t have the knowledge and/or the time to build their own.

The authors of the paper in question have since added a section about biases, and some would perhaps criticise them for doing so only after the discussion happened. But to be fair, that’s how science should work: issues are pointed out, and through discussion we move forward. This is where LeCun’s answer becomes problematic in my view. He’s technically correct that the dataset is an issue, and you and I can see how he’s correct about that, using our style transfer as a visual simile. But LeCun seems incapable of looking beyond the mathematics, defending his statement instead of being curious, and feeling so hurt at being called out that he quits Twitter entirely. When the head of AI at Facebook cannot handle a discussion about bias in AI/ML without insisting it’s only about the data, it’s obvious he doesn’t understand the larger issue. Not to mention it says something about the company culture within Facebook.

The sad consequence for everyone is that the prospect of using AI to shift power - to empower the disadvantaged rather than reinforce existing inequalities - is so much bleaker when the world’s most influential companies remain indifferent.

“Most people are forced to live inside someone else’s imagination”
– Ruha Benjamin

So what can we do?

As software developers, ML/AI practitioners and data scientists, it’s essential that we gain a broader perspective than mathematical formulas. We have to be able to put the artefacts we create into a broader sociological and historical context, so learning more about those contexts is one action. Another is to hire diverse teams: not just diverse in race or gender but also in perspective. Hire historians and sociologists. It doesn’t matter that they won’t understand the details of your algorithms; you can explain the concepts so that they can be valuable contributors anyway. And if you happen to not be great at explaining things, hire someone who is! A good educator can also help you be transparent by explaining to your customers how your product works.

As citizens of the world, we have an obligation not to accept the status quo. Human rights are not a done deal; we have to keep pushing for better transparency, inclusiveness and justice. For ourselves, for our fellow humans, for the life we share this planet with, and for Mother Earth herself. A very important part of that is to listen. Listen to the people saying they are mistreated. Understand where they are coming from. Only then can we understand how our own actions may impact the greater picture of systemic inequalities.

My own understanding of these issues is only superficial at this point. Thanks to #BLM, XR, Laurie Penny, Timnit Gebru’s recommendations and many others, I’ve started to see how many different kinds of inequality are connected. But my to-read list1 is still far longer than my already-read list, and there’s much I do not know.

Among the amazing people I’ve found there’s one person in particular I want to promote. Ruha Benjamin gave a brilliant keynote at ICLR this summer. I really, really, really recommend you watch her keynote. And the Q&A.

So I will end this rant of an essay with yet another quote from Benjamin, one I find particularly inspiring:

“Remember to imagine and craft the worlds you cannot live without, just as you dismantle the ones you cannot live within.”