Advanced NLP frameworks like BERT are in tune with gender biases!

Ananya Gupta
Published in Code Like A Girl
Jul 18, 2020 · 5 min read


How do we learn? We learn by recognizing patterns in the input we receive. If I show you the image below, and other images like it, and keep telling you that it's a car, you'll recognize a similar image as a car (hopefully!).

Machines trained under the realm of machine learning aren't very different. If you tell a machine that the image above is a cow, it'll think it's a cow, and the next time it sees a car, its friends are going to have a good laugh. We wouldn't make that mistake as humans, right? That's because we've learnt, based on the general consensus and language of the world we observe, that the image is a car, not a cow. But a machine only, and I mean ONLY, learns from the training data we give it, whether it's trained with supervised, unsupervised or reinforcement learning. That data is its entire world. So if the data says the image is a cow most of the time, it's a cow. But this isn't correct! Well, that's because the training data is biased. Machine learning bias is defined as "the phenomena of observing results that are systematically prejudiced due to faulty assumptions." The image above being a cow? That's a faulty assumption.
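
To make that concrete, here's a tiny toy sketch of my own (not something from a real system): a supervised classifier faithfully learning a faulty label. The made-up "features" and class names are purely for illustration, and it assumes scikit-learn is installed.

```python
# Toy illustration: a classifier only knows the labels we give it.
from sklearn.neighbors import KNeighborsClassifier

# Pretend each image has been reduced to three invented features:
# [has_wheels, has_legs, has_leaves]
X_train = [
    [1, 0, 0],  # a car...  but the annotator labelled it "cow" (faulty assumption)
    [1, 0, 0],  # another mislabelled car
    [0, 1, 0],  # an actual cow, correctly labelled
    [0, 0, 1],  # a tree, correctly labelled
]
y_train = ["cow", "cow", "cow", "tree"]

model = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# A brand-new car-like image: the model repeats the bias baked into its data.
print(model.predict([[1, 0, 0]]))  # -> ['cow']
print(model.predict([[0, 0, 1]]))  # -> ['tree']
```

The model isn't "wrong" by its own logic; it is doing exactly what the labels taught it to do. That's the whole problem.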

Many pre-trained models and frameworks, whether in speech recognition, image classification or natural language processing, are trained on large datasets taken from the natural world. But, due to years of institutional biases within this natural world, stereotypes and discrimination have bled into our machines. The data is biased. Just like our machine believed a car is a cow, some models believe that, let's say, a man is a doctor and a woman a nurse, or predict that a Black person is more likely to need to be stopped and frisked. In natural language processing frameworks and models, this bias is of special concern. In a pre-trained natural language framework as famous and renowned as BERT (Bidirectional Encoder Representations from Transformers), this bias is clear and undeniable. Even though BERT isn't trained with supervised labels (as in the cow-car case above), its ability to understand from context has still left it prone to learning the biases of our natural world. Below are a few snippets of how this NLP framework harbors gender biases.

With Masked Language Modeling, a model "fills in the blank" in a sentence, using the surrounding context to predict what the [MASK] token should be. The model shown here is BERT, the first large transformer to be trained on this task, and I have used it via the AllenNLP platform (https://demo.allennlp.org/coreference-resolution). Once you enter text with one or more "[MASK]" tokens, the model will generate the most likely substitution for each "[MASK]". And with that comes a clear display of how biased this tool is. Here are a few snippets below:
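
If you'd like to poke at this yourself outside the web demo, here's a minimal sketch. I used the AllenNLP demo above; this version instead assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint, so the exact predictions and scores you see may differ from my screenshots.

```python
# Probe BERT's masked-language-model head with some career-related templates.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

sentences = [
    "[MASK] worked as a doctor at the hospital.",
    "[MASK] worked as a nurse at the hospital.",
]

for sentence in sentences:
    print(sentence)
    # top_k=3 returns the three most likely fillers for the [MASK] token
    for prediction in unmasker(sentence, top_k=3):
        print(f"  {prediction['token_str']!r}: {prediction['score']:.3f}")
```

Which pronoun the model ranks highest for each career is exactly the kind of thing worth checking here.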

The results above are a clear indication of how, given the massive presence of biases in our world, our predictive machine learning models have also become biased. Ideally, each career tested above should be, but isn't, gender neutral in the likelihood of a "he" or a "she" performing that job (a small sketch of how one might measure that skew follows below).

Google's BERT has blown up in the last few months! Its astounding accuracy, achieving state-of-the-art results on 11 different natural language processing tasks, makes it an extremely commendable natural language framework. Because it places importance on contextual language understanding in both content and queries, BERT is intended especially for conversational search (and is already being used in Google's search engine). What BERT has achieved, being trained on the whole of the English Wikipedia and the BooksCorpus, is mind-blowing! However, while this progress is amazing, and BERT has inspired several new and upcoming innovations in the field of NLP, it is important to note the significant presence of biases within it. In the hope that future systems built on frameworks like BERT can eradicate these biases, we need to address the issues now, even as we marvel at this fabulous system.
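
To put a number on that skew, one could compare the probability BERT assigns to "he" versus "she" at the masked position for a given career. The sketch below is my own illustration, assuming the Hugging Face transformers library and PyTorch; it is not Google's code or anything from the demo I used.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pronoun_scores(template: str) -> dict:
    """Return the masked-LM probability of 'he' vs 'she' at the [MASK] position."""
    inputs = tokenizer(template, return_tensors="pt")
    # Find where the [MASK] token sits in the input
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    probs = logits.softmax(dim=-1)
    return {w: probs[tokenizer.convert_tokens_to_ids(w)].item() for w in ("he", "she")}

print(pronoun_scores("[MASK] is a doctor."))
print(pronoun_scores("[MASK] is a nurse."))
```

In a truly gender-neutral model, the two probabilities would be roughly equal for every career you try.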

Keeping this in mind, big companies such as Google, who lead the tech world, should adopt ethical methods of development. Several measures to ensure this are reportedly being implemented, and Google has claimed that it has worked to ensure that adding BERT to its search algorithm doesn't increase bias. That's great progress! Many similar investigations into biases in other models and frameworks are being carried out across the globe to ensure we practice ethical AI and machine learning. As a student going into the field of machine learning, I hope that the technologies that become integrated into our daily lives foster the values we wish to see in a future society, and that I enter an ethical industry that promotes the development of our communities in a wholesome manner.

In the future, we should not be hustling to scrub the biases out of an AMAZING model only after it has been built. Removing biases should be a part of the development process itself!

Well, I never pegged Bert as being gender biased anyway; in between loving pigeons and loving words that start with 'W', how would he find the time and extra effort needed to discriminate?! (Of course there was a Sesame Street reference, what did you expect?)

I hope I was able to explain this in a way that's readable even if you're not in the field of AI! I found it hard to explain to my mother, but after putting some thought into it, I think (and hope) I've made it easy to understand!!

Happy reading!
