Module #4 - All About That Bias

Module #4 draws from Competency 13:

  • Competency 13: Critically interpreting data

The paper describes this competency as follows:

Understand that data cannot be taken at face-value and requires interpretation. Describe how the training examples provided in an initial dataset can affect the results of an algorithm.

Our implementation of this competency focuses on the role of socio-political contexts of data collection and use. That is to say, we were less interested in the technical aspects of machine learning (as we were in Module #2) and more in the socio-political conditions in which machine learning takes place. Specifically, the module engages with instances of bias and how they impact machine learning. The AI literacy paper discusses bias as follows:

Bias/fairness: Most of the papers in the 2018 FAT ML conference focused on issues related to algorithmic bias (e.g. [118,121]). Algorithmic bias is often directly related to bias present in training datasets. Agents in-the-wild are also able to learn bias and bigotry from human users [99].

The module explores bias starting with the warm-up. Next, there is a substantial video on bias in AI. This video makes several key points:

  • It provides prominent examples of bias in AI
  • It grounds bias in AI to human decision-making
  • It deepens the general definition of bias into three specific types: pre-existing, technical, and emergent
  • It ends by reminding learners that biased data is often the world’s existing biases reflected back through our machines

The video is followed by group discussions on bias in two specific social contexts: gender disparity and racism.

With the group activity, the focus on bias moves more directly to data, specifically the training data used in machine learning. The Stock-Mart scanner activity in this section draws heavily from “Dirty Data, Bad Predictions.” That paper highlights the social justice issues that arise when systems are built on data produced during documented periods of flawed, racially-biased, and sometimes unlawful practices and policies (e.g., “dirty policing”). These practices and policies shape the environment and the methodology by which data is created, which raises the risk of producing inaccurate, skewed, or systemically biased data (“dirty data”). Put simply, when machine learning systems are trained on such data, it is difficult to escape the legacies of the unlawful or biased practices they are built on. The Stock-Mart scanner activity recreates this “dirty data, bad prediction” situation in the context of a fictional retail company, Stock-Mart.

The company developed a machine-learning algorithm to help screen applicants for hiring. Stock-Mart trained this algorithm using data (photos) of previously “successful” employees: those who never missed a day of work, worked at the company for at least four years, and were promoted at least once. However, during the period when that data was collected, the company was discriminating against minority employees. This racial discrimination is visually evident in the folders of the training data: the folder with photos of previously successful employees is majority white, while the folder with photos of employees who did not meet these criteria is majority people of color. This pre-existing bias in the data is thus encoded into the algorithm and becomes a vicious cycle: the algorithm has learned to prioritize white candidates for interviews over people of color.
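To make this mechanism concrete, below is a minimal sketch (not part of the module materials) of how a model trained on such records can reproduce the discrimination encoded in its labels. The group names, the rates, and the per-group scoring rule are all invented for illustration; the “model” is deliberately simplified to a success rate per group rather than a real image classifier.

```python
# Minimal, hypothetical sketch: a "model" trained on historically biased hiring
# records reproduces that bias. All numbers and names are invented.
import random

random.seed(0)

def make_historical_record(n=1000, discrimination=True):
    """Simulate old employee records.

    Each record has one observable feature ("A" / "B", standing in for the
    racial groups in the activity) and a "successful" label based on the
    stated criteria (attendance, tenure, promotion). Under discrimination,
    group B employees were denied promotions, so they meet the criteria less
    often -- the label encodes the discrimination, not ability.
    """
    records = []
    for _ in range(n):
        group = random.choice(["A", "B"])
        successful = random.random() < 0.6          # same underlying ability
        if discrimination and group == "B":
            successful = successful and random.random() < 0.4  # promotions withheld
        records.append((group, successful))
    return records

def train(records):
    """'Train' the simplest possible model: the success rate per group."""
    return {
        group: sum(s for g, s in records if g == group)
               / sum(1 for g, _ in records if g == group)
        for group in ("A", "B")
    }

def score_applicant(model, group):
    """The scanner's priority score is just the learned group success rate."""
    return model[group]

model = train(make_historical_record())
print("Learned 'success' rates:", model)
print("Priority score, group A applicant:", score_applicant(model, "A"))
print("Priority score, group B applicant:", score_applicant(model, "B"))
```

Even though the simulated groups have the same underlying ability, the learned scores differ sharply: the disparity comes from the history baked into the labels, which is exactly the false feature-label connection the activity is built to reveal.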

In practice, the activity does not reveal the pre-existing bias immediately. Rather, the scenario reveals only the success criteria (never missed a day of work, worked at the company for at least four years, was promoted at least once). This information is initially withheld to recreate the black-box conditions of many real-world cases, where the socio-political history is not evident from the outset. Thus, learners encounter the folders of training data and interact with the scanner themselves without knowing the structural conditions behind why and how the system performs. Again, this is intended to evoke a visceral experience that can hopefully be channeled into a sense of urgency toward engagement with AI. After this experience, the facilitator reveals that Stock-Mart was discriminating against minority employees when it collected the data for the algorithm. Here, the facilitator should state (if the learners don’t already guess) that this pre-existing bias is the reason for the stark racial disparities in the data. Put simply, white employees more often met the criteria for success because of this history of discrimination. In turn, the algorithm has learned what success means based on a false connection between features (white skin) and labels (the success criteria).

The group discussion questions after the activity both speak to Competency 13:

  1. How might a system built on historical data amplify social inequalities?
  2. In addition to “never missed a day of work, worked at the company for at least four years, was promoted at least once,” what other criteria for prioritizing candidates might exist?

The first question speaks to the importance of the socio-political context of data: data cannot be taken at face value. The second speaks to the fairness parameters involved in training processes: criteria like the ones used in this activity can disadvantage some people more than others. The closing conversation should complete the competency by noting that similar situations occur with real-world machine learning applications (the NYPD, Child Welfare Services, public schools, etc.). This broader contextualization should re-emphasize why WE need to be aware of data and how it is used and potentially misused. It should be noted that in many real-world cases, we aren’t able to look behind the curtain as we did here. We can, however, do one thing (especially as it pertains to local government uses of machine learning): demand that our institutions provide contextual explanations of datasets and their origins, and ask them to answer the seven questions provided by D’Ignazio for critically interpreting data.

