Module #2 - Learning From Data

Module #2 is based on Competency 12:

  • Competency 12: Understand that learning happens from data

The AI literacy paper defines Competency 12 as follows:

Research suggests that it is important for learners to understand that computers learn from their data [68,130]

The paper acknowledges that there are no established best practices for teaching machine learning (ML) to non-technical learners:

little research has explored how to teach ML, which arguably has more in common with scientific practice in disciplines like chemistry or physics than deterministic approaches to AI in cognitive systems and robotics [117].

The paper points out two emergent strategies:

  1. communicating more high-level practices such as data gathering and preparation, model selection, training, testing, and prediction
  2. hands-on experimentation, e.g., projects in which learners train ML models to analyze their athletic moves and gestures [4,146]

The current implementation combines both of these strategies. The module focuses on one specific form of machine learning: classification, a supervised learning technique that assigns each data item to one of a set of classes. We chose binary classification specifically because it is one of the more accessible and explainable ML techniques. The module provides two examples of classification: first in the video lecture, through the predictive scenario of a smart light, and then in a group activity involving credit card fraud detection.

The video reflects strategy #1: it illustrates the high-level practices of developing a classification algorithm using the scenario of a smart light. The video covers data collection, feature selection, labeling, training, testing, and finally prediction. In addition to these practices, the video engages two important misconceptions about ML that, according to the AI literacy paper, non-technical learners commonly hold:

  1. Students are often surprised that ML requires human decision-making and is not entirely automated
  2. Students often have difficulty identifying the limits of ML and identifying constraints that may make ML unsuitable for solving a particular problem

The video engages #1 by constantly emphasizing the role of the human in the process: it stresses “we” throughout (“we tested,” “we trained,” “we experimented”). The video engages #2 by noting the constraints of ML: limited accuracy, broken rules, false positives, uncertainty, etc. These two points of emphasis should help learners understand the limits of ML and the vital role of human decision-making.
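The practices listed above (data collection, feature selection, labeling, training, testing, prediction) can be sketched in a few lines of Python. This is only an illustrative sketch: the features (hour of day, room occupancy), the data values, the nearest-neighbor method, and the names `labeled`, `distance`, `predict`, and `evaluate` are all assumptions made for this example and are not taken from the video.

```python
# Hypothetical smart-light data, invented for illustration only.
# Each example is ((hour_of_day, room_occupied), light_on).

# Data collection, feature selection, and labeling: all human decisions.
labeled = [
    ((7, 1), 1), ((8, 1), 0), ((12, 1), 0), ((13, 0), 0),
    ((18, 1), 1), ((19, 1), 1), ((20, 0), 0), ((21, 1), 1),
    ((22, 1), 1), ((23, 0), 0), ((6, 1), 1), ((16, 1), 0),
]

# We decide how much data to hold back for testing.
train, test = labeled[:8], labeled[8:]


def distance(a, b):
    """How different two examples are. The weight of 10 on occupancy
    is another human choice, not something the machine discovers."""
    return abs(a[0] - b[0]) + 10 * abs(a[1] - b[1])


def predict(features, examples):
    """'Training' here is just remembering examples; prediction copies
    the label of the most similar remembered example (1-nearest neighbor)."""
    nearest = min(examples, key=lambda ex: distance(features, ex[0]))
    return nearest[1]


def evaluate(examples, held_out):
    """Testing: return (accuracy, number of false positives)."""
    correct = 0
    false_positives = 0
    for features, label in held_out:
        guess = predict(features, examples)
        correct += guess == label
        false_positives += guess == 1 and label == 0
    return correct / len(held_out), false_positives


accuracy, false_positives = evaluate(train, test)
print(accuracy, false_positives)
```

With this made-up data the classifier is right on three of the four held-out examples and turns the light on once when it should not have (a false positive at 4 pm), which mirrors the two points of emphasis: people made every design decision, and the result is still imperfect.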

Following the video, the activity invites learners to design a classifier for detecting credit card fraud. This activity reflects the AI literacy paper's suggestion of hands-on experimentation. We use a pseudo-coding activity, based on the common experience of fraudulent credit card transactions, to reinforce the concepts (features, rules, classification) from the video. The key part of the activity is identifying the data necessary for fraud prediction: the input features, their significance, and where they can be found. By requiring learners to reason about these decisions, the activity helps them understand the constraints and limitations of machine learning. The post-activity discussion pushes learners to discuss and compare their classifiers. Much like the cooking activity in Module #1, this discussion emphasizes the human, subjective process of ML: “what did YOU want to achieve with YOUR classifier?”
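For facilitators who want a concrete reference point, a learner's pseudo-coded classifier might translate into something like the sketch below. Everything in it is hypothetical: the `Transaction` fields, the thresholds, the point weights, and the `is_fraud` function are invented for illustration and are not prescribed by the activity. A different group could reasonably choose different features and rules.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical input features chosen by the learner.
    amount: float        # purchase amount in dollars
    km_from_home: float  # distance from the cardholder's usual location
    hour: int            # local hour of day, 0-23


def is_fraud(tx: Transaction) -> bool:
    """Rule-based binary classifier: True means 'flag as possible fraud'.

    The weights express how significant the learner judged each feature
    to be; the threshold of 3 is likewise a human choice.
    """
    score = 0
    if tx.amount > 1000:        # unusually large purchase
        score += 2
    if tx.km_from_home > 500:   # far from where the card is normally used
        score += 2
    if tx.hour < 5:             # middle of the night
        score += 1
    return score >= 3


print(is_fraud(Transaction(amount=25.0, km_from_home=3, hour=14)))
print(is_fraud(Transaction(amount=40.0, km_from_home=900, hour=2)))
```

The second transaction is flagged even though it could simply be a traveler buying a late-night snack, which is a useful prompt for the discussion: what did this group want their classifier to achieve, and what errors were they willing to accept?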

To reinforce this point, the module closes with a reflection on the importance of human judgment in ML. It thereby furthers the goal of engendering agency (i.e., machines learn what WE teach them to learn).

