How Machines Learn

Suggested time: 25 min

“Machine learning” refers to the techniques that computers use to automatically find patterns in data. Computers then use these patterns to draw inferences or make predictions. We call this learning from data. In this module, we’ll explore how machines learn from data and how people can critically interpret the data that machines learn from.


Watch “Learning From Data” (15 min)

A classifier is an algorithm that makes predictions about the world by assigning labels to observations. Very often, predictions made by classifiers are then turned into actions, and so they affect the world. We will now watch a video in which we design a smart light classifier: an AI that decides whether to turn the light on or off. This AI will automate a task that is usually done by a human, and we will say that this AI works well if it is able to guess whether a human would have turned the light on or off.

As you watch the video, make notes of any words you don’t recognize or concepts that feel confusing.

Or view this video on YouTube.

Group discussion (10 min)

As we discussed in Module 1, data is the key to AI. “Data” is commonly used as another word for information. Data is often gathered in a specific format suitable for use on a computer (e.g., a spreadsheet), but it can come in many forms.

The first kind of data is the input. We call the input to the classifier (e.g., whether it’s dark outside and whether it’s bedtime) the features. When we design AI tools, we use our everyday intelligence to decide which features to use as input.
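To make the idea of features concrete, here is a minimal Python sketch. It assumes the two example features mentioned above (whether it is dark outside and whether it is bedtime). The names `Observation` and `predict_light_on`, and the rule itself, are hypothetical illustrations and are not taken from the video. The rule is written by hand here, whereas a real classifier would learn its rule from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One moment in time, described by the two example features."""
    dark_outside: bool
    bedtime: bool


def predict_light_on(obs: Observation) -> bool:
    """Hypothetical hand-written rule (an assumption, not a learned one):
    turn the light on when it is dark outside and it is not yet bedtime."""
    return obs.dark_outside and not obs.bedtime


evening = Observation(dark_outside=True, bedtime=False)
late_night = Observation(dark_outside=True, bedtime=True)

print(predict_light_on(evening))     # True
print(predict_light_on(late_night))  # False
```

Choosing these two features is a human design decision: the classifier can only use the information we decide to give it.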

The second kind of data we discussed in Module 1 is the parameters of the algorithm. We did not discuss parameters explicitly in this video.

The third kind of data is the output. Our smart light classifier makes predictions about whether the light should be on or off. These predictions are the classifier’s output, and we call them outcome labels or labels. Like many AI tools, the smart light helps automate a task that was otherwise done by a human. By observing how humans performed this task in the past, we can decide whether the output of the classifier is correct or incorrect. That is, by comparing predicted labels to “true” labels we can quantify the classifier’s predictive accuracy.
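The comparison of predicted labels to “true” labels can be sketched in a few lines of Python. The function name `accuracy` and the five example labels below are made up for illustration and do not come from the video. Accuracy is computed here as the fraction of predictions that match what the human actually did.

```python
def accuracy(predicted, true):
    """Return the fraction of predicted labels that match the true labels."""
    if len(predicted) != len(true):
        raise ValueError("predicted and true must have the same length")
    if not true:
        raise ValueError("at least one label is required")
    matches = sum(p == t for p, t in zip(predicted, true))
    return matches / len(true)


# Illustrative data only: what the classifier guessed vs. what a human did.
predicted_labels = ["on", "off", "on", "on", "off"]
true_labels = ["on", "off", "off", "on", "off"]

print(accuracy(predicted_labels, true_labels))  # 0.8 (4 of 5 match)
```

In this made-up example the classifier disagrees with the human once out of five times, so its predictive accuracy is 80%.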

The fourth kind of data is human judgment. We will discuss this kind of data during our wrap-up discussion.

As a group, discuss the following questions:

  • Are there any concepts or terms that came up in the video that feel confusing?
  • How does the machine “know” what it “knows”?
  • What similarities and differences do you see when you compare how machines learn with how humans learn?

Facilitator tip: For groups larger than 6, consider using breakout sessions (these work both virtually and in person) with smaller groups to create richer opportunities for participation.

