Friday, July 1, 2022

Machine Learning: what it is, what it's not, and links to go deeper

Cassie Kozyrkov is an Artificial Intelligence (AI) / Machine Learning (ML) expert and evangelist at Google. She published a set of training sessions called Making Friends with Machine Learning on YouTube. They are excellent, all six hours of them. To dive deep, watch them.

This article summarizes her introductory Machine Learning presentation.

To get a flavor for her style, let’s contrast Wikipedia’s definition of ML with hers. Wikipedia: “Machine learning is a field of inquiry devoted to understanding and building methods that ‘learn,’ that is, methods that leverage data to improve performance on some set of tasks.” Cassie: “Thing labeling with examples and truth-finding.” She expresses complexity with simplicity.

Machine learning is an approach to having computers make an enormous number of small decisions.

  • It is fundamentally different from traditional computer programming
  • Artificial intelligence succeeds at very complicated tasks that programmers can’t write instructions for by hand
  • Think of it as automating the ineffable
Let’s compare the two approaches.

A traditional computer program accepts information input and processes it via statements executed in sequential, conditional, or iterative order. The program’s code is a human-readable collection of statements, structures, and algorithms that automate a process. More simply, it is a human-readable recipe for solving a problem.
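As a minimal, hypothetical illustration of that human-readable recipe (the rule and keywords here are invented for this example, not taken from Cassie’s talk):

    # A traditional program: a human decides the rule and writes it down explicitly.
    def looks_like_spam(subject: str) -> bool:
        # The programmer hand-picks the keywords and the decision logic.
        suspicious_words = {"winner", "free", "urgent"}
        return any(word in subject.lower() for word in suspicious_words)

    print(looks_like_spam("URGENT: claim your free prize"))   # True
    print(looks_like_spam("Meeting notes for Friday"))        # False

Every decision the program makes can be traced back to a line a person wrote and can read.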

In contrast, machine learning uses a lot of raw data and sophisticated math to generate algorithms. The implementation details of all of this are beyond human comprehension at anything but an abstract or theoretical level. They are beyond our comprehension because of the sheer volume of equations and because of their use of dimensions beyond length, width, depth, and time. We humans cannot practically problem-solve in more than four dimensions. But with computer automation, our math can.

Data Scientists and Engineers don’t code the generative algorithms; they select them from a finite number invented by researchers, for example Neural Networks, Random Forests, or Logistic Regression. They feed these generative algorithms mountains of data and consume enormous amounts of compute resources to generate the Machine Learning algorithms, called models, that are the computer programs they “train.”
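A sketch of what “selecting rather than coding” can look like in practice. The article names no library or data set; scikit-learn and the toy numbers below are my own assumptions:

    # Selecting an off-the-shelf learning algorithm instead of hand-coding the rules.
    from sklearn.linear_model import LogisticRegression

    # Toy examples: features are (hours studied, hours slept); labels are pass (1) / fail (0).
    X = [[1, 4], [2, 8], [8, 7], [9, 6]]
    y = [0, 0, 1, 1]

    model = LogisticRegression()      # a researcher-invented algorithm, picked off the shelf
    model.fit(X, y)                   # "training" generates the model from the examples
    print(model.predict([[7, 7]]))    # the label the trained model assigns to a new instance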

Data Scientists and Engineers train their Machine Learning models by providing huge data sets of labeled examples to the researchers’ generative algorithms. Here’s the general structure of the data (a tiny invented example follows the list):

  1. An “instance” is an example, a.k.a. an observation; a row in a spreadsheet.
  2. A “label” is the answer, a.k.a. the target, the output, or the ideal output for the example.
  3. A “feature” is something we know about the example, a.k.a. a variable; a column in a spreadsheet.
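A tiny, invented example of that layout (the column names and values are hypothetical, not from the talk):

    # Two instances (rows); "weight_kg" and "purrs" are features; "label" holds the answer.
    instances = [
        {"weight_kg": 4.2,  "purrs": True,  "label": "cat"},
        {"weight_kg": 38.0, "purrs": False, "label": "dog"},
    ]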

The algorithms iterate over the instances, evaluating their features and trying an astronomically high number of variations until they, maybe, learn to discern the correct labels not only for their training data but also for data they have never seen, i.e., data they didn’t train on. For example, the ability to recognize the sound of the words “hello there” spoken by 100 million unique voices.
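One common way to check that kind of generalization is to hold some examples back from training. A minimal sketch, again assuming scikit-learn and invented data:

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # 20 instances with one feature each; the labels split them into two classes.
    X = [[i] for i in range(20)]
    y = [0] * 10 + [1] * 10

    # Hold back 25% of the examples; the model never sees them during training.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
    model = LogisticRegression().fit(X_train, y_train)
    print(model.score(X_test, y_test))   # accuracy on data the model has never seen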

More formally, then: Machine Learning is an approach to making many small decisions that involves algorithmically finding patterns in data and using those patterns to make recipes that deal correctly with brand-new data.

Machine Learning was first realized in code in 1952, and its theoretical birth, the invention of the neural network, occurred in 1949. But it only took off when compute and storage resources became fast enough and large enough to handle the massive processing needed to generate useful models.

Despite the illusion created by talking to Alexa or Siri, the Machine Learning models we use daily are not conscious or “alive” in any meaningful sense. Humans are required throughout. We curate the enormous labeled data sets of examples. We select and try many models. We verify the results. Even our successful models are imperfect and, over time, require additional human intervention to identify emergent mistakes and regenerate the models to maintain their accuracy. If you are its owner, think of an ML model as a high-interest credit card that is never fully paid back.

Cassie's view is that the biggest problem in AI is using all the right math to answer the wrong question. “We must ask the right questions and solve the right problems. Machine learning will optimize whatever we give it. Give it the wrong thing and you’re in big trouble. Much like the genie in the lamp, the genie is not the problem; it is the wisher. And the wisher does not have to be malign. They can just be foolish and not think through the likely outcomes of their choice.”

The type of label needed drives the high-level approach to machine learning: classification or prediction (a small code sketch follows the list).

  1. Binary classification: image recognition of cat / not cat.
  2. Multi-class classification: image recognition of a cat, dog, or weasel.
  3. Prediction of a numerical outcome: $12.14, $10.67, etc.
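A minimal sketch of how that choice shows up in code; the model choices and data are my own illustration, assuming scikit-learn:

    from sklearn.linear_model import LinearRegression, LogisticRegression

    # 1 & 2. Classification (binary or multi-class): a classifier predicts a label.
    clf = LogisticRegression().fit(
        [[0.9], [1.1], [4.9], [5.1], [8.9], [9.1]],
        ["cat", "cat", "dog", "dog", "weasel", "weasel"],
    )
    print(clf.predict([[5.0]]))          # -> ['dog']

    # 3. Prediction of a numerical outcome: a regressor predicts a number.
    reg = LinearRegression().fit([[1], [2], [3]], [12.14, 10.67, 9.20])
    print(reg.predict([[4]]))            # -> a dollar-like number, roughly 7.7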

Types of machine learning:

  1. Supervised learning: for any example you give the system, you have the correct label handy. Keywords: labeled data.
  2. Unsupervised learning is the search for patterns. You have data but no labels. Keywords: data mining and clustering (a small clustering sketch follows this list).
  3. Semi-supervised learning is the blend of supervised and unsupervised where we have some, but not all, data labeled. Keyword: partial guidance.
  4. Reinforcement learning. Here, the system takes a sequence of actions towards a goal that leads to success or failure. For example, learning to play a game. Keywords: sequence of actions, reward/punishment, delayed feedback, system influences its environment (and inputs).
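To make the supervised / unsupervised distinction concrete, here is a minimal unsupervised sketch that clusters unlabeled points with k-means (the library and the data are my assumptions, not from the talk):

    # Unsupervised learning: no labels, only a search for structure in the data.
    from sklearn.cluster import KMeans

    points = [[1.0, 1.0], [1.2, 0.8], [9.0, 9.0], [9.1, 8.8]]   # no labels attached
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
    print(clusters)   # e.g. [1 1 0 0] -- two groups the algorithm found on its own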

Cassie points out that reinforcement learning is really, really hard. It rarely works, but when it does, it’s magic. Think of self-driving cars: their modeling is based on game algorithms.

So how do you know if machine learning might be a fit for your problem? We’ve already covered that the problem should be ineffable, meaning it’s not practical for a programmer to write code to solve it. The next question is: can you imagine what sort of decisions or labels the machine learning system would make for you? If you cannot answer that question, then stop. It’s too early. You’ll need to derive insight by analyzing the problem space using descriptive analytics first.

Which brings us to a basic process description of Data Science (sketched in code after the steps):

  1. Use descriptive analytics to get inspired and discover a problem machine learning may solve for you. 
  2. Problem in hand, use machine learning to generate a recipe, i.e., use technology to create the model. 
  3. Test the efficacy of the model using statistics. Remember the Genie; decide wisely.
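A compressed, hypothetical sketch of those three steps; pandas and scikit-learn are my assumptions, and the data is invented:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # 1. Descriptive analytics: look at the data and get inspired.
    df = pd.DataFrame({"hours": [1, 2, 3, 7, 8, 9], "passed": [0, 0, 0, 1, 1, 1]})
    print(df.describe())

    # 2. Use machine learning to generate the recipe (the model).
    X_train, X_test, y_train, y_test = train_test_split(
        df[["hours"]], df["passed"], test_size=0.33, stratify=df["passed"], random_state=0
    )
    model = LogisticRegression().fit(X_train, y_train)

    # 3. Test the model's efficacy on data it never saw during training.
    print(model.score(X_test, y_test))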

Links to Cassie’s videos:

  1. http://bit.ly/mfml_part1
  2. http://bit.ly/mfml_part2
  3. http://bit.ly/mfml_part3

Link to Cassie on Medium: https://medium.com/@kozyrkov
