Simple Explanation of Machine Learning in Google’s Algorithm(s)


In SEO circles we talk a lot about Machine Learning, but what is the actual process involved? Here is a simple one-minute explanation that helps demystify it.

The Three Ingredients of Machine Learning

  1. data;
  2. maths;
  3. intuition.

The Three Steps of Machine Learning in Google Search

  1. features;
  2. human labelled data;
  3. learning.

Human-Defined Features for the Machine Learning Algorithms

The Google engineer tells the Machine Learning Algorithm which factors (technically called features) they think are important, and provides strict rules as to what counts as success and failure.

Feeding the Machine Learning Algorithms with Human Labelled Data

The Google engineer then feeds the machine a vast number of human-labelled examples of good and bad results across a range of different scenarios (for the Blue Link Algorithm these are search queries; for the Knowledge Vault Algorithm they are “facts”).

The Machine Learning Algorithms Adapt and Improve

The machine then figures out the weights for the features that will provide quality results whatever the input (i.e. even for new examples the machine has never seen before). The Machine Learning Algorithm then runs and provides live results.
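To make the "figuring out the weights" step concrete, here is a minimal sketch of one common technique (logistic regression trained by gradient descent). It is an illustration only, not Google's actual method: the feature names, the values and the labels are all invented for the example.

```python
import math

# Hypothetical feature vectors: [relevance, freshness, page_speed].
# Names, values and labels are invented for illustration only.
examples = [
    ([0.9, 0.8, 0.7], 1),  # human label: good result
    ([0.8, 0.3, 0.9], 1),
    ([0.2, 0.9, 0.8], 0),  # human label: bad result
    ([0.1, 0.2, 0.4], 0),
    ([0.7, 0.6, 0.2], 1),
]

def predict(weights, bias, features):
    """Return a score between 0 and 1 for one result."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, learning_rate=0.5):
    """Learn one weight per feature from the human-labelled examples."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # How far the current score is from the human label.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

weights, bias = train(examples)

# A result the model has never seen before still gets a score.
unseen = [0.85, 0.5, 0.5]
score = predict(weights, bias, unseen)
```

In this toy data the good results all have high relevance, so the learned weight for that feature ends up positive and the unseen example, which also has high relevance, scores above 0.5.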

But that is just the setup!

The Machine Learning Cycle

This is a continuous process. Google (and Bing) are continuously giving the algorithms feedback so they can improve themselves. This feedback is in the form of labelled data – either corrective (when the machine got the result wrong) or reinforcement (when the machine got the result right).

After step 3 set out above, the following three-step process runs as a continuous cycle:

  1. assess and label the results;
  2. engineers tweak the algorithm;
  3. the algorithm is fed labelled data.

Human Quality Raters Assess and Label the Data

Google has teams of human quality raters (Bing calls them judges) who assess the results and label them “success” (reinforcement) or “failure” (corrective).

Google’s Quality Rater Guidelines

Importantly, the concept of success or failure is granular and the exact nature of the labelling depends on the case scenarios (context). Google do not publicly share the exact annotations or specific rules that represent success and failure.

Google and Bing also receive feedback through the SERP.

Engineers Tweak the Algorithm

The human labelled data provided by the quality raters (and also user feedback through the Google SERP) is used by the algorithm teams to tweak the features and the rules.

These changes are made both when the engineers see a way to help the machine achieve success more efficiently or effectively, and when the definition of “success” and “failure” needs adjusting as Google’s SERP offering evolves.

The Algorithms Learn Through Human Feedback

The labelled data is fed back to the machine. The negative feedback is used by the algorithms to adjust and improve (correction). The positive feedback reinforces what the algorithms have already learned (reinforcement). It sounds very much like human learning!
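The feedback cycle can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how Google or Bing implement it: the two features, the rated results and the function names `score` and `feedback_update` are all hypothetical.

```python
import math

def score(weights, features):
    """Model's confidence (0 to 1) that a result is good."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def feedback_update(weights, features, rater_label, learning_rate=0.5):
    """Nudge the weights towards the rater's label (1 = success, 0 = failure).

    The step is proportional to the gap between score and label, so a wrong
    answer causes a large correction and a right answer a small reinforcement.
    """
    error = rater_label - score(weights, features)
    return [w + learning_rate * error * x for w, x in zip(weights, features)]

# Hypothetical rated results: (feature vector, label the rater assigns).
# Features are [relevance, keyword_density]; the trailing 1.0 is a bias term.
rated_results = [
    ([0.9, 0.1, 1.0], 1),
    ([0.2, 0.9, 1.0], 0),
    ([0.8, 0.3, 1.0], 1),
    ([0.1, 0.7, 1.0], 0),
]

weights = [0.0, 0.0, 0.0]
mistakes_per_cycle = []
for cycle in range(20):
    mistakes = 0
    for features, rater_label in rated_results:
        # 1. The live model produces a result.
        predicted = 1 if score(weights, features) >= 0.5 else 0
        # 2. The rater's label tells us whether it was a success or a failure.
        if predicted != rater_label:
            mistakes += 1
        # 3. The label is fed back and the weights adjust.
        weights = feedback_update(weights, features, rater_label)
    mistakes_per_cycle.append(mistakes)
```

With this toy data the model makes mistakes in the first cycle and none by the last one, which is the point of the loop: each round of labelled feedback leaves the weights a little closer to the raters' judgement.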

An Overview of Machine Learning in Google Search

A simple way to view this whole process is to see the algorithm as simply a measuring model… the model measures success and failure and adapts itself accordingly.

But crucially, humans play a central role. Machines don’t have free rein – the algorithm is built by humans who (through examples) provide a definition of right and wrong.

It is also humans who create and maintain the platform that defines which features are important… or not. Machine learning simply balances all the features to best satisfy that human judgement.

Machine learning in Google Search (illustration)
Machines are trained offline (on the left), then work in the real world of Search, then humans judge the results and the data is fed back into the offline training.
“Machines dancing with humans”, as Andrea Volpini from Wordlift says 🙂

How Fast Do Machine Learning Algorithms Improve?

One word: exponentially.

Fabrice Canel (the Principal Product Manager for BingBot) explained to me that the algorithms at Bing are now almost completely Machine Learning and that they are improving exponentially.

Gary Illyes from Google stated that all Search Engines function the same way, so we can safely assume the same to be true at Google. They have:

  1. the same audience (humans searching for the solution to a problem);
  2. the same goal (providing the solution to that problem as efficiently as possible);
  3. the same technology stack.

The word “exponentially” strikes me as important. And probably underestimated by most of us. Exponentially improving means “improving at an accelerating rate”. The hockey stick analogy really doesn’t do this justice.
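A small piece of arithmetic shows what "accelerating" means here. The numbers below are purely illustrative assumptions (they are not measurements from Bing, Google or Kalicube Pro): one series improves by a fixed amount each period, the other by a fixed percentage.

```python
# Illustrative numbers only: not measurements from any search engine.
start_quality = 1.0
linear_step = 0.5   # fixed gain per period
growth_rate = 0.5   # 50% gain per period

linear = [start_quality + linear_step * t for t in range(6)]
exponential = [start_quality * (1 + growth_rate) ** t for t in range(6)]

# Gain between consecutive periods.
linear_gains = [b - a for a, b in zip(linear, linear[1:])]
exponential_gains = [b - a for a, b in zip(exponential, exponential[1:])]
```

The linear gains stay the same from one period to the next, while each exponential gain is bigger than the one before it, because every improvement is a percentage of an already improved base.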

Machine Learning is Improving Exponentially in Search Algorithms (Illustration)
Although seeing this in the SERPs is difficult, the data in the Kalicube Pro database (billions of datapoints) provides a glimpse. And it is quite a scary sight!

How Does Machine Learning in Google’s Search Algorithm(s) Fit into Brand SERP Optimisation and Knowledge Panel Management?

All Google’s algorithms use Machine Learning extensively. That means Machine Learning is critical to every aspect of Brand SERPs and Knowledge Panels, including:
