Simple Explanation of Machine Learning in Google’s Algorithm(s)
In SEO circles we talk a lot about Machine Learning, but what is the actual process involved? Here is a simple one-minute explanation that helps demystify it.
The Three Ingredients of Machine Learning
- data;
- maths;
- intuition.
The Three Steps of Machine Learning in Google Search
- features;
- human labelled data;
- learning.
Human-Defined Features for the Machine Learning Algorithms
The Google engineer tells the Machine Learning Algorithm which factors (technically called features) they think are important, and provides strict rules as to what counts as success and failure.
Feeding the Machine Learning Algorithms with Human Labelled Data
The Google engineer then feeds the machine with a vast number of different human-labelled examples of good and bad results for a range of different scenarios (for the Blue Link Algorithm the examples are search queries; for the Knowledge Vault Algorithm they are “facts”).
The Machine Learning Algorithms Adapt and Improve
The machine then figures out the different weights for the features that will provide quality results in any circumstance, whatever the input (i.e. even for new examples the machine has never seen before). The Machine Learning Algorithm then runs and provides live results.
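The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not Google's method: the feature names, the six labelled examples and the choice of logistic regression trained by gradient descent are all assumptions made for the sake of the example.

```python
import math

# Step 1: features chosen by the engineer.
# Hypothetical names, NOT Google's actual ranking features.
FEATURES = ["topical_match", "page_speed", "spam_signals"]

# Step 2: hypothetical human-labelled examples.
# Each entry is (feature values between 0 and 1, label): 1 = good, 0 = bad.
LABELLED = [
    ([0.9, 0.8, 0.1], 1),
    ([0.8, 0.4, 0.0], 1),
    ([0.7, 0.9, 0.2], 1),
    ([0.2, 0.9, 0.1], 0),
    ([0.6, 0.5, 0.9], 0),
    ([0.1, 0.2, 0.8], 0),
]

def predict(weights, bias, x):
    """Return the model's estimated probability that a result is good."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=2000, lr=0.5):
    """Step 3: learn one weight per feature by gradient descent."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    for _ in range(epochs):
        for x, label in examples:
            error = predict(weights, bias, x) - label
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias

weights, bias = train(LABELLED)

# New inputs the model has never seen before (also hypothetical).
unseen_good = [0.85, 0.6, 0.05]
unseen_bad = [0.1, 0.5, 0.9]

print(dict(zip(FEATURES, [round(w, 2) for w in weights])))
print(round(predict(weights, bias, unseen_good), 2))
print(round(predict(weights, bias, unseen_bad), 2))
```

The humans never set the weights by hand. They choose the features and label the examples, and the training loop finds the weights that best reproduce those labels.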
But that is just the setup!
The Machine Learning Cycle
This is a continuous process. Google (and Bing) are continuously giving the algorithms feedback so they can improve themselves. This feedback is in the form of labelled data – either corrective (when the machine got the result wrong) or reinforcement (when the machine got the result right).
After step 3 set out above, the following three-step process runs as a continuous cycle:
- assess and label the results;
- engineers tweak the algorithm;
- the algorithm is fed labelled data.
Human Quality Raters Assess and Label the Data
Google has teams of human quality raters (Bing calls them judges) who assess the results and label them “success” (reinforcement) or “failure” (corrective).
Importantly, the concept of success or failure is granular and the exact nature of the labelling depends on the scenario (context). Google do not publicly share the exact annotations or specific rules that represent success and failure.
Google and Bing also receive feedback from users through the SERP.
Engineers Tweak the Algorithm
The human labelled data provided by the quality raters (and also user feedback through the Google SERP) is used by the algorithm teams to tweak the features and the rules.
These changes will be made both when the engineers see a way to help the machine achieve success more efficiently or effectively, and also to adjust what “success” and “failure” look like as Google’s SERP offering evolves.
The Algorithms Learn Through Human Feedback
The labelled data is fed back to the machine. The negative feedback is used by the algorithms to adjust and improve (corrective learning). The positive feedback reinforces what the algorithms have already learned (reinforcement learning). It sounds very much like human learning!
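One way to picture this feedback loop is the toy Python sketch below. It makes several assumptions: the two feature names, the starting weights and the four rater judgements are invented, and the update rule is a plain logistic-regression step rather than anything Google or Bing has confirmed. "Reinforcement" is used here in the everyday sense of confirming a correct result.

```python
import math

def score(weights, x):
    """Estimated probability that a result is good (simple logistic model)."""
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def apply_feedback(weights, x, rater_label, lr=0.5):
    """Update the weights from one rater judgement and say which kind it was."""
    p = score(weights, x)
    was_right = (p >= 0.5) == (rater_label == 1)
    error = p - rater_label  # small when the model was right, large when wrong
    new_weights = [w - lr * error * v for w, v in zip(weights, x)]
    kind = "reinforcement" if was_right else "corrective"
    return new_weights, kind

# Hypothetical starting weights for two made-up features:
# [topical_match, keyword_repetition]
start_weights = [1.0, 1.0]

# Hypothetical rater judgements: (feature values, label), 1 = success, 0 = failure.
feedback = [
    ([0.9, 0.1], 1),
    ([0.2, 0.9], 0),
    ([0.8, 0.2], 1),
    ([0.1, 0.8], 0),
]

weights = list(start_weights)
kinds = []
for x, label in feedback:
    weights, kind = apply_feedback(weights, x, label)
    kinds.append(kind)

print(kinds)
print([round(w, 3) for w in weights])
```

In this sketch every judgement nudges the weights. A result the model got wrong produces a large correction, and a result it got right produces a small confirming step in the same direction it was already leaning.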
An Overview of Machine Learning in Google Search
A simple way to view this whole process is to see the algorithm as simply a measuring model… the model measures success and failure and adapts itself accordingly.
But crucially, humans play a central role. Machines don’t have free rein – the algorithm is built by humans who (through examples) provide a definition of right and wrong.
It is also humans who create and maintain the platform that defines which features are important… or not. Machine learning simply balances all the features to best satisfy that human judgement.
How Fast Do Machine Learning Algorithms Improve?
One word: exponentially.
Fabrice Canel (the Principal Product Manager for BingBot) explained to me that the algorithms at Bing are now almost completely Machine Learning and that they are improving exponentially.
Gary Illyes from Google stated that all Search Engines function the same way, so we can safely assume the same to be true at Google. They have:
- the same audience (humans searching for the solution to a problem);
- the same goal (providing the solution to that problem as efficiently as possible);
- the same technology stack.
The word “exponentially” strikes me as important, and it is probably underestimated by most of us. Improving exponentially means improving at an accelerating rate: each gain is bigger than the one before it. The hockey stick analogy really doesn’t do this justice.
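To make the difference between steady and exponential improvement concrete, here is a small Python sketch. The numbers are purely illustrative (a made-up "quality score" that either rises by one point per period or doubles each period) and are not measurements from Bing or Google.

```python
def linear_growth(start, step, periods):
    """Quality that improves by the same fixed amount every period."""
    return [start + step * t for t in range(periods + 1)]

def exponential_growth(start, rate, periods):
    """Quality that improves by the same proportion every period."""
    return [start * (1 + rate) ** t for t in range(periods + 1)]

def gains(series):
    """Improvement from each period to the next."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical values: start at 1, then add 1 per period or double per period.
linear = linear_growth(start=1, step=1, periods=5)
exponential = exponential_growth(start=1, rate=1, periods=5)

print(linear, gains(linear))
print(exponential, gains(exponential))
```

With steady improvement the gain per period never changes. With exponential improvement each gain is larger than the last, which is why the gap between the two widens so quickly.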
How Does Machine Learning in Google’s Search Algorithm(s) Fit into Brand SERP Optimisation and Knowledge Panel Management?
All Google’s algorithms use Machine Learning extensively. That means Machine Learning is critical to every aspect of Brand SERPs and Knowledge Panels, including:
- Blue Link Algorithm
- Featured Snippet Algorithm
- Natural Language Processing Algorithms
- Video Boxes Algorithms
- Filter Pills
- The Whole Page Algorithm
- GoogleBot
- Knowledge Extraction Algorithm
- The Knowledge Panel Algorithms
- The Knowledge Vault Algorithm
- … and everything else!