Google’s Knowledge Extraction Algorithm – What You Need to Know
What is Google’s Knowledge Extraction Algorithm?
The Knowledge Extraction Algorithm attempts to structure the data collected by Googlebot so that it can annotate the information before it is added to Google’s Web Index.
The annotation indicates what the information is: an image, a video, a heading, a data table, a paragraph, an aside, a menu, and so on.
Additionally, the algorithm assigns each annotation a score that indicates its level of confidence in that annotation.
The vast majority of data on the web is unstructured, so much of the time the Knowledge Extraction Algorithm needs to make a best guess at the structure of the information. The annotations for this type of information will have a low confidence score.
Some data is structured using Schema.org markup, HTML tables, lists, headings and other structuring techniques. The annotations for this type of information will have a high confidence score.
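To make the idea concrete, here is a toy sketch in Python. It is not Google’s implementation: the tag mapping, the `annotate` function, the guessing rule and the confidence values are all assumptions invented for illustration. It only shows the principle described above, that explicit structure leads to a high-confidence annotation while unstructured content forces a low-confidence best guess.

```python
from dataclasses import dataclass

# Hypothetical mapping from explicit HTML tags to annotation kinds.
EXPLICIT_TAGS = {
    "img": "image",
    "video": "video",
    "h1": "heading",
    "h2": "heading",
    "table": "data table",
    "p": "paragraph",
    "aside": "aside",
    "nav": "menu",
}


@dataclass
class Annotation:
    kind: str          # what the information is
    confidence: float  # 0.0 to 1.0, an illustrative scale


def annotate(tag: str, text: str) -> Annotation:
    """Annotate one fragment of a page (illustrative only)."""
    tag = tag.lower()
    if tag in EXPLICIT_TAGS:
        # The markup states what this is, so confidence is high.
        # The value 0.9 is made up for the example.
        return Annotation(EXPLICIT_TAGS[tag], 0.9)
    # Generic container such as <div>: make a best guess from the text.
    if len(text.split()) <= 8 and not text.endswith("."):
        return Annotation("heading", 0.4)
    return Annotation("paragraph", 0.3)


structured = annotate("h2", "Our pricing plans")
unstructured = annotate("div", "Our pricing plans")
print(structured)
print(unstructured)
```

Both fragments end up annotated as a heading, but only the one marked up with a real heading tag receives the high score.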
The annotations are used by the ranking algorithms and other Knowledge Algorithms to access the information. The confidence scores are used by those algorithms to assess the reliability of each annotation, and so they affect the algorithms’ decision-making.
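As a rough illustration of how a downstream algorithm might take the confidence score into account, consider the Python sketch below. Everything in it is hypothetical: the `Annotation` type, the `reliable_annotations` function and the 0.5 threshold are assumptions made for the example, not a description of how Google’s algorithms actually behave.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    kind: str          # e.g. "heading", "data table"
    confidence: float  # 0.0 to 1.0, an illustrative scale


def reliable_annotations(
    annotations: List[Annotation], min_confidence: float = 0.5
) -> List[Annotation]:
    """Keep only annotations the consumer is willing to rely on.

    The threshold is an assumption for this sketch; a real system
    could just as well weight annotations instead of filtering them.
    """
    return [a for a in annotations if a.confidence >= min_confidence]


page = [
    Annotation("heading", 0.9),     # from explicit markup
    Annotation("data table", 0.9),  # from an HTML table
    Annotation("heading", 0.4),     # best guess on unstructured text
]

for annotation in reliable_annotations(page):
    print(annotation.kind, annotation.confidence)
```

In this sketch, the guessed heading is ignored because its score falls below the threshold, while the two annotations backed by explicit structure are used.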
In the Knowledge Panel Course in the Kalicube Academy, we cover how this Knowledge Extraction Algorithm functions and the strategies you can use to help it understand your content.