Friday, November 22, 2024

Google’s “Branded Search” Patent For Ranking Search Results


Back in 2012 Google applied for a patent called “Ranking Search Results” that shows how Google can use branded search queries as a ranking factor. The patent describes using branded search queries and navigational queries as ranking factors, together with a count of independent links. Although this patent is from 2012, it may still play a role in ranking.

The patent was misunderstood by the search marketing community in 2012 and the knowledge contained in it was lost.

What Is The Ranking Search Results Patent About? TL/DR

The patent is explicitly about an invention for ranking search results, which is why it is called “Ranking Search Results.” It describes an algorithm that uses two ranking factors to re-rank web pages:

Sorting Factor 1: By number of independent inbound links
This is a count of links that are independent from the site being ranked.

Sorting Factor 2: By number of branded search queries & navigational search queries.
The branded and navigational search queries are called “reference queries” and also are referred to as implied links.

The counts of both factors are used to modify the rankings of the web pages.

Why The Patent Was Misunderstood TL/DR

First, I want to say that in 2012, I didn’t understand how to read patents. I was more interested in research papers and left the patent reading to others. When I say that everyone in the search marketing community misunderstood the patent, I include myself in that group.

The “Ranking Search Results” patent was published in 2012, one year after the release of a content quality update called the Panda Update. The Panda update was named after one of the engineers who worked on it, Navneet Panda. Navneet Panda came up with questions that third party quality raters used to rate web pages. Those ratings were used as a test to see if changes to the algorithm were successful at removing “content farm” content.

Navneet Panda is also a co-author of the “Ranking search results” patent. SEOs saw his name on the patent and immediately assumed that this was the Panda patent.

That assumption is wrong because the Panda update is an algorithm that uses a “classifier” to classify web pages by content quality. The “Ranking Search Results” patent is about ranking search results, period. It is not about content quality, nor does it feature a content quality classifier.

Nothing in the “Ranking Search Results” patent relates in any way to the Panda update.

Why This Patent Is Not The Panda Update

In 2009 Google released the Caffeine Update which enabled Google to quickly index fresh content but inadvertently created a loophole that allowed content farms to rank millions of web pages on rarely searched topics.

In an interview with Wired, former Google search engineer Matt Cutts described the content farms like this:

“It was like, “What’s the bare minimum that I can do that’s not spam?” It sort of fell between our respective groups. And then we decided, okay, we’ve got to come together and figure out how to address this.”

Google subsequently responded with the Panda Update, which was specifically designed to filter out content farm content and was named after a search engineer who worked on the algorithm. Google used third-party site quality raters to rate websites, and their feedback was used to create a new definition of content quality that was applied against content farm content.

Matt Cutts described the process:

“There was an engineer who came up with a rigorous set of questions, everything from. “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines.

…we actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side. And you can really see mathematical reasons…”

In simple terms, a classifier is an algorithm within a system that categorizes data. In the context of the Panda Update, the classifier categorizes web pages by content quality.

What’s apparent when reading the “Ranking search results” patent is that it’s clearly not about content quality; it’s about ranking search results.

Meaning Of Express Links And Implied Links

The “Ranking Search Results” patent uses two kinds of links to modify ranked search results:

  1. Implied links
  2. Express links

Implied links:
The patent uses branded search queries and navigational queries to calculate a ranking score as if those queries were links, which is why it calls them implied links. The implied links are used to create a factor for modifying the rankings of web pages that are relevant (responsive) to search queries.

Express links:
The patent also uses independent inbound links to the web page as part of another calculation, which produces a factor for modifying the rankings of web pages that are responsive to a search query.

Both of those kinds of links (implied links and independent express links) are used as factors to modify the rankings of a group of web pages.
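To make the idea of an “independent” express link concrete, here is a minimal Python sketch. It is my own illustration, not code from the patent. The article only establishes that independent links are links not controlled by the site being linked to, so the independence test used here (the source and target differ in the last two labels of their hostname) is an assumption, and the names site_key and count_independent_links are hypothetical.

```python
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Hypothetical stand-in for 'who controls this page':
    the last two labels of the hostname (e.g. 'example.com')."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def count_independent_links(target_url: str, source_urls: list[str]) -> int:
    """Count inbound links whose source site differs from the target site.

    Assumption: 'independent' is approximated by a different site key.
    The patent's real independence test is not reproduced here.
    """
    target_site = site_key(target_url)
    return sum(1 for source in source_urls if site_key(source) != target_site)


inbound = [
    "http://blog.example.com/post",   # same site as the target, not independent
    "http://news.other.org/a",        # independent
    "https://www.third.net/b",        # independent
    "http://www.example.com/about",   # same site as the target, not independent
]
independent_count = count_independent_links("http://www.example.com", inbound)
```

With this sample data, two of the four inbound links are treated as independent. A real system would need a much better notion of common control than a hostname comparison.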

Understanding what the patent is about is straightforward because the beginning of the patent explains it in relatively easy to understand English.

This section of the patent uses the following jargon:

  • A resource is a web page or website.
  • A target (target resource) is what is being linked to or referred to.
  • A “source resource” is a resource that makes a citation to the “target resource.”
  • The word “group” means the group of web pages that are relevant to a search query and are being ranked.

The patent talks about “express links,” which are just regular links. It also describes “implied links,” which are references within search queries to a web page (which is called a “target resource”).

I’m going to add bullet points to the original sentences so that they are easier to understand.

Okay, so this is the first important part:

“Links for the group can include express links, implied links, or both.

An express link, e.g., a hyperlink, is a link that is included in a source resource that a user can follow to navigate to a target resource.

An implied link is a reference to a target resource, e.g., a citation to the target resource, which is included in a source resource but is not an express link to the target resource. Thus, a resource in the group can be the target of an implied link without a user being able to navigate to the resource by following the implied link.”

The second important part uses the same jargon to define what implied links are:

  • A resource is a web page or website.
  • The site being linked to or referred to is called a “target resource.”
  • A “group of resources” means a group of web pages.

This is how the patent explains implied links:

“A query can be classified as referring to a particular resource if the query includes a term that is recognized by the system as referring to the particular resource.

For example, a term that refers to a resource may be all of or a portion of a resource identifier, e.g., the URL, for the resource.

For example, the term “example.com” may be a term that is recognized as referring to the home page of that domain, e.g., the resource whose URL is “http://www.example.com”.

Thus, search queries including the term “example.com” can be classified as referring to that home page.

As another example, if the system has data indicating that the terms “example sf” and “esf” are commonly used by users to refer to the resource whose URL is “http://www.sf.example.com,” queries that contain the terms “example sf” or “esf”, e.g., the queries “example sf news” and “esf restaurant reviews,” can be counted as reference queries for the group that includes the resource whose URL is “http://www.sf.example.com.””

The above explanation defines “reference queries” as queries that contain the terms people use to refer to a specific website. So, for example (my example), if people search using “Walmart” together with the keyword Air Conditioner, then the query “Walmart” + Air Conditioner is counted as a “reference query” to Walmart.com. In other words, it’s counted as a citation and an implied link.
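The term matching described in the patent excerpt above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the REFERRING_TERMS table is hypothetical data standing in for whatever the system has learned about how users refer to a resource, and the whole-word matching rule is my choice, not something the patent specifies.

```python
import re

# Hypothetical data: terms recognized as referring to each resource.
# The URLs and terms are the patent's own examples.
REFERRING_TERMS = {
    "http://www.example.com": ["example.com"],
    "http://www.sf.example.com": ["example sf", "esf"],
}


def referenced_resources(query: str) -> set[str]:
    """Return the resources a query refers to (empty set if none).

    Assumption: a term matches only as a whole word or phrase, so the
    term 'esf' does not match inside an unrelated longer word.
    """
    normalized = " ".join(query.lower().split())
    matches = set()
    for resource, terms in REFERRING_TERMS.items():
        for term in terms:
            pattern = r"(?<![\w.])" + re.escape(term) + r"(?!\w)"
            if re.search(pattern, normalized):
                matches.add(resource)
    return matches


news_refs = referenced_resources("example sf news")
reviews_refs = referenced_resources("esf restaurant reviews")
unrelated_refs = referenced_resources("restaurant reviews")
```

Here the first two queries are classified as reference queries for the “http://www.sf.example.com” resource, matching the patent's example, while the third query refers to nothing in the table.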

The Patent Is Not About “Brand Mentions” On Web Pages

Some SEOs believe that a mention of a brand on a web page is counted by Google as if it’s a link. They have misinterpreted this patent to support the belief that an “implied link” is a brand mention on a web page.

As you can see, the patent does not describe the use of “brand mentions” on web pages. It’s crystal clear that the meaning of “implied links” within the context of this patent is about references to brands within search queries, not on a web page.

It also discusses doing the same thing with navigational queries:

“In addition or in the alternative, a query can be categorized as referring to a particular resource when the query has been determined to be a navigational query to the particular resource. From the user point of view, a navigational query is a query that is submitted in order to get to a single, particular web site or web page of a particular entity. The system can determine whether a query is navigational to a resource by accessing data that identifies queries that are classified as navigational to each of a number of resources.”
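The navigational-query path reads like a simple lookup: the system consults data that already says which queries are navigational to which resources. Here is a hedged Python sketch of that idea. The NAVIGATIONAL_QUERIES table and its entries are invented for illustration, and how Google actually builds or stores that data is not covered by the passage.

```python
from collections import Counter

# Hypothetical data: queries already classified as navigational to a resource.
NAVIGATIONAL_QUERIES = {
    "example login": "http://www.example.com",
    "example sf homepage": "http://www.sf.example.com",
}


def count_navigational_references(query_log: list[str]) -> Counter:
    """Tally, per resource, how many logged queries are navigational to it."""
    counts = Counter()
    for query in query_log:
        # Normalize case and whitespace before the lookup (my assumption).
        normalized = " ".join(query.lower().split())
        resource = NAVIGATIONAL_QUERIES.get(normalized)
        if resource is not None:
            counts[resource] += 1
    return counts


log = ["Example Login", "example  login", "cheap flights", "example sf homepage"]
navigational_counts = count_navigational_references(log)
```

In this sample log, two queries count toward “http://www.example.com”, one counts toward “http://www.sf.example.com”, and “cheap flights” counts toward nothing.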

The takeaway then is that the patent describes the use of “reference queries” (branded/navigational search queries) as a factor similar to links, and that’s why they’re called implied links.

Modification Factor

The algorithm generates a “modification factor” which re-ranks (modifies) a group of web pages that are relevant to a search query. The factor is based on the count of “reference queries” (which are branded search queries) and on a count of independent inbound links.

This is how the modification (or ranking) is done:

  1. A count of inbound links using only “independent” links (links that are not controlled by the site being linked to).
  2. A count of the reference queries (branded search queries), which are given ranking power like a link.

Reminder: “resources” is a reference to web pages and websites.

Here is how the patent explains the part about the ranking:

“The system generates a modification factor for the group of resources from the count of independent links and the count of reference queries… For example, the modification factor can be a ratio of the number of independent links for the group to the number of reference queries for the group.”

The patent filters links so that only links not associated with the website are used. It also counts how many branded search queries are made for a web page or website and uses that count as a ranking factor (modification factor).
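The one formula the patent excerpt gives, the ratio of independent links to reference queries, is easy to express in code. The Python sketch below implements only that example ratio. The function name is mine, returning None when there are no reference queries is my own choice for the undefined case, and how the factor is then applied to the scores of the pages in the group is not covered here.

```python
from typing import Optional


def modification_factor(independent_links: int,
                        reference_queries: int) -> Optional[float]:
    """Ratio of independent links to reference queries for a group.

    This follows the single example given in the patent excerpt.
    Returning None when there are no reference queries is my assumption;
    the excerpt does not say how that case is handled.
    """
    if reference_queries <= 0:
        return None
    return independent_links / reference_queries


# Made-up counts for two hypothetical groups of resources.
factor_a = modification_factor(independent_links=200, reference_queries=400)
factor_b = modification_factor(independent_links=200, reference_queries=50)
```

With these made-up counts, the first group gets a factor of 0.5 and the second a factor of 4.0, so two groups with the same number of independent links end up with different factors depending on how often people search for them by name.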

In retrospect it was a mistake for some in the SEO industry to use this patent as “proof” for their idea about brand mentions on websites being a ranking factor.

It’s clear that “implied links” are not about brand mentions in web pages as a ranking factor. Rather, they are about brand mentions (and URLs and domains) in search queries that can be used as ranking factors.

Why This Patent Is Important

This patent describes a way to use branded search queries as a signal of popularity and relevance for ranking web pages. It’s a good signal because it’s the users themselves saying that a specific website is relevant for specific search queries. It’s also a signal that’s hard to manipulate, which may make it a clean non-spam signal.

We don’t know if Google uses what’s described in the patent. But it’s easy to understand why it could still be a relevant signal today.

Read The Patent Within The Entire Context

Patents use specific language, and it’s easy to misinterpret the words or overlook their meaning by focusing on specific sentences. The biggest mistake I see SEOs make is to remove one or two sentences from their context and then use them to claim that Google is doing something or other. This is how SEO misinformation begins.

Read my article about How To Read Google Patents to understand how to read them and avoid misinterpreting them. Even if you don’t read patents, knowing the information is helpful because it’ll make it easier to spot misinformation about patents, and there is a lot of that right now.

I limited this article to communicating what the “Ranking Search Results” patent is and what the most important points are. There are many granular details about different implementations that I don’t cover because they’re not necessary for understanding the overall patent itself.

If you want the granular details, I strongly encourage first reading my article about how to read patents before reading the patent.

Read the patent here:

Ranking search results
