We finally (sort of) know how Google Search works

Poulomi Chatterjee | 06-13 16:11

On May 5, Rand Fishkin, CEO of the marketing research firm SparkToro and an SEO expert, received an anonymous email making the wild claim of having access to API documents for Google’s Search algorithm. Given how secretive Google is about how Search works, Mr. Fishkin was immediately sceptical of these extraordinary claims. After the two exchanged several emails, Mr. Fishkin spoke to the emailer over a video call on May 24. Four days later, the source disclosed his identity: Erfan Azimi, the founder of a digital marketing agency and an SEO practitioner himself, who had plenty of mutual friends with Mr. Fishkin.

How did the leak happen?

Over the call, Mr. Azimi showed Mr. Fishkin the documents, which run to more than 2,500 pages of API documentation and contain 14,014 attributes, or API features. While it isn’t confirmed who exactly put them up, the document history showed a “yoshi-code-bot /elixer-google-api” as the origin, which suggests that the documents from Google’s own internal Content API Warehouse may have been published to the repo by accident. The code was published on March 27 and stayed up until May 7, allowing enough time for the public to pick it up.

Even though the documents didn’t explicitly say what prompts the Search algorithm to push a story up in the rankings, they laid bare a list of factors that Google Search was definitely tracking, which in itself is revelatory. The secret sauce of Google’s algorithm has been as much a black box as that of a large language model or the human mind. Company execs have protected details of how Search ranking works, going out of their way to lie about what’s important when publishing content. In doing so, they deceived the marketing professionals, publishers and content makers whose jobs largely revolve around “optimising content on Google Search.”

Mr. Fishkin shared the documents with Mike King, another SEO veteran and the CEO of the marketing agency iPullRank. Both then published their own analyses of the leak, popularising findings valuable to an industry that had been working in the dark. A lot of this involved debunking what Google employees had lied about.

What has Google lied about?

In the past, Google had on multiple occasions explicitly said that domain authority was not a factor it considered. But it turns out that Google has a feature called “siteAuthority”, even though there’s little clarity around how the metric is calculated.

Also, contrary to Google’s previous assertions that clicks aren’t used to calculate rankings, there is now solid evidence that clicks are very much a measure. During his testimony at the Department of Justice (DOJ) antitrust trial in November last year, Pandu Nayak, Google’s Vice President of Search, spoke about the NavBoost and Glue ranking systems, both of which use click data to boost, demote or reinforce a ranking in Search. Mr. Nayak revealed that Google had been employing NavBoost since 2005 and had historically used 18 months of click data. Google reps have also stated earlier that “dwell time” wasn’t a feature, but NavBoost does consider long clicks, which is basically the same thing.

Another major point is that Google may use Chrome data to determine rankings — something that they had denied earlier. Mr. King noted that Chrome appears in more than one module — one related to page quality scores has a site-level measure of views from Chrome, while another module that seems to be related to the generation of sitelinks has a Chrome-related attribute as well.

Not much is known about what exactly “twiddlers” are, but Mr. King described them as re-ranking functions. Just how important are they? Debarghya Das, a former Googler, shared on X that he had once disabled twiddlers without realising that “all of YouTube search depended on it.”

Google also stores the name of an article’s author. “This combined with the in-depth mapping of entities and embeddings showcased in these documents, it’s pretty clear that there is some comprehensive measurement of authors,” Mr. King noted.

The leak is reminiscent of what happened with AOL. Back in 2006, the web portal’s research section accidentally released a compressed file containing 20 million keyword searches by more than 6,50,000 users over a three-month period, in plain sight for everyone to see.

Google’s leak is not as egregious, but it does serve as a lesson for journalists and SEO professionals not to take the company’s word as gospel. More than a day after the leak was covered, Google did admit that the data was 100% its own but “cautioned against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information.” But do we believe them anyway?


