Tuesday, November 5, 2024

Google’s wrong answer to the threat of AI: default to not indexing content | John Naughton

Once upon a time, a very long time ago in internet years – 1998 – Google was truly great. A couple of lads at Stanford University in California had the idea to build a search engine that would crawl the world wide web, create an index of all the sites on it and rank them by the number of inbound links each had from other sites. In other words, they built a kind of automated peer review for the web, and it came as a revelation to those of us who had been struggling for yonks with AltaVista and other search engines.

The only problem was that Google initially didn’t have a business model (partly because the founders didn’t like advertising), but in 2000 it came up with one. It involved logging everything that users did on the platform and analysing the resulting data stream so that its real customers – advertisers – would know what users might be interested in.

The model came to be called surveillance capitalism and Google profited mightily from it. But after a while the process known as enshittification inexorably set in, as it has with every platform that engages in that particular kind of capitalism. It’s a process that goes like this: first, you offer high-quality services to attract users (as Google did), then you shift to favour business customers (thereby increasing profitability), before finally focusing on maximising profits for shareholders at the expense of users and business customers alike.

As enshittification unfolds, the experience of a platform’s hapless users steadily and inexorably deteriorates. But most of them put up with it because of inertia and the perceived absence of anything better. The result is that, even as Google steadily deteriorated, it remained the world’s dominant search engine, with a monopolistic hold in many markets across the world; “Google” became a verb as well as a noun and “Googling” is now a synonym for online searching in all contexts.

The arrival of ChatGPT and its ilk threatens to upend this profitable applecart. For one thing, it definitely disrupts search behaviour. Ask a chatbot such as Perplexity.ai a question and it gives you an answer. Search for the topic on Google and it gives you a list of websites (including ones from which it derives revenue) on which you then have to click in order to make progress. For another, if users shift to chatbots for information, they won’t be exposed (at least for now) to lucrative search ads, which account for a significant chunk of Google’s revenue. And over time, experience with chatbots will change people’s expectations about searching for information online.

Overhanging all this, though, is the fact that generative AI is already flooding the web with AI-generated content that is good, bad and indifferent. All of a sudden, Google’s mission – “to organise the world’s information and make it universally accessible” – looks like a much more formidable task in a world in which AI can generate infinite amounts of humanlike content. How does automated peer review work in that environment? How do you separate wheat from automated chaff?

One intriguing clue to how Google may be thinking about the problem surfaced last week. Vincent Schmalbach, a respected search engine optimisation (SEO) expert, thinks that Google has decided that it can no longer aspire to index all the world’s information. That mission has been abandoned: instead, Google search will be governed by an acronym: EAT – expertise, authoritativeness, trustworthiness.

“Google is no longer trying to index the entire web,” writes Schmalbach. “In fact, it’s become extremely selective, refusing to index most content. This isn’t about content creators failing to meet some arbitrary standard of quality. Rather, it’s a fundamental change in how Google approaches its role as a search engine.” The default setting from now on will be not to index content unless it is genuinely unique, authoritative and has “brand recognition”.

“They might index content they perceive as truly unique or on topics that aren’t covered at all,” says Schmalbach. “But if you write about a topic that Google considers even remotely addressed elsewhere, they likely won’t index it. This can happen even if you’re a well-respected writer with a substantial readership.”

If this is indeed what Google is up to, then you have to wonder what its leaders have been smoking. Among other things, they’re proposing to build machines that can sensibly assess qualities such as expertise, authoritativeness and trustworthiness in an online world where just about anything goes. Could someone please take them aside and remind them that a tech company tried something like this way back in 1995 and came unstuck? It was called Yahoo! Remember it? Me neither.

What I’ve been reading

Waugh report
Putting the Boot In is a lovely essay by Robert Hutton on British journalism, as satirised by Evelyn Waugh and embodied by Boris Johnson.

Cause and effect
Does Social Media Cause Anything? is a fabulous post by Kevin Munger on the Crooked Timber blog.

Dream machines
Helen Beetham’s Chips With Everything is a scorching Substack post on Tony Blair’s fantasies about AI.
