AI-Powered Search and the Rise of Google’s “Concierge Wikipedia”

Sundar Pichai, CEO of Alphabet Inc. and its subsidiary Google, in New York, April 18, 2024. Shutterstock

On May 14th, Google announced the rollout of its new AI-powered search tool, “AI Overviews,” a Gemini model customized for Google Search. The new search experience promises to generate dynamic content tailored to individual queries, offering more diverse and comprehensive results; according to Google, it will reach over a billion people by the end of the year. As an early experimental user of the technology, and an information professional with a background in information literacy, I have been anticipating this announcement and watching the discussion unfold. Much of that conversation explores the potential of AI Overviews to significantly impact online commerce by redirecting web traffic, and the economic impact may well be significant. But what most of the discussion has ignored thus far is the far more consequential shift this will create in the information ecosystem.

While any information discovery tool shapes how users interact with information, the way AI Overviews invite iterative user questions, generate on-the-spot answers, and display information and source material differently stands to redefine how users access and understand that information. AI Overviews appear at the top of the search results page, above the traditional PageRank list of source hyperlinks and previews, and are customized to each user’s search. The technology promises enhanced convenience while raising concerns about whether users will still come to intuit how information is created, disseminated, and accessed – core competencies of information literacy.

Given growing research and policy interest in how AI will shift information landscapes, affect information diets, and influence global democracy, keeping a keen eye on how people use such search tools in a variety of contexts will be paramount to monitoring the health of online information ecosystems. This matters not just because of that research and policy interest, but because these and related tools will change how we access and use information – with unintended and unanticipated consequences that we may be unprepared to monitor, let alone mitigate.

Seductive synthesis

AI Overviews are alluring: they summarize and synthesize search results in seconds. That convenience often hides the remarkable reality that we are witnessing the birth of a completely new information object – a kind of tertiary source, a concierge Wikipedia – meant to respond to increasingly complex user queries with advanced reasoning capabilities. Unlike older forms of search, these tertiary responses are generated with far less emphasis on the primary and secondary sources from which the answer is drawn.

Synthesis, like analysis, is a higher-order cognitive skill. With AI Overviews, the labor of synthesis is offloaded with the promise that the time and energy saved can be spent on other, more meaningful tasks – though what those tasks are remains to be seen. Over-dependence on such systems risks “enfeeblement,” undermining the ability to synthesize and analyze for oneself. And the risk is not only cognitive: moral decision-making processes and other ethical skills may also be affected.

At the end of the day, information evaluation comes down to a tricky exercise in navigating and assessing uncertainty and complexity. A mass deskilling in this domain – where information consumers increasingly cede value-based decisions to automated systems, effectively outsourcing critical thinking to AI – will almost certainly erode the public’s ability to engage in deep and nuanced understanding of the world around them.

Synthetic mediation

AI Overviews function as a semi-translucent veil between user and source, a relationship that was traditionally more direct. Sure, previous iterations of Google Search did this too, providing highlighted previews within search results that responded to a question. But while Google Search’s algorithm was honed to predict results based on user behavior and other data points (including advertising), AI Overviews go further, guiding users in making sense of the information gathered. They act more as a mediator, shaping how users interact with search content.

And by helping users understand that information, they also have the power to subtly shape what users think about it. The power these overviews wield is enormous and will need to be understood from within diverse communities. This includes understanding the effects of increased reliance on convenient but potentially superficial overviews, their potential to shape widespread understanding of sensitive topics, and their capacity to further polarize online communities already driven by distrust of mainstream media.

De-contextualized information

AI Overviews emphasize the generated answer over individual sources, pulling those sources out of their original contexts and amalgamating them. Sources sit behind expandable carets or in a horizontal carousel, and because they are embedded within the answer itself, the overviews carry a sense of completeness that subtly discourages deeper exploratory behaviors like comparing and contrasting sources or personally corroborating information against primary source material. In this sense, AI Overviews privilege summarization at the expense of exploration and browsing.

Information professionals have long tried (and failed) to teach users that the first search results aren’t always the best; AI Overviews are both the first search result and the machine-determined consensus of many results – though in reality they can never represent all of them. While AI Overviews seem comprehensive, given the well-documented biases of any LLM system and an already unequally representative information landscape on the web, they run the risk of overlooking diverse perspectives, particularly those of historically marginalized communities.

Whether because generated overviews lack diverse information or because exploring and contextualizing multiple sources takes more effort than consulting the AI-generated answer, representation suffers – not only in the answers themselves but in user engagement with diverse sources. This also raises questions of source attribution if users rely on and cite only the generated answer rather than the source material from which it was drawn.

Artificial objectivity and reliability

Google explained that in its experimental version of AI Overviews it “finetuned the model to provide objective, neutral responses that are corroborated with web results.” This was done so the model would not take on, or reflect, any persona that might influence users on an emotional level. But a disinterested, neutral tone has its own emotional and logical appeal, and it can obscure whatever biases are inherent in the system. The impartial presentation is very convincing, especially when the answers are useful and generally reliable. Yet all LLMs still hallucinate, confidently generating incorrect or nonsensical information – a further reminder that their outputs demand critical evaluation.

The potential for hallucinations, along with incomplete or biased training data, makes neutrality impossible in any human-made system, even as the performance of a neutral tone makes it seem quite the opposite. Source material, even data-driven content, can mislead depending on how it is packaged and presented. Users often seek quick markers of reliability, and the objective presentation of AI Overviews provides a veneer of it while prioritizing convenience and snap judgments over thorough analysis. AI Overviews are designed to promote quick looks over in-depth exploration, and the neutral tone ensures that users don’t linger too long anywhere or think too hard about anything, because the results feel “true.”

Recommendations

Given the potential for major impacts on user behavior and the resulting ramifications for information ecosystems, I recommend the following:

  1. Researcher access to robust user data will be crucial for understanding how information ecosystems are evolving and how transformed information consumption patterns affect everything from the economy to democracy. Making such data available for widespread scholarly study will be paramount to building ethical search interfaces that prioritize transparency and give users tools to access and critically evaluate original source materials.
  2. Relatedly, schools need to develop robust curricula that teach students how to search critically with information discovery tools that use generative AI. Information literacy has been unevenly taught thus far, so developing such curricula quickly will be pivotal in helping users learn to use this technology responsibly.
  3. Google should continually monitor and evolve its interface design to ensure users understand the limitations of AI Overviews. It should also analyze user data with an eye toward helping users not only get quick answers but also practice thoughtful online search behaviors.

AI Overviews are only a small part of how generative AI will affect online information ecosystems. Because this is just the beginning, having meaningful conversations early about how users are adapting their behaviors, and about the consequences for our online information landscapes, is essential – so that we can continue to understand how these systems work and their evolving effects on the people who use them.
