Wednesday, October 16, 2024

‘AI-Mazing Tech-Venture’: National Archives Pushes Google Gemini AI on Employees

In June, the U.S. National Archives and Records Administration (NARA) gave employees a presentation and tech demo called “AI-mazing Tech-venture” in which Google’s Gemini AI was presented as a tool archives employees could use to “enhance productivity.” During a demo, the AI was queried with questions about the John F. Kennedy assassination, according to a copy of the presentation obtained by 404 Media using a public records request. 

In December, NARA plans to launch a public-facing AI-powered chatbot called “Archie AI,” 404 Media has learned. “The National Archives has big plans for AI,” a NARA spokesperson told 404 Media. “It’s going to be essential to how we conduct our work, how we scale our services for Americans who want to be able to access our records from anywhere, anytime, and how we ensure that we are ready to care for the records being created today and in the future.”

Chat logs from the presentation show that National Archives employees are concerned about the idea that AI tools will be used in archiving, a practice that is fundamentally about accurately recording history.

One worker who attended the presentation told 404 Media, “I suspect they’re going to introduce it to the workplace. I’m just a person who works there and hates AI bullshit.”

The presentation was given about a month after the National Archives banned employees from using ChatGPT, which the agency said posed an “unacceptable risk to NARA data security,” and cautioned employees that they should “not rely on LLMs for factual information.”

“Google Gemini is a versatile tool that can help users save time, improve their productivity, and achieve better results in their work,” one of the slides says. “Think of Gemini as a co-worker that can help generate ideas and review content that you have already created.”

Portions of the slides that suggest specific use cases are redacted, but the presentation recommends using it for “writing assistance, data visualization, meeting summaries, and idea generation.” 

“Generate text, translate languages, and summarize documents to help users communicate more effectively,” one of the slides reads. 

During the presentation, which was given over Zoom, employees expressed many concerns about the technology in the chat. The National Archives refused to release video of the presentation, citing privacy concerns, so it is not clear whether the presenter answered any of the questions or how they were answered. All names of employees asking questions were redacted by the National Archives.

“How would the public know that they are receiving a response from an actual archivist and not from generative AI?” one employee asked. “Would NARA disclose what aspects of reference are generated from AI?” Two other employees followed up and said that they are also concerned about this, and one said “I worry it might lower trust in the institution if not properly disclosed.”

According to the chat logs, a live demonstration of Google’s Vertex AI was given in which it pretended to be an “expert archivist” and was asked questions about the John F. Kennedy assassination. Vertex AI allows organizations to train LLMs on their own datasets. In this case, the AI was trained on National Archives data. These questions included “Who killed Kennedy?” and “What was the CIA’s involvement in the assassination of Kennedy?” 

“Why is the Generative AI calling itself an ‘expert archivist?’” an employee asked. “It’s called ‘expert archivist’ because that is the prompt we gave it,” someone involved with the demo said. 

“I have a serious problem with the ‘expert archivist’ title,” another employee said. 

“Same here. If we have a disclaimer saying the generative AI can make things up and yet call it an expert archivist on the same tier as actual human experts…,” another chimed in.

“Ask what happened to Kennedy’s brain,” an employee said at one point (this is actually, famously, a mystery).
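For readers unfamiliar with the tooling: in Google’s Vertex AI SDK, a persona like “expert archivist” typically comes from a system instruction supplied alongside the model, not from anything in the underlying records. The snippet below is a minimal, hypothetical sketch of that kind of setup; the project ID, model name, and prompt text are illustrative rather than NARA’s actual configuration, and grounding the model on an agency’s own data would require additional configuration not shown here.

```python
# Minimal, hypothetical sketch of prompting a Vertex AI model as an
# "expert archivist" via a system instruction. Project ID, model name,
# and prompt text are illustrative; this is not NARA's configuration.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="example-archives-project", location="us-central1")

model = GenerativeModel(
    "gemini-1.5-pro",
    # The persona comes from this instruction, not from the records themselves.
    system_instruction="You are an expert archivist. Answer questions using the archive's records.",
)

response = model.generate_content("What was the CIA's involvement in the assassination of Kennedy?")
print(response.text)
```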

Archives employees seemed to have serious concerns with the demo and the presentation.

At one point, an employee said “classified data cannot be put through cloud AI.” Another employee asked “are we able to opt out any data on google drive or our emails from Gemini?” 

Another said “Do you have any concerns that this product will malfunction similarly to how Google Search AI has recently?”

Employees also asked “How do you plan to ensure NARA isn’t drawn into any copyright infringement issues by using AI models that are trained on web content? There have been issues with this happening with ChatGPT scraping pirated books, for instance.”

One employee asked “Is this demo in prep for rolling this out to employees NARA wide?” 

Someone responded: “we are doing a pilot right now of this technology to determine if we should move forward with this as an agency.” 

“AI is meant to generate something that sounds like an answer—there are a plethora of cases of it spouting things that are completely wrong with an authoritative tone,” another said. “How much are we going to be expected to rely on this in the future?”

Three separate employees expressed concern about the environmental aspects and carbon footprint of generative AI. 

In an email I obtained earlier this year, the National Archives told employees it prefers Google Gemini and Microsoft Copilot to ChatGPT because they offer “a more controlled environment.” 

A NARA spokesperson told 404 Media that the agency has big plans for AI, which include launching a public-facing AI-powered chatbot called “Archie.”

“We are exploring how AI can help us increase access to our holdings around the country. We currently have a handful of AI pilot programs aimed at improving our service to the public while fostering public trust and confidence,” they said. “Ultimately, we want the user to be able to easily find the documents they are looking for in our enormous trove of permanent federal records. Whether you are a veteran, a family historian, an educator, a researcher, or a student, our goal is to connect you with the records as seamlessly as possible.”

NARA plans to tell users, essentially, that Archie may give people incorrect information.

“Our ArchieAI tool will directly address questions of accuracy and disclosure,” the spokesperson said. The specific disclosure will say: “Accuracy: AI-generated summaries and results may not reflect the opinion of NARA and are not guaranteed to be accurate. Historical records often contain factual errors or offensive language, which ArchieAI may repeat or use.”

The Biden administration previously directed federal agencies to study AI and create policies for its use. The National Archives also recently gave a presentation on AI to the International Council on Archives.

In that presentation, Carol Lagundo, director of digital partnerships and outreach at NARA, announced Archie and also explained that the National Archives had used AI to “improve access to Revolutionary War pension files,” which are a set of more than 2.5 million pages of 18th and 19th century handwritten records about Revolutionary War soldiers. 

“At the current rate, it will take until 2046 to completely transcribe this series with just humans!” Lagundo’s presentation says. She said that an AI transcript of the dataset was 90 percent correct and that NARA intends to share these transcripts with the public in its official catalog in November or December.

She added that the National Archives is developing a “prototype AI research assistant” called Archie AI, powered by Google Vertex AI.

“You’ll be able to ask Archie a question and receive AI-generated summaries with footnotes and links to the digitized documents in our catalog,” she said. “We’re hoping to roll it out in a few months.”

“As you can tell, ArchieAI is a cornerstone of our AI learning,” the NARA spokesperson told 404 Media.
