Monday, December 23, 2024

Beyond the gen AI hype: Google Cloud shares key learnings



Is bigger always better when it comes to large language models (LLMs)? 

“Well, the answer is quite simply yes and no,” Yasmeen Ahmad, managing director of strategy and outbound product management for data, analytics and AI at Google Cloud, said onstage at VB Transform this week. 

LLMs do get better with size — but not indefinitely, she pointed out. Huge models with a large number of parameters can be outperformed by smaller models trained on domain and context-specific information. 

“That indicates that data is at the cornerstone, with domain-specific industry information giving models power,” said Ahmad. 




This allows enterprises to be more creative, efficient and inclusive, she said. They can tap into data that they’ve never been able to access before, “truly reach” all corners of their organization and enable their people to engage in all new ways. 

“Gen AI is pushing the boundaries of what we could even dream machines could create, or humans could imagine,” said Ahmad. “It truly is blurring the lines of technology and magic — perhaps even redefining what magic means.”

Enterprises need a new AI foundation

Successfully adapting models to a specific enterprise domain comes down to two techniques: fine-tuning and retrieval-augmented generation (RAG), said Ahmad. Fine-tuning teaches LLMs “the language of your business,” while RAG gives the model a real-time connection to data, whether in documents, databases or elsewhere. 

“That means in real-time, it can provide accurate answers which are really important for financial analytics, risk analytics and other applications,” said Ahmad. 
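As a rough illustration of the RAG idea described above, the following minimal sketch retrieves the most relevant document at question time and places it in the prompt. Everything in it is an assumption made for illustration: the sample documents, the function names and the simple word-overlap scoring are hypothetical, and production systems typically use vector search and a hosted model rather than this toy retriever.

```python
import re
from collections import Counter

# Hypothetical in-memory "enterprise" documents. A real system would query
# documents or databases at request time instead of a hard-coded list.
DOCUMENTS = [
    "Q3 revenue for the retail segment was 4.2 million dollars.",
    "The credit risk policy caps single-counterparty exposure at 5 percent.",
    "Employee onboarding takes place on the first Monday of each month.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count how many query tokens also appear in the document."""
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(doc))
    return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)


def retrieve(query, docs, k=1):
    """Return the k documents with the highest word overlap with the query."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Assemble a prompt that grounds the question in the retrieved context."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("What is the credit risk exposure cap?", DOCUMENTS))
```

The prompt returned by `build_prompt` would then be sent to whichever model the enterprise uses; that call is left out here because it depends on the provider.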

Similarly, the true power of LLMs is in their multimodal capabilities, or their ability to operate on video, image, text documents and all other types of data. This is critical, she noted, as typically 80 to 90% of data in an enterprise is multimodal. 

“It’s not structured, it’s documents, it’s images, it’s videos,” said Ahmad. “So having an LLM to be able to tap into that data is super valuable.” 

In fact, a Google study showed a 20 to 30% improvement in customer experience when multimodal data was used. Enterprises had an enhanced ability to hear and understand customer sentiment, and the model was able to bring together data on product performance and market trends. 

“To put it simply, it’s not about simple pattern recognition anymore,” said Ahmad. “LLMs can truly understand the complexity of our organizations by having access to all data.” 

Traditional organizations struggle with legacy data foundations that were never built to handle multimodal data, but the future of AI and business data demands a new kind of AI foundation, she pointed out.

AI that is conversational, a ‘personal data sidekick’

The ability to engage in question-answer interactions is another critical component of successful LLMs, Ahmad emphasized. 

But, while it’s “super alluring to be able to chat with your business data, it’s not so easy,” she noted.

Imagine asking a colleague the forecasted sales for the next quarter for new products. If you don’t give them context, or if they don’t understand the fiscal quarters or even the new products themselves, they are going to give you a “vague and unhelpful” answer, said Ahmad. The same is true for LLMs — they must be given semantic context and metadata so they can provide specific and accurate answers.
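To make the point about semantic context concrete, here is a small sketch of attaching business definitions to a question before it reaches a model. The fiscal calendar, the product names and all function names are hypothetical; the only idea taken from the talk is that terms such as "next quarter" and "new products" need to be spelled out.

```python
from datetime import date


def fiscal_quarter(day, fiscal_year_start_month):
    """Return the fiscal quarter (1-4) that contains `day`."""
    months_into_year = (day.month - fiscal_year_start_month) % 12
    return months_into_year // 3 + 1


def next_fiscal_quarter(day, fiscal_year_start_month):
    """Return the fiscal quarter that follows the one containing `day`."""
    return fiscal_quarter(day, fiscal_year_start_month) % 4 + 1


def add_semantic_context(question, today, fiscal_year_start_month, new_products):
    """Prefix the question with the definitions a model would otherwise guess."""
    current = fiscal_quarter(today, fiscal_year_start_month)
    upcoming = next_fiscal_quarter(today, fiscal_year_start_month)
    lines = [
        f"Today's date: {today.isoformat()}",
        f"Current fiscal quarter: Q{current}",
        f"'Next quarter' means: Q{upcoming}",
        "New products: " + ", ".join(new_products),
        f"Question: {question}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed setup: a fiscal year starting in February, two made-up products.
    print(
        add_semantic_context(
            "What are the forecasted sales for next quarter for new products?",
            today=date(2024, 7, 10),
            fiscal_year_start_month=2,
            new_products=["Widget A", "Widget B"],
        )
    )
```

With a February fiscal-year start, July falls in the second fiscal quarter, so "next quarter" resolves to the third rather than to the calendar quarter a model might otherwise assume.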

Similarly, it’s important that models are conversational. “As humans, when we do analysis, or we ask questions, we typically go back and forward in a dialog, and we call on and provide additional context until we get to an answer,” said Ahmad. It’s exactly the same for LLMs: They need to be able to have a coherent conversation. 

As such, the industry is moving away from isolated, single-shot, one-question interactions to “the next generation of conversational AI.” This is more than a chatbot: “Think of it more like a personal data sidekick,” she said. 

It is a “tireless worker” that interacts and is able to ask questions and engage in a chain of thought. It also provides thorough query transparency, so human users know where the results came from and can trust them. “We’re seeing a quantum leap, agentic AI that can actually make decisions, take action and work towards a goal,” said Ahmad, noting that scientists are teaching these models to become “seriously clever.”

LLMs are beginning to mimic human brains, notably in the way they can break things into subtasks, and they have the ability to be “strategic thinkers,” understand cause and effect and learn honesty.

All of this is being done quicker and quicker, with real-time capabilities improving all the time, said Ahmad. “The future is here and the future is spawning new breeds of business,” she said. “We are at the beginning of what this technology can enable.” 
