Monday, November 25, 2024

Google’s Trillium TPU achieves unprecedented performance increase for AI workloads

AI Hype Train: Tensor Processing Units are specialized ASIC chips designed to accelerate machine learning algorithms. Google has been employing TPUs since 2015 to enhance its ML-based cloud services, and the company is now fully embracing the latest TPU generation for an even more efficient and powerful AI accelerator platform.

At this year’s I/O developer conference, Google announced its “most advanced” TPU yet. Trillium, the machine learning algorithm accelerator, represents the culmination of over a decade of research on specialized AI hardware and is a fundamental component needed to construct the next wave of AI foundation models.

Google explained that the first TPU was developed in 2013, and without TPUs, many of the company’s most popular services would not be possible today. Real-time voice search, photo object recognition, language translation, and advanced AI models like Gemini, Imagen, and Gemma all benefit from TPUs.

Like its predecessors, Trillium has been designed from the ground up to accelerate neural network workloads. Google’s 6th gen TPU delivers 4.7x the peak performance per chip of the previous TPU generation (v5e), thanks to the adoption of larger matrix multiply units and a higher clock speed.

Trillium chips are equipped with third-generation SparseCore, a dedicated accelerator for processing “ultra-large embeddings” common in advanced ranking and recommendation workloads. Additionally, the new TPUs boast doubled High Bandwidth Memory capacity and bandwidth, along with double interconnect bandwidth compared to the v5e generation.
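To put the quoted per-chip figures side by side, here is a minimal back-of-the-envelope sketch in Python. It uses only the relative numbers cited above (4.7x peak performance, 2x HBM bandwidth, 2x interconnect bandwidth) with TPU v5e as the baseline. The constant and function names are hypothetical, and the sketch assumes the multipliers can be compared directly, which Google has not stated.

```python
# Relative per-chip figures quoted in the article, with TPU v5e as the 1.0 baseline.
# The names below are illustrative only; they are not part of any Google API.
TRILLIUM_PEAK_COMPUTE = 4.7   # "4.7x peak performance per chip"
TRILLIUM_HBM_BANDWIDTH = 2.0  # "doubled" High Bandwidth Memory bandwidth
TRILLIUM_INTERCONNECT = 2.0   # "double" interconnect bandwidth


def compute_to_bandwidth_ratio(compute: float, bandwidth: float) -> float:
    """Return peak compute per unit of bandwidth, relative to the v5e baseline."""
    return compute / bandwidth


hbm_ratio = compute_to_bandwidth_ratio(TRILLIUM_PEAK_COMPUTE, TRILLIUM_HBM_BANDWIDTH)
interconnect_ratio = compute_to_bandwidth_ratio(TRILLIUM_PEAK_COMPUTE, TRILLIUM_INTERCONNECT)

print(f"Peak compute per unit of HBM bandwidth vs. v5e: {hbm_ratio:.2f}x")
print(f"Peak compute per unit of interconnect bandwidth vs. v5e: {interconnect_ratio:.2f}x")
```

Under these assumptions, peak compute grows faster than either bandwidth figure, so a workload would need to perform more arithmetic per byte moved to approach the quoted peak. This is an illustration of the ratios only, not a benchmark or a prediction of real-world speedups.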

Despite being much more powerful and capable, Trillium is also more sustainable. Google states that the 6th gen TPUs are over 67 percent more energy efficient than TPU v5e. The company listed some of the advanced AI-based capabilities Trillium is expected to provide to customers, such as the interactions between humans and computers that Essential AI is working on.

Trillium will also provide AI acceleration to Nuro, a company working on AI models for robots, Deep Genomics for advanced drug discovery, and Deloitte, which aims to “transform” businesses through generative AI. Google DeepMind will also use Trillium TPUs to train future versions of Google’s own foundation models in the Gemini line.

Trillium is part of the AI Hypercomputer, a supercomputer architecture Google has designed for managing the most advanced AI workloads. In the AI Hypercomputer, a TPU-based optimized infrastructure and open-source software frameworks will work together to train (and serve) AI models of the future.

Third-party companies will be able to access new Trillium-based cloud instances sometime later this year.
