The advanced Trillium TPUs will deliver a nearly five-fold increase in peak compute performance per chip over the previous-generation TPU v5e, and feature the third generation of the specialised SparseCore accelerator


Google unveils the company’s sixth-generation custom AI-specific hardware TPU dubbed Trillium. (Credit: Hameltion/Wikimedia Commons)

Google has introduced the company’s sixth-generation custom artificial intelligence (AI)-specific hardware tensor processing unit (TPU), dubbed Trillium.

According to the search engine major, the Trillium TPUs will deliver a nearly five-fold increase in peak compute performance per chip compared with the previous-generation TPU v5e.

The company has also doubled the high bandwidth memory (HBM) capacity and bandwidth, as well as the interchip interconnect (ICI) bandwidth, over TPU v5e.

In addition, Trillium features a third-generation SparseCore, a specialised accelerator designed to process the ultra-large embeddings common in advanced ranking and recommendation workloads.

Google ML, systems, and cloud AI vice president Amin Vahdat said: “Trillium achieves 4.7X peak compute per chip compared to TPU v5e. To achieve this level of performance, we’ve expanded the size of matrix multiply units (MXUs) and increased the clock speed.

“Additionally, SparseCores accelerate embedding-heavy workloads by strategically offloading random and fine-grained access from TensorCores.”

The latest Google Cloud TPU will enable the next wave of foundation models to be trained faster and served with lower latency and at reduced cost.

Trillium TPUs are also more than 67% more energy-efficient than TPU v5e, said Google.

Furthermore, Trillium scales up to 256 TPUs within a single high-bandwidth, low-latency pod.

Beyond pod-level scalability, Trillium TPUs can scale to hundreds of pods by leveraging multislice technology and Titanium intelligence processing units (IPUs).

This links tens of thousands of chips into a building-scale supercomputer interconnected by a multi-petabit-per-second datacentre network.

Trillium TPUs also form part of Google Cloud’s AI Hypercomputer, a supercomputing architecture tailored specifically for AI workloads.

The sixth generation of Google Cloud TPUs will be made available later this year.

Last month, Google revealed its plans to combine the company’s device, software, and platforms architecture (DSPA) and platforms and ecosystems (P&E) teams to create a new product area (PA) dubbed Platforms & Devices.