
Ironwood: The fastest path from model training to planet-scale inference


Today’s frontier models, including Google’s Gemini, Veo, and Imagen and Anthropic’s Claude, train and serve on Tensor Processing Units (TPUs). For many organizations, the focus is shifting from training these models to powering useful, responsive interactions with them. Constantly shifting model architectures, the rise of agentic workflows, and near-exponential growth in demand for compute define this new age of inference. In particular, agentic workflows that require orchestration and tight coordination between general-purpose compute and ML acceleration are creating new opportunities for custom silicon and vertically co-optimized system architectures.

We have been preparing for this transition for some time, and today we are announcing the availability of three new products built on custom silicon that deliver exceptional performance, lower costs, and new capabilities for inference and agentic workloads:

  • Ironwood, our seventh generation TPU, will be generally available in the coming weeks. Ironwood is purpose-built for the most ...

Source: Google Cloud blog.