IBM has announced its Telum II processor, with an on-chip AI accelerator shared across cores, and, perhaps surprisingly, a “Spyre” AI accelerator card capable of running generative AI large language models (LLMs). The new chips will ship in the yet-to-be-announced next-generation IBM Z.
Three years ago, IBM introduced Telum, a mainframe processor with an on-board AI accelerator shared across eight cores. IBM is set to unveil its upcoming Telum II processor and Spyre AI accelerator at the annual Hot Chips conference on the Stanford campus. Here's a look at this ever-evolving platform, which processes 70% of the world's financial transactions.
More I/O, more caching, and more…
IBM engineers must have determined that their z16 mainframe processors are I/O bound. To what extent is this due to Telum I's new on-board AI accelerator? We don't know. But IBM deemed it important enough to allocate significant die area to solving this problem and eliminating the I/O bottleneck in mainframe throughput. The new IBM chips feature higher frequencies, higher memory capacity, 40% more cache, updated integrated AI accelerator cores, and a coherently connected data processing unit (DPU). The DPU accelerates complex I/O protocols for networking and storage, simplifying system operations and improving performance.
Further improvements in on-chip and off-chip AI acceleration
IBM is taking an ensemble approach to AI processing on the mainframe. First, IBM upgraded the on-board AI processors that handle routine tasks such as in-transaction fraud detection, payments, compliance, and claims processing. Z clients are getting a lot of value from these on-board AI processors, but they also need to prepare for a wave of opportunities to apply large language models (LLMs) in transaction processing.
Telum II On-chip AI Processor (Image: IBM)
To process LLMs on the mainframe, IBM adapted designs from the IBM Research AI Hardware Center to create the compute- and memory-intensive Spyre processors for AI acceleration on IBM Z. While the AI acceleration engines in the Telum chips can handle traditional machine learning, Z applications can deploy the latest LLMs on these new Spyre accelerators without compromising the security and availability of the IBM Z mainframe.
Spyre features up to 1TB of memory spread across eight cards working together in a typical I/O drawer, supporting full mainframe AI model workloads while consuming less than 75W per card. Each chip has 32 compute cores supporting int8, fp8, and fp16 data types for low-latency inference and high-throughput AI applications.
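A quick back-of-the-envelope check of those numbers (a sketch only: the even per-card memory split is an inference from IBM's stated totals, not a published per-card spec):

```python
# Rough sizing for a Spyre card cluster, based on the figures above:
# up to 1 TB of memory across eight cards, under 75 W per card.
TOTAL_MEMORY_GB = 1024       # "up to 1TB" across the I/O drawer
CARDS_PER_DRAWER = 8
MAX_POWER_PER_CARD_W = 75

# Assumes an even split across cards (an inference, not an IBM spec).
memory_per_card_gb = TOTAL_MEMORY_GB / CARDS_PER_DRAWER
max_drawer_power_w = CARDS_PER_DRAWER * MAX_POWER_PER_CARD_W

print(f"Memory per card: {memory_per_card_gb:.0f} GB")      # 128 GB
print(f"Max AI power per drawer: {max_drawer_power_w} W")   # 600 W
```

At roughly 128GB per card and a 600W ceiling for the whole drawer, the design trades peak throughput for a power envelope far below that of a GPU server.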
New AI Accelerator for Z (Image: IBM)
“Our robust, multi-generational roadmap enables us to stay ahead of technology trends, including the growing demand for AI,” said Tina Tarquinio, vice president, product management, IBM Z and LinuxONE. “Telum II processors and Spyre accelerators are built to deliver high-performance, secure and power-efficient enterprise computing solutions. After years of development, these innovations will be introduced in our next-generation IBM Z platform, enabling clients to take advantage of LLM and generative AI at scale.”
The Telum II processors and IBM Spyre accelerators are manufactured by Samsung Foundry and are built on a high-performance, power-efficient 5nm process node.
Conclusion
AI will transform nearly every workload in the data center, including those that run on the mainframe. IBM developed the new Telum processor to enable machine learning on Z, and the new Spyre accelerator to make LLMs part of the mainframe environment. This breathes new life into Z by eliminating the need to move data from Z to GPU-powered servers outside the reliability and security of the mainframe.
Disclosure: This article expresses the opinions of the author and should not be taken as advice to buy or invest in any of the companies mentioned. My company, Cambrian-AI Research, is fortunate to have many semiconductor companies and numerous investors as clients, including BrainChip, Cadence, Cerebras Systems, D-Matrix, Esperanto, Groq, IBM, Intel, Micron, NVIDIA, Qualcomm, Graphcore, SiMa.ai, Synopsys, Tenstorrent, and Ventana Microsystems. We have no investment in any of the companies mentioned in this article. For more information, please visit our website: https://cambrian-AI.com.