AWS Trainium2 Clusters to Power Apple Intelligence

by Brian Pereira | Dec 16, 2024


Benoit Dupin, senior director of machine learning and AI, Apple

At the AWS re:Invent 2024 event, an Apple executive said they were exploring a powerful cluster of AWS Trainium2 chips for model training and inferencing.

Even as Apple rolls out Apple Intelligence updates for its devices, the massive computing power required for training and inference of its large language models comes from AWS (Amazon Web Services). At the AWS re:Invent 2024 event in Las Vegas last week, AWS announced updates to its own silicon and technology for processing AI workloads, notably its Trainium2 (Trn2) processors. This also positions AWS as a formidable competitor to NVIDIA, even though AWS remains an NVIDIA partner.

Apple Intelligence is a personal intelligence technology that is now being integrated into Apple’s products with OS updates. According to Apple, the AI-based technology draws on the user’s content without allowing anyone to access their personal data. Built on top of Apple’s own LLMs, it offers a generative AI experience for Apple users on their devices.

During a keynote address at the 13th edition of AWS re:Invent in Las Vegas last week, Benoit Dupin, senior director of machine learning and artificial intelligence at Apple, said, “We work with AWS services across virtually all phases of our AI and ML lifecycle.”

To date, Apple has been using AWS services with Inferentia2 and Graviton3 processors. Apple Intelligence is powered by Apple's own LLMs, diffusion models, and adapters, which run both on devices and on servers, Dupin said.

However, to scale the infrastructure behind Apple Intelligence, Apple is exploring AWS's Trainium2 (Trn2) chips, which would be strung together in an EC2 UltraCluster containing hundreds of thousands of chips to form a massive supercomputer.

Dupin said Apple depends on AWS accelerators through its Trainium2 clusters and is in the early stages of evaluating Trn2 chips. After deployment, it expects a 50% improvement in training efficiency to support its scaling.

At re:Invent 2024, AWS announced the general availability (GA) of Trn2 instances and said Trainium3 will arrive next year. An AWS spokesperson said Trainium3 chips will be manufactured on a 3 nm process and offer twice the compute of Trainium2.

AWS Trainium chips are a family of AI chips purpose-built by AWS for AI training and inference to deliver high performance while reducing costs.

According to an AWS statement, AWS is building, together with Anthropic, an EC2 UltraCluster of Trn2 UltraServers, named Project Rainier, containing hundreds of thousands of Trainium2 chips and delivering more than five times the exaflops used to train Anthropic's current generation of leading AI models.

“Trainium2 is purpose-built to support the largest, most cutting-edge generative AI workloads, for both training and inference, and to deliver the best price-performance on AWS,” said David Brown, vice president of Compute and Networking at AWS. “With models approaching trillions of parameters, we understand customers also need a novel approach to train and run these massive workloads. New Trn2 UltraServers offer the fastest training and inference performance on AWS and help organizations of all sizes to train and deploy the world’s largest models faster and at a lower cost.”


Brian Pereira
A veteran technology editor with over 30 years of experience, Brian began his career at The Indian Express in 1994. He has since reported for premier publications including The Times of India, BW Business World, CHIP, and InformationWeek. He also produced the CeBIT and INTEROP conferences in India. Now retired, he consults for media organizations and conference companies. Write to Brian: [email protected] | LinkedIn: https://www.linkedin.com/in/pereirabrian/ | Muckrack: brian-pereira-6 | X: https://x.com/creed_digital | Substack: @brianper