Session
Techniques for Scaling Large Models with Model & Data Parallelism
Discover how to train and serve massive AI models efficiently by leveraging both model and data parallelism. In this session, we’ll explore how to partition large models across GPUs and distribute data for optimal throughput, diving deep into practical setup details and performance benchmarks.
We’ll also examine the key trade-offs, such as latency versus resource usage, and show how to tailor parallelization strategies to different AI tasks, extending beyond transformers to computer vision and other domains.
By the end, you’ll have a holistic understanding of how to design and deploy parallelized workflows that balance accuracy, speed, and infrastructure costs, enabling you to scale AI solutions effectively in real-world scenarios.
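The data-parallel half of this workflow can be sketched without any GPU framework: replicate the model on several workers, give each worker a shard of the batch, compute gradients locally, then average them with an all-reduce before applying the update. The sketch below uses a toy 1-D linear model; the names `num_workers`, `all_reduce`, and `data_parallel_step` are illustrative assumptions, not material from the session.

```python
# Minimal data-parallelism sketch: shard the batch, compute per-worker
# gradients, all-reduce (average) them, then take one optimizer step.

def grad_mse(w, xs, ys):
    # Gradient of mean squared error for the 1-D linear model y = w * x.
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def all_reduce(grads):
    # Average gradients across workers (the role NCCL/MPI plays on real GPUs).
    return sum(grads) / len(grads)

def data_parallel_step(w, xs, ys, num_workers=2, lr=0.1):
    # Each "worker" sees only its shard of the batch.
    shard = len(xs) // num_workers
    grads = [grad_mse(w, xs[i * shard:(i + 1) * shard],
                         ys[i * shard:(i + 1) * shard])
             for i in range(num_workers)]
    # All workers apply the same averaged gradient, keeping replicas in sync.
    return w - lr * all_reduce(grads)

w = 0.0
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the true weight w = 2
for _ in range(50):
    w = data_parallel_step(w, xs, ys)
print(round(w, 2))  # → 2.0
```

In a real setup the averaged-gradient step is what frameworks such as PyTorch `DistributedDataParallel` perform under the hood; model parallelism, by contrast, splits the parameters themselves across devices rather than the batch.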

Shashank Kapadia
Machine Learning Engineering | Building Scalable AI Solutions | NLP & Personalization | Ethical AI Advocate | Mentor | Writer
Sunnyvale, California, United States