Session
Techniques for Scaling Large Models with Model & Data Parallelism
Discover how to train and serve massive AI models efficiently by leveraging both model and data parallelism. In this session, we’ll explore how to partition large models across GPUs and distribute data for optimal throughput, diving deep into practical setup details and performance benchmarks.
We’ll also examine the key trade-offs, such as latency versus resource usage, and show how to tailor parallelization strategies to different AI tasks, extending beyond transformers to computer vision and other domains.
By the end, you’ll have a holistic understanding of how to design and deploy parallelized workflows that balance accuracy, speed, and infrastructure costs, enabling you to scale AI solutions effectively in real-world scenarios.
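The data-parallel half of this workflow can be sketched without any GPU framework: replicate the model on several workers, give each worker a shard of the batch, compute gradients locally, then average them with an all-reduce before applying the update. The sketch below uses a toy 1-D linear model; the names `num_workers`, `all_reduce`, and `data_parallel_step` are illustrative assumptions, not material from the session.

```python
# Minimal data-parallelism sketch: shard the batch, compute per-worker
# gradients, all-reduce (average) them, then take one optimizer step.

def grad_mse(w, xs, ys):
    # Gradient of mean squared error for the 1-D linear model y = w * x.
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def all_reduce(grads):
    # Average gradients across workers (the role NCCL/MPI plays on real GPUs).
    return sum(grads) / len(grads)

def data_parallel_step(w, xs, ys, num_workers=2, lr=0.1):
    # Each "worker" sees only its shard of the batch.
    shard = len(xs) // num_workers
    grads = [grad_mse(w, xs[i * shard:(i + 1) * shard],
                         ys[i * shard:(i + 1) * shard])
             for i in range(num_workers)]
    # All workers apply the same averaged gradient, keeping replicas in sync.
    return w - lr * all_reduce(grads)

w = 0.0
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the true weight w = 2
for _ in range(50):
    w = data_parallel_step(w, xs, ys)
print(round(w, 2))  # → 2.0
```

In a real setup the averaged-gradient step is what frameworks such as PyTorch `DistributedDataParallel` perform under the hood; model parallelism, by contrast, splits the parameters themselves across devices rather than the batch.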

Shashank Kapadia
Machine Learning Engineering | Building Scalable AI Solutions | NLP & Personalization | Ethical AI Advocate | Mentor | Writer
Sunnyvale, California, United States