Leveraging Nvidia's Blackwell for Efficient Inference of Large Language Models

As large language models (LLMs) continue to grow in size and complexity, efficient inference becomes paramount. Models like Llama 3.3 405B and DeepSeek-R1, with their hundreds of billions of parameters, pose significant challenges in terms of computational resources and energy consumption. In this talk, we will explore how Nvidia's latest GPU architecture, Blackwell, is designed to address these challenges.

Abhishek Kumar Gupta

Sr. Staff Engineer @ NVIDIA

Santa Clara, California, United States
