Session

Powering your Generative AI Workloads with AMD and Open-Source ROCm

Presented at AI_dev: Open Source GenAI & ML Summit Europe in Paris, France - June 2024
View Recording: https://www.youtube.com/watch?v=k2g_lC0fI-k

In today's generative AI ecosystem there is a strong emphasis on expensive AI hardware and proprietary CUDA implementations. While CUDA has undeniably played a crucial role in the success of generative AI, I'd like to share my experience running generative AI workloads and applications on cost-effective AMD hardware and the open-source ROCm software stack. This alternative approach gives users greater flexibility, allowing them to deploy their generative AI solutions across a wider range of hardware and software choices than ever before.

Learn how to run your favourite open-source large language and image generation models using ROCm, how far ROCm has come from previous versions, and which features are currently supported, including PyTorch, Hugging Face Transformers, bitsandbytes, Flash Attention, vLLM and TorchTune. See how more affordable workstation- and server-class AMD GPUs compare to their Nvidia counterparts in performance and inference speed. You will also see several demos of ROCm in action, along with tips and pitfalls to watch out for when working with AMD GPUs.
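One practical detail behind the talk's premise: ROCm builds of PyTorch expose AMD GPUs through the familiar torch.cuda API (with torch.version.hip set), so most CUDA-oriented code runs unchanged. As a minimal sketch, the helper below (a hypothetical function name, not from the talk) reports which backend a local PyTorch install targets, degrading gracefully when PyTorch is absent:

```python
import importlib.util


def rocm_backend_hint() -> str:
    """Report whether the local PyTorch build targets ROCm.

    On ROCm builds of PyTorch, torch.version.hip is a version string and
    AMD GPUs are driven through the torch.cuda API; on CUDA builds it is
    None. This is only a detection sketch, not an official ROCm API.
    """
    # Avoid a hard dependency: check for torch before importing it.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    hip_version = getattr(torch.version, "hip", None)
    if hip_version:
        return f"ROCm build (HIP {hip_version})"
    return "non-ROCm build"


if __name__ == "__main__":
    print(rocm_backend_hint())
```

Because the torch.cuda namespace is shared, a device string like "cuda:0" selects the first AMD GPU on a ROCm system, which is why much existing model code needs no changes.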

Farshad Ghodsian

Sr. Technical Product Manager - AI Infrastructure & MLOps @ AMD

