Session
Integrating WasmEdge with Kubernetes for Managing LLM Workloads on GPUs
In this talk, I will introduce WasmEdge, a CNCF Sandbox project, and show how it integrates with existing cloud-native infrastructure such as Kubernetes, Podman, and CRI-O. We'll explore how these tools deploy, manage, and run lightweight WebAssembly applications.
Next, I'll cover managing LLM (large language model) workloads on GPUs with container tools. Specifically, we'll discuss an approach that combines Podman, crun, WasmEdge, and CDI (Container Device Interface) to give workloads access to host GPU devices.
To show this approach in practice, I'll give a live demo that deploys and runs the Llama model with our Wasm application.
Yongkang He
Founder @KSUG.AI @KubeSmart.AI | Creator @awstronaut @kubestrong
Singapore