
Speaker

Samzong Lu

PM at DaoCloud, AI/LLMOps PM Lead, contributor to multiple CNCF projects, open source enthusiast

Shanghai, China

- Samzong is a Product Manager (focused on AI/LLMOps, Multi-Cluster, Cluster LCM, Microservices, Service Mesh)
- Active contributor to Kubernetes / kubernetes-sigs
- Karmada active contributor and member
- Istio active contributor and member
- Contributor to multiple CNCF projects
- Open source enthusiast

Area of Expertise

  • Information & Communications Technology
  • Physical & Life Sciences
  • Transports & Logistics

Topics

  • Karmada project member
  • Istio project member
  • Product Manager
  • Open Source

Consolidating Intelligent Routing at the Gateway: A vLLM Semantic Router Deployment Retrospective

Users habitually pick the largest model and turn on the deepest reasoning, yet most requests don't need anywhere near that much compute. Intelligent routing automatically dispatches each request to an appropriately sized model, keeping the experience intact while lowering latency, and it has become an important direction in LLM inference optimization.

With the traditional gateway approach, however, routing logic ends up scattered across business services; the usual result is that routing policies become unmanageable and the troubleshooting path keeps getting longer.

This talk presents a more robust approach: consolidating the routing decision into Envoy ExtProc on the gateway side.

Using semantic-router, I will walk the full request path: how a request is tagged with routing signals after it enters the gateway, and how Gateway resources steer the traffic to a concrete backend.

I will also compare two common deployment routes: Istio + Gateway API Inference Extension, and Envoy AI Gateway.
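As a hedged illustration of that flow (the ExtProc processor tags the request, then Gateway API resources steer it to a backend), an HTTPRoute could match on the routing header. The header name `x-selected-model` and the backend service names below are assumptions for the sketch, not semantic-router's actual configuration:

```yaml
# Illustrative sketch: route on a header the ExtProc router is assumed to set.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-routes
spec:
  parentRefs:
    - name: llm-gateway
  rules:
    - matches:
        - headers:
            - name: x-selected-model   # assumed header name
              value: small-model
      backendRefs:
        - name: vllm-small             # placeholder Service
          port: 8000
    - backendRefs:                     # default rule: heavy model
        - name: vllm-large
          port: 8000
```

The point of the pattern is that only the gateway knows about model selection; backend services see plain requests.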

LLM-D: A Cloud-Native Framework and Practices for Large Language Model Deployment

LLM-D (Large Language Model Deployment) is a Kubernetes-based deployment framework for large language models, designed to simplify and accelerate full-lifecycle management of LLMs in cloud-native environments. As an AI developer and open-source contributor, I have explored how Kubernetes and its ecosystem tooling can make model deployment more repeatable, scalable, and cost-efficient. This talk covers LLM-D's core practice patterns: model containerization, distributed inference optimization, and automated rollout and governance. Drawing on real cases from DaoCloud, I will show how LLM-D helps teams move quickly from prototype validation to a production-grade LLMOps pipeline, keeping delivery fast while operations stay stable.
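As a rough illustration of the model-containerization step (a minimal sketch of the kind of workload such a framework manages, not LLM-D's actual API), a vLLM serving Deployment might look like:

```yaml
# Minimal sketch: one GPU-backed vLLM replica set serving an open model.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-serving
spec:
  replicas: 2
  selector:
    matchLabels:
      app: llama-serving
  template:
    metadata:
      labels:
        app: llama-serving
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: "1"   # one GPU per replica
```

A deployment framework's value is in layering model distribution, scheduling, and rollout policy on top of primitives like this.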

From Hugging Face to Cloud Native: Igniting the LLM Revolution with Kubernetes and Open Source Tools

In the wave of AI, open-source large language models (LLMs) such as LLaMA, Gemma, and DeepSeek are reshaping the technological landscape, but how to efficiently deploy these models from prototype to production remains a challenge for developers. This presentation will share how to build a scalable and efficient LLMOps pipeline using Kubernetes and open-source tools, covering the entire process from model download to inference optimization. Based on my experience at DaoCloud and open-source projects (such as the Hugging Face model download GUI and Kubernetes configuration tool), I will demonstrate how cloud-native technologies can simplify LLM deployment, including practical cases of automated model distribution, dynamic resource scheduling, and inference acceleration.

Exploring and Solving Challenges in Multi-Cloud, Multi-Cluster Environments with Karmada

More and more enterprises face increasingly complex business scenarios, and adopting multi-cluster deployments can greatly improve application stability and security. So how do you manage multiple Kubernetes clusters at once while avoiding vendor lock-in? How do you reduce the extra costs caused by inconsistent application delivery across clusters? And how do you unify multi-cluster deployment, cross-cluster traffic governance, and security governance for your applications?
In this session, we will show how to solve these problems with the Karmada project. You will learn how to achieve consistent application delivery across multiple clusters: unified deployment, automatic distribution, automatic scaling and failure migration of applications, and cross-cluster disaster recovery. During the tutorial, you will use Karmada's functionality to solve challenges drawn from real business scenarios.
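For a concrete taste of Karmada's distribution model, the sketch below propagates an assumed `nginx` Deployment to two member clusters with a PropagationPolicy; the cluster names are placeholders for whatever clusters are registered with the control plane:

```yaml
# Sketch: tell Karmada to distribute the `nginx` Deployment to two clusters.
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx          # assumed workload name
  placement:
    clusterAffinity:
      clusterNames:
        - member1          # placeholder member clusters
        - member2
```

The same resource template then yields consistent delivery in every selected cluster, which is the "unified deployment" the session describes.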

Open Source to Enterprise: Scaling LLM/Diffusion Model Inference in Kubernetes

Our session will unveil how Kubernetes-based cloud-native technologies power the transformation of cutting-edge LLMs and diffusion models from lab experiments to massively scalable SaaS services. Key highlights include:
1. Cloud-Native Scaling for AI Inference: Containerized deployment, dynamic scaling, and distributed scheduling on Kubernetes support millions of daily inference requests, with GPU utilization boosted by 40%;
2. Efficiency Breakthroughs in Inference: Through model quantization, distributed parallelism, and caching strategies, we achieved a 60% reduction in LLM inference latency and 35% cost savings for video generation;
3. SaaS Productization Journey: From API design to billing systems, learn how we packaged complex inference technologies into user-friendly services, driving 300% user growth and serving 500+ global enterprise clients;
4. Battle-Tested Solutions: Lessons from multi-model deployment and multi-tenant isolation scenarios, with open-source toolkits and reusable architecture templates for the community.
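As a hedged sketch of the dynamic-scaling point above, a HorizontalPodAutoscaler could scale inference replicas on an in-flight-requests metric; the metric name, target value, and Deployment name are illustrative assumptions and would require a metrics adapter (e.g. a Prometheus adapter) to work:

```yaml
# Sketch: scale inference pods on a custom load metric rather than CPU.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-inference-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llama-serving                    # assumed inference Deployment
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: inference_requests_in_flight  # hypothetical custom metric
        target:
          type: AverageValue
          averageValue: "8"                # assumed per-pod target
```

Scaling on request backlog instead of CPU is what keeps GPU replicas matched to actual inference demand.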

KCD Hangzhou + OpenInfra Days China 2025 Sessionize Event

November 2025 Hangzhou, China
