Speaker

Brandon Kang

Kubernetes, Cloud Native, Open source, Principal Solutions Architect, Akamai Technologies

Seoul, South Korea

Brandon Kang is a Principal Technical Solutions Architect at Akamai Technologies, overseeing cloud computing products and cloud-native initiatives across Asia, including Japan, China, India, and Korea.

Before joining Akamai, he worked as a software engineer at Samsung, a program manager at Microsoft, and a service platform expert at Vingroup in Vietnam.

Brandon is the author of 12 IT books covering software engineering, web performance, DevOps, and cloud computing. As a dedicated Kubestronaut, he is passionate about advancing and advocating for Kubernetes technology.

Area of Expertise

  • Information & Communications Technology

Topics

  • AI
  • Kubernetes
  • Kubernetes Security
  • Cloud Native & Kubernetes
  • Container Management with Docker and Kubernetes
  • Kubernetes troubleshooting
  • Kubeflow
  • Machine Learning & AI
  • IoT
  • Gaming
  • ecommerce
  • Azure Kubernetes Services (AKS)
  • Amazon EKS
  • GKE
  • Istio
  • Cilium

Advanced GPU-Orchestrated Workflows and HPC Integrations on K8s for Distributed AI/ML at Scale

As AI/ML workloads continue to scale in complexity, developers and platform engineers are pushing Kubernetes beyond typical MLOps boundaries.

This talk dives into strategies for orchestrating GPU-accelerated training and inference across large-scale clusters, integrating HPC principles, operator-based scheduling, and novel debugging workflows.

Attendees will learn how to implement fine-grained GPU partitioning, harness ephemeral containers to probe and adjust multi-node training in real time, and adopt eBPF-driven instrumentation for low-overhead kernel-level performance insights. We’ll explore cutting-edge scheduling optimizations, such as reinforcement-learning approaches and HPC-inspired batch-queuing orchestration on Kubernetes, that respond dynamically to heterogeneous job demands.

Real-world case studies will highlight HPC integration scenarios (RDMA, GPUDirect) for data-parallel workloads and complex training frameworks such as Horovod, Ray, and Spark on Kubernetes.

Unlocking the Power of Kubernetes: AI-Driven Innovations for Next-Gen Infrastructure

This session explores the synergy between Kubernetes and AI and the shift it brings to modern infrastructure management.

The presentation shows how Kubernetes enables AI workloads to be deployed and scaled efficiently while optimizing resource utilization.
It then turns to AI-powered automation, showing how intelligent algorithms enhance auto-scaling, workload optimization, and predictive maintenance within Kubernetes clusters. It also covers security, explaining how AI-driven measures strengthen threat detection and anomaly identification in Kubernetes environments.

The presentation invites organizations to embrace the convergence of Kubernetes and AI to redefine infrastructure management and improve efficiency and resilience.
