Speaker

Prashant Ramhit

Mirantis Inc. Platform Engineer | Snr DevOps Advocate | OpenSource Dev

Dubai, United Arab Emirates

I am a seasoned technologist with over two decades of experience, beginning my career in the late 1990s as a Unix System Administrator. Over time, I expanded my expertise into roles such as DevOps Engineer, Site Reliability Engineer (SRE), and Golang Developer, gaining deep hands-on experience with cloud-native systems, infrastructure automation, and platform engineering.

As my journey progressed, I transitioned into technical leadership roles as a Product Manager and Project Manager, where I now combine engineering know-how with strategic execution—leading cross-functional teams, aligning delivery with business goals, and driving innovation across globally distributed systems.

I’ve had the privilege of contributing to impactful projects at organizations such as Zalora (Singapore), Proemion (Germany), GiantSwarm, Viadialog (France), BBC, and Netflix. My academic background includes an M.Sc./M.Phil. from the University of Portsmouth, UK, which grounds my practical experience in a strong theoretical foundation.

Now based in Mauritius, I continue to work at the intersection of technology and strategy, contributing to open-source communities while enjoying a peaceful island life—tending to my aquaponic farm and recharging through long walks along the beach or hikes in the mountains.

Area of Expertise

  • Information & Communications Technology

Topics

  • Kubernetes
  • Cloud Native & Kubernetes
  • Artificial Intelligence
  • Machine Learning/Artificial Intelligence
  • Platform Engineering
  • DevOps

Solving Real-World Edge Challenges With k0s, NATS, and Raspberry Pi Clusters

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.
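
To make the data path concrete, here is a minimal sketch (not the project’s actual code) of the kind of sensor publisher each Raspberry Pi node might run, using the nats.go client. The broker URL, subject layout, and Reading fields are illustrative assumptions:

    // sensor_publisher.go - minimal sketch of a per-node sensor publisher.
    // Assumptions: the subject layout ("reef.<node>.readings") and the
    // Reading struct are illustrative; only the NATS client calls are real.
    package main

    import (
        "encoding/json"
        "log"
        "time"

        "github.com/nats-io/nats.go"
    )

    type Reading struct {
        NodeID    string    `json:"node_id"`
        Turbidity float64   `json:"turbidity"` // proxy for algae density
        WaterTemp float64   `json:"water_temp_c"`
        Timestamp time.Time `json:"timestamp"`
    }

    func main() {
        // Unlimited reconnects with a short backoff: links to nodes offshore
        // drop frequently, and the client should ride through outages.
        nc, err := nats.Connect("nats://control-plane.local:4222",
            nats.MaxReconnects(-1),
            nats.ReconnectWait(2*time.Second))
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        for range time.Tick(10 * time.Second) {
            r := Reading{NodeID: "pi-042", Turbidity: readTurbidity(),
                WaterTemp: readTemp(), Timestamp: time.Now().UTC()}
            payload, _ := json.Marshal(r)
            if err := nc.Publish("reef.pi-042.readings", payload); err != nil {
                log.Printf("publish failed: %v", err)
            }
        }
    }

    // readTurbidity and readTemp stand in for real sensor drivers.
    func readTurbidity() float64 { return 0.42 }
    func readTemp() float64      { return 27.3 }

The reconnect options matter more than the happy path here: on a flaky over-water link, the publisher should buffer and retry rather than crash.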

Tiny Kubernetes, Big Impact: k0s for Edge Deployments

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.

This is a real project built with CNCF applications, based in Mauritius, with over 1,200 Raspberry Pis deployed 1 to 2 kilometers off the coast in the ocean, running k0s and NATS.
I would be grateful if this project could be showcased to demonstrate the impact of OSS applications in large-scale deployments; the project is helping to monitor and rebuild the marine ecosystem.

Solving Distributed AI at the Edge with Deferred Inference Using k0s and Cloud GPU Acceleration

Performing sophisticated object detection on constrained edge devices may seem daunting, but with the right design, it becomes a powerful distributed AI solution. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and a deferred inference pipeline powered by YOLOv8, this project tackles the challenges of capturing and processing video frames across heterogeneous environments.
Leveraging k0s’s minimal resource footprint and streamlined orchestration, combined with a cloud GPU inference service, our architecture offloads intensive workloads from edge devices. A Go-based frame capturer reliably transmits video frames over HTTPS to GPU instances for near real-time detection under bandwidth-constrained conditions. A web-based visualization layer then aggregates and displays inference results in real time.
Learn about implementing deferred inference pipelines with YOLOv8, orchestrating containerized workloads using k0s, optimizing GPU utilization for cost efficiency, and achieving low-latency edge processing. See how this architecture brings state-of-the-art computer vision to resource-limited scenarios, opening new possibilities for distributed AI deployments at scale.
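
As a rough illustration of the hand-off described above, this sketch shows the shape such a Go frame capturer could take, POSTing JPEG frames to a cloud GPU endpoint over HTTPS. The endpoint URL, auth scheme, and capture loop are assumptions for illustration, not the project’s published code:

    // frame_uploader.go - sketch of an edge-side frame uploader.
    // The inference URL and bearer token are illustrative assumptions.
    package main

    import (
        "bytes"
        "fmt"
        "net/http"
        "os"
        "time"
    )

    const inferURL = "https://gpu.example.com/v1/detect" // hypothetical endpoint

    func uploadFrame(client *http.Client, jpeg []byte) error {
        req, err := http.NewRequest(http.MethodPost, inferURL, bytes.NewReader(jpeg))
        if err != nil {
            return err
        }
        req.Header.Set("Content-Type", "image/jpeg")
        req.Header.Set("Authorization", "Bearer "+os.Getenv("INFER_TOKEN"))

        resp, err := client.Do(req)
        if err != nil {
            return err // caller can retry; edge bandwidth is unreliable
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            return fmt.Errorf("inference service returned %s", resp.Status)
        }
        return nil
    }

    func main() {
        // A short timeout keeps the capture loop from stalling behind one
        // slow upload.
        client := &http.Client{Timeout: 5 * time.Second}
        for range time.Tick(200 * time.Millisecond) { // ~5 fps on constrained links
            frame := captureJPEG() // stand-in for a camera driver
            if err := uploadFrame(client, frame); err != nil {
                // Deferred inference: a real capturer would queue the frame
                // locally and retry rather than drop it.
                continue
            }
        }
    }

    func captureJPEG() []byte { return nil } // placeholder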

Dynamic GPU Autoscaling: Leveraging KServe and NVIDIA DCGM for Cost-Efficient Scaling

Implementing dynamic GPU autoscaling for deferred inference may seem daunting, but with the right approach, it becomes a powerful way to boost performance while containing costs. By leveraging KServe or KEDA for serverless ML deployment and NVIDIA’s DCGM metrics, this system scales GPU resources in real time based on actual utilization rather than simple request counts. A custom metrics adapter feeds DCGM_FI_DEV_GPU_UTIL data into Kubernetes’ Horizontal Pod Autoscaler (HPA), ensuring GPU capacity matches computational needs.

Asynchronous prediction endpoints, coupled with scaling algorithms that factor in memory usage, compute load, and latency, deliver near-optimal resource allocation for complex workloads like object detection.

This talk explores the technical steps behind utilization-based autoscaling with KServe or KEDA, including monitoring, alerting, and performance tuning. Real-world benchmarks from production show up to 40% GPU cost savings without compromising inference speed or accuracy. Attendees will learn practical methods for bridging ML frameworks and infrastructure, making cloud GPU-accelerated ML more accessible and efficient in modern cloud-native environments.
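
To make the wiring concrete, here is a hedged sketch of the HPA object such a setup might create, using the standard client-go autoscaling/v2 types and assuming a metrics adapter already exposes DCGM_FI_DEV_GPU_UTIL as an external metric. The Deployment name, namespace, and 70% target are illustrative:

    // gpu_hpa.go - sketch of a utilization-based HPA for a GPU inference
    // Deployment. Assumes a metrics adapter exposes DCGM_FI_DEV_GPU_UTIL as
    // an external metric; names and thresholds are illustrative.
    package autoscale

    import (
        autoscalingv2 "k8s.io/api/autoscaling/v2"
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    func gpuHPA() *autoscalingv2.HorizontalPodAutoscaler {
        minReplicas := int32(1)
        return &autoscalingv2.HorizontalPodAutoscaler{
            ObjectMeta: metav1.ObjectMeta{Name: "yolo-inference", Namespace: "ml"},
            Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
                ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
                    APIVersion: "apps/v1", Kind: "Deployment", Name: "yolo-inference",
                },
                MinReplicas: &minReplicas,
                MaxReplicas: 8,
                Metrics: []autoscalingv2.MetricSpec{{
                    Type: autoscalingv2.ExternalMetricSourceType,
                    External: &autoscalingv2.ExternalMetricSource{
                        Metric: autoscalingv2.MetricIdentifier{Name: "DCGM_FI_DEV_GPU_UTIL"},
                        Target: autoscalingv2.MetricTarget{
                            // Scale out when average GPU utilization across
                            // pods exceeds ~70%.
                            Type:         autoscalingv2.AverageValueMetricType,
                            AverageValue: resource.NewQuantity(70, resource.DecimalSI),
                        },
                    },
                }},
            },
        }
    }

KEDA, for instance, generates a comparable HPA under the hood from a ScaledObject; the external-metric plumbing is the common denominator.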

Scaling K8s Everywhere: Introducing k0rdent, Your Open-Source Platform Engineering Super Controller

In modern platform engineering, managing fleets of Kubernetes clusters across clouds, on-premises datacenters, and edge devices presents operational sprawl, inconsistent tooling, and lock-in challenges. k0rdent is the first fully open-source Distributed Container Management Environment (DCME) that transforms this complexity into a declarative, Kubernetes-native control plane. Platform architects can design and operate developer and workload platforms anywhere, at scale—with zero lock-in and 100% open source.

In this session, you’ll learn how k0rdent leverages Kubernetes standards (ClusterAPI, CRDs, GitOps) to provide a single pane of glass for multi-cluster lifecycle management, service composition, and infrastructure automation. We’ll dive into k0rdent’s modular architecture, walk through a live demo of provisioning clusters and deploying services across heterogeneous environments, and explore how the community can contribute to its rapidly growing ecosystem.
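
As a hedged taste of the declarative model, the sketch below registers a child cluster with the management plane through the Kubernetes dynamic client. The group/version, kind, and spec fields follow k0rdent’s documented ClusterDeployment API as best I can tell, but treat every name here as an assumption and verify it against the current docs:

    // clusterdeployment.go - hedged sketch: declaring a child cluster to the
    // k0rdent management plane. The GVR and spec fields are assumptions.
    package main

    import (
        "context"
        "log"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
        "k8s.io/apimachinery/pkg/runtime/schema"
        "k8s.io/client-go/dynamic"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            log.Fatal(err)
        }
        client, err := dynamic.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }

        // Assumed GVR; check your installed k0rdent version.
        gvr := schema.GroupVersionResource{
            Group: "k0rdent.mirantis.com", Version: "v1alpha1",
            Resource: "clusterdeployments",
        }
        cd := &unstructured.Unstructured{Object: map[string]interface{}{
            "apiVersion": "k0rdent.mirantis.com/v1alpha1",
            "kind":       "ClusterDeployment",
            "metadata":   map[string]interface{}{"name": "edge-eu-1", "namespace": "kcm-system"},
            "spec": map[string]interface{}{
                "template":   "aws-standalone-cp-0-0-1", // illustrative template name
                "credential": "aws-credential",          // illustrative credential ref
                "config":     map[string]interface{}{"workersNumber": 3},
            },
        }}

        if _, err := client.Resource(gvr).Namespace("kcm-system").
            Create(context.Background(), cd, metav1.CreateOptions{}); err != nil {
            log.Fatal(err)
        }
    }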

Edge-Native Agentic AI: Simplify and Scale Agent Deployments with k0rdent

Agentic AI is changing how developers and startups build intelligent systems—especially in edge environments where low-latency, real-world data is critical. However, deploying and scaling agent-based applications on distributed edge infrastructure remains a challenge for many early-stage teams.

This session introduces k0rdent, an open-source extension to Kubernetes that simplifies the deployment of agentic AI workloads at the edge. We'll walk through the basics of Kubernetes-based infrastructure for agent frameworks like LangChain, Autogen, and Semantic Kernel, and show how k0rdent bridges the gap between developer-friendly agent workflows and real-world edge deployments.

Attendees will see how to:

  • Set up an edge-native cluster with GPU support in minutes.
  • Run agent-based AI inference tasks (e.g., computer vision, chat interfaces) on these clusters.
  • Connect edge workloads to real-time event streams (like NATS or MQTT) for fast feedback loops (see the sketch after this list).
  • Automate lifecycle management—crucial for startups with small DevOps teams.
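
To ground the event-stream bullet, here is a minimal Go sketch of an edge agent worker consuming tasks from NATS; the subject and queue-group names are illustrative assumptions, and only the nats.go client calls are real:

    // agent_worker.go - sketch of an edge agent consuming an event stream.
    package main

    import (
        "log"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect("nats://edge-cluster.local:4222")
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        // A queue group load-balances events across agent replicas, so
        // scaling the Deployment scales consumption with no coordination.
        _, err = nc.QueueSubscribe("agents.vision.frames", "vision-workers",
            func(msg *nats.Msg) {
                result := runAgent(msg.Data)
                if msg.Reply != "" {
                    _ = nc.Publish(msg.Reply, result) // tight feedback loop
                }
            })
        if err != nil {
            log.Fatal(err)
        }
        select {} // block while the subscription runs
    }

    // runAgent stands in for invoking LangChain/Autogen/Semantic Kernel logic.
    func runAgent(frame []byte) []byte { return []byte(`{"label":"person","conf":0.93}`) }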

We'll wrap up with tips and best practices for early-stage teams building their first agent-driven edge workflows, including safety and testing basics for AI deployments.

Whether you're prototyping your first agent or looking to scale to production, join us to explore how k0rdent can help you go from idea to edge-native deployment faster.

Outline Aligned with Topics of Interest

1. Getting Started with Agentic AI and the Edge

  • Define agentic AI for edge scenarios.
  • Challenges of real-world deployment (latency, data sovereignty, updates).

2. Intro to Agent Frameworks and k0rdent

  • Brief overview of frameworks (LangChain, Semantic Kernel, Autogen).
  • How k0rdent simplifies running these frameworks on edge clusters.
  • Basic cluster creation workflow (developer-friendly).

3. Real-World Applications

  • Show real examples (e.g., computer vision agents, autonomous retail agents), with a demo of a production application.
  • How edge GPU acceleration helps agentic tasks.

4. Human-Agent Interfaces & Experience Design

  • Practical UX considerations for edge agents (e.g., low-latency feedback).
  • How to connect with frontend/mobile/web experiences.

5. Responsible AI and Testing Basics

  • How to test, update, and monitor edge AI agents safely.
  • Basics of versioning, rollback, and resource management with k0rdent.

6. Developer Experience and Best Practices

  • Early learnings from deploying edge-native agents.
  • Tips for small teams: avoid over-engineering, automate updates.

7. Live Demo or Visual Walkthrough

  • Show the agent deployment workflow on k0rdent.
  • Highlight NATS data flows for real-time agent feedback.

Taming the Starship Kraken: Zero-Downtime Vertical Scaling with Kubernetes 1.33

Juggling CPU and memory allocations for stateful services used to feel like taming a zero-gravity kraken aboard a starship - one tentacle-slip and the whole mission derails. Kubernetes 1.33’s In-Place Pod Vertical Scaling transforms that chaos into a choreographed ballet: you patch your Pod’s resources and the kubelet morphs cgroups on the fly - no eviction, no restart. With the resize subresource and mutable resource fields, platform teams amp up CPU and RAM for heavy inference bursts or dial them back to reclaim idle capacity.

In our tests, we banished 100% of pod restarts for vertical changes, slashed over-provisioned memory by 30%, and achieved sub-500 ms resize operations under warp-speed load. One API call, zero disruptions - warm caches stay toasty like a dragon’s hoard, persistent connections hum like hyperspace drives, and SLOs glow green in the control panel. It’s resource orchestration without risk: a sleek starship crew, not a space circus.
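
For the curious, the “one API call” looks roughly like this with client-go; the pod name, namespace, and new CPU/memory values are illustrative, and the resize subresource assumes Kubernetes 1.33 with InPlacePodVerticalScaling enabled:

    // resize.go - sketch of an in-place vertical resize via the "resize"
    // subresource (Kubernetes 1.33, InPlacePodVerticalScaling). Pod name,
    // namespace, and the new resource values are illustrative.
    package main

    import (
        "context"
        "log"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/types"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            log.Fatal(err)
        }
        clientset, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }

        // Bump the container's resources without evicting the pod.
        patch := []byte(`{"spec":{"containers":[{"name":"inference",
          "resources":{"requests":{"cpu":"2","memory":"4Gi"},
                       "limits":{"cpu":"2","memory":"4Gi"}}}]}}`)

        _, err = clientset.CoreV1().Pods("ml").Patch(
            context.Background(), "inference-0",
            types.StrategicMergePatchType, patch,
            metav1.PatchOptions{}, "resize") // the 1.33 resize subresource
        if err != nil {
            log.Fatal(err)
        }
    }

The equivalent kubectl form is a patch with --subresource resize; either way the kubelet adjusts cgroups in place, which is why warm caches and connections survive.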

From Chaos to Control: Herding GPU-Hungry Dragons with Kubernetes DRA

Balancing AI training, real-time inference, and bursty batch jobs on a shared accelerator fleet used to feel like herding caffeinated dragons. Kubernetes 1.33’s Dynamic Resource Allocation (DRA) turns that chaos into choreography: Pods state exactly what accelerator slice they need and the scheduler guarantees where it will run, long before a container starts. With the new partitionable-device, prioritized-list, and device-taint gates, platform teams carve GPUs, FPGAs, or SmartNICs on demand—no nvidia-smi incantations, no “GPU not found” crash loops. On GCP we slashed idle GPU hours by 42%, shrinking datacenter spend while giving each tenant iron-clad isolation via namespace-scoped claims. Dev namespaces grab bite-size slices for rapid prototyping; prod jobs scale to full-fat allocations using the same YAML, zero redeploys. Observability hooks keep SLO dashboards glowing. One control plane, two QoS tiers, dragons tamed.
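
For a flavor of the API, here is a hedged sketch of a DRA ResourceClaim using the resource.k8s.io/v1beta1 Go types; the device class and names are assumptions, and DRA’s surface is still evolving between releases:

    // draclaim.go - sketch of a DRA ResourceClaim asking for one GPU device.
    // The device class and names are illustrative assumptions; field names
    // may differ in your release of the evolving DRA API.
    package dra

    import (
        resourcev1beta1 "k8s.io/api/resource/v1beta1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    func gpuSliceClaim() *resourcev1beta1.ResourceClaim {
        return &resourcev1beta1.ResourceClaim{
            ObjectMeta: metav1.ObjectMeta{Name: "vision-gpu-slice", Namespace: "dev"},
            Spec: resourcev1beta1.ResourceClaimSpec{
                Devices: resourcev1beta1.DeviceClaim{
                    Requests: []resourcev1beta1.DeviceRequest{{
                        Name: "gpu",
                        // Matches a DeviceClass published by the GPU driver;
                        // "gpu.nvidia.com" is an assumption.
                        DeviceClassName: "gpu.nvidia.com",
                    }},
                },
            },
        }
    }

A Pod then lists the claim under spec.resourceClaims and references it from a container’s resources.claims, and the scheduler places the Pod only where the claim can be satisfied.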

KubeCon + CloudNativeCon Europe 2025 Sessionize Event

April 2025 London, United Kingdom
