Speaker

Prashant Ramhit

Mirantis Inc. Platform Engineer | Senior DevOps Advocate | Open-Source Developer

Dubai, United Arab Emirates

I am a seasoned technologist with over two decades of experience, beginning my career in the late 1990s as a Unix System Administrator. Over time, I expanded my expertise into roles such as DevOps Engineer, Site Reliability Engineer (SRE), and Golang Developer, gaining deep hands-on experience with cloud-native systems, infrastructure automation, and platform engineering.

As my journey progressed, I transitioned into technical leadership roles as a Product Manager and Project Manager, where I now combine engineering know-how with strategic execution—leading cross-functional teams, aligning delivery with business goals, and driving innovation across globally distributed systems.

I’ve had the privilege of contributing to impactful projects at organizations such as Zalora (Singapore), Proemion (Germany), GiantSwarm, Viadialog (France), BBC, and Netflix. My academic background includes an M.Sc./M.Phil. from the University of Portsmouth, UK, which grounds my practical experience in a strong theoretical foundation.

Now based in Mauritius, I continue to work at the intersection of technology and strategy, contributing to open-source communities while enjoying a peaceful island life—tending to my aquaponic farm and recharging through long walks along the beach or hikes in the mountains.

Area of Expertise

  • Information & Communications Technology

Topics

  • Kubernetes
  • Cloud Native & Kubernetes
  • Machine Learning/Artificial Intelligence
  • Platform Engineering
  • DevOps
  • Web3
  • MultiCloud
  • Multimedia
  • Innovations in Stock Market Analytics with AI
  • Artificial Intelligence
  • Immersive Media
  • Medical Imaging
  • Legal Artificial Intelligence (AI) Tool

AI at the Edge: ONNX Inference in WASM on Featherweight k0s

This session explores deploying ONNX machine learning models via WebAssembly on lightweight k0s Kubernetes clusters—enabling fast, secure AI inference in resource-constrained environments. The talk demonstrates running pre-trained models from frameworks like PyTorch and TensorFlow through ONNX, executed inside WebAssembly’s sandboxed runtime on minimal Kubernetes.

Topics include:
– ONNX Runtime with WebAssembly on k0s
– Using WASI to access and run models
– Optimizing performance for Wasm-based inference
– Model loading, caching, and scaling strategies
– CI/CD integration for Wasm ML pipelines

Attendees will see real-world benchmarks comparing Wasm-based vs. containerized inference, focusing on latency, memory usage, cold start, and throughput. This approach reduces infrastructure cost, improves isolation, and unlocks edge AI use cases where traditional containers fall short.
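As a hedged sketch of how such a workload can be scheduled on k0s (assuming a WebAssembly containerd shim such as wasmtime's has been installed on the worker nodes; the image name is hypothetical), a RuntimeClass plus a Pod that references it is essentially all that is required:

    apiVersion: node.k8s.io/v1
    kind: RuntimeClass
    metadata:
      name: wasmtime
    handler: wasmtime                  # containerd shim configured on each k0s worker
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: onnx-wasm-inference
    spec:
      runtimeClassName: wasmtime       # run this Pod under the Wasm runtime
      containers:
        - name: inference
          image: registry.example.com/onnx-wasm:latest   # hypothetical Wasm OCI image
          resources:
            limits:
              memory: 128Mi            # Wasm modules typically need far less than containers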

Dynamic GPU Autoscaling: Leveraging KServe and NVIDIA DCGM for Cost-Efficient Scaling

Implementing dynamic GPU autoscaling for deferred inference may seem daunting, but with the right approach, it becomes a powerful way to boost performance while containing costs. By leveraging KServe or KEDA for serverless ML deployment and NVIDIA’s DCGM metrics, this system scales GPU resources in real time based on actual utilization rather than simple request counts. A custom metrics adapter feeds DCGM_FI_DEV_GPU_UTIL data into Kubernetes’ Horizontal Pod Autoscaler (HPA), ensuring GPU capacity matches computational needs. Asynchronous prediction endpoints, coupled with scaling algorithms that factor in memory usage, compute load, and latency, deliver near-optimal resource allocation for complex workloads like object detection.

This talk explores the technical steps behind utilization-based autoscaling with KServe or KEDA, including monitoring, alerting, and performance tuning. Real-world benchmarks from production show up to 40% GPU cost savings without compromising inference speed or accuracy. Attendees will learn practical methods for bridging ML frameworks and infrastructure, making GPU-accelerated ML more accessible and efficient in modern cloud-native environments.
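As an illustration of the utilization-based pattern described above, a minimal sketch of the HPA wiring might look like this, assuming DCGM exporter metrics are surfaced through a custom metrics adapter under the name DCGM_FI_DEV_GPU_UTIL (the target deployment name is hypothetical):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: object-detection-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: object-detection             # hypothetical inference deployment
      minReplicas: 1
      maxReplicas: 8
      metrics:
        - type: Pods
          pods:
            metric:
              name: DCGM_FI_DEV_GPU_UTIL   # exposed via the custom metrics adapter
            target:
              type: AverageValue
              averageValue: "70"           # scale out above ~70% average GPU utilization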

Autonomous Quality Control: A Policy-Driven Agentic Workflow with MCP and On-Cluster LLMs

As engineering organizations scale, maintaining consistent code quality across hundreds of repositories becomes unmanageable. Traditional CI frameworks distribute governance logic across many repositories, making updates manual, error-prone, and slow.

This technical session presents a new approach: agent-driven autonomous quality enforcement, built using the Model Context Protocol, Kubernetes, and self-hosted LLMs running inside the cluster.

An Agent Service interprets GitHub events, applies enforcement policies, and delegates testing to Context7, which automatically queries a private LLM to generate additional tests. A GitHub MCP server updates the pull request with results and insights, enabling centralized visibility and governance.

This creates a zero-touch developer experience: engineers continue merging code as usual, while the platform autonomously enforces organizational standards, improves software quality, and reduces operational overhead across the entire codebase.
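By way of illustration only (every name and endpoint below is hypothetical), the Agent Service can be deployed as an ordinary in-cluster workload that is pointed at the private LLM and the GitHub MCP server:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: quality-agent                  # hypothetical Agent Service
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: quality-agent
      template:
        metadata:
          labels:
            app: quality-agent
        spec:
          containers:
            - name: agent
              image: registry.example.com/quality-agent:latest   # hypothetical image
              env:
                - name: LLM_ENDPOINT                             # private, on-cluster LLM
                  value: http://llm.ai-system.svc:8000/v1
                - name: GITHUB_MCP_URL                           # GitHub MCP server
                  value: http://github-mcp.ai-system.svc:3000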

From Chaos to Control: Herding GPU-Hungry Dragons with Kubernetes DRA

Balancing AI training, real-time inference, and bursty batch jobs on a shared accelerator fleet used to feel like herding caffeinated dragons. Kubernetes 1.33’s Dynamic Resource Allocation (DRA) turns that chaos into choreography: Pods state exactly what accelerator slice they need, and the scheduler guarantees where it will run long before a container starts. With the new partitionable-device, prioritized-list, and device-taint gates, platform teams carve GPUs, FPGAs, or Smart-NICs on demand—no nvidia-smi incantations, no “GPU not found” crash loops.

On GCP we slashed idle GPU hours by 42%, shrinking datacenter spend while giving each tenant iron-clad isolation via namespace-scoped claims. Dev namespaces grab bite-size slices for rapid prototyping; prod jobs scale to full-fat allocations using the same YAML, zero redeploys. Observability hooks keep SLO dashboards glowing. One control plane, two QoS tiers, dragons tamed.
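As a hedged sketch of the claim-based model (the DRA API group is still in beta and may differ by release; the DeviceClass name depends on the installed driver and is an assumption here, as is the image), a workload requests a device through a ResourceClaim and references it from the Pod spec:

    apiVersion: resource.k8s.io/v1beta1
    kind: ResourceClaim
    metadata:
      name: inference-gpu
    spec:
      devices:
        requests:
          - name: gpu
            deviceClassName: gpu.nvidia.com   # assumes the NVIDIA DRA driver's class
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: inference
    spec:
      resourceClaims:
        - name: gpu
          resourceClaimName: inference-gpu    # bind the claim above
      containers:
        - name: app
          image: registry.example.com/inference:latest   # hypothetical image
          resources:
            claims:
              - name: gpu                     # container consumes the claimed device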

One Template, Zero Touch: Automating Policy Deployment Across Edge Fleets with k0rdent

Managing security policies across distributed edge Kubernetes clusters is challenging: manual deployments, inconsistent enforcement, and compliance drift all hinder scale. This session shows how k0rdent uses template-driven, zero-touch automation to solve these problems. By embedding Kyverno policies into reusable cluster templates, teams can deploy dozens of edge clusters with consistent governance using a single command. Attendees will see a live demo of clusters launching with pre-applied policies, no manual steps required. The talk covers policy inheritance, automated compliance checks, and managing exceptions in specialized environments. Learn how to maintain strict security baselines while reducing operational overhead. Ideal for platform engineers, DevOps teams, and architects, this session offers practical techniques to scale secure edge deployments efficiently, turning complex multi-cluster operations into streamlined workflows.
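As a small example of the kind of Kyverno policy that can be embedded in such a template (the specific rule, disallowing privileged containers, is only an illustration of the approach):

    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: disallow-privileged
    spec:
      validationFailureAction: Enforce
      rules:
        - name: no-privileged-containers
          match:
            any:
              - resources:
                  kinds:
                    - Pod
          validate:
            message: "Privileged containers are not allowed on edge clusters."
            pattern:
              spec:
                containers:
                  - =(securityContext):        # optional-field anchor
                      =(privileged): "false"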

No More Vendor Lock-In: Build a Multi-Cloud R/Pharma Platform with k0rdent

Vendor lock-in and inconsistent infrastructure across clouds slow down innovation in Pharma, especially for small teams using R, Shiny, or AI tools. This talk introduces k0rdent, an open-source control plane that automates multi-cloud Kubernetes deployments to run R/Pharma workflows seamlessly across AWS, GCP, Azure—or even on-prem. We’ll demo how to deploy Shiny dashboards, RMarkdown reports, and open-source AI models in reproducible, compliant environments using GitOps, policy automation, and zero-touch provisioning. With k0rdent, data science teams can focus on insights, not DevOps—while ensuring portability, security, and scale. Perfect for biotechs, CROs, and academic labs looking to standardize analytics pipelines and future-proof their R ecosystem across cloud providers. We'll also cover real-world use cases, LLM integrations, and how to get started fast.

From MCP to Agent-to-Agent: The Golden Layer With k0s, Istio, and AgentGateway

As AI agents transition from prototypes to production, infrastructure needs shift toward low-latency secure communication, standards-based interoperability (A2A and MCP), and enterprise-ready governance with full observability at scale.

This session presents a practical stack for real-world AI agent deployments, built on three open-source components:

k0s: A lightweight, CNCF-aligned Kubernetes distribution ideal for edge and hybrid setups with minimal ops overhead.

kgateway: A modern, lightweight gateway built for Kubernetes and agentic workloads, with fine-grained traffic routing.

AgentGateway: An AI-native data plane purpose-built for agent-to-agent (A2A) and Model Context Protocol (MCP) workflows, enabling governance and policy enforcement where API gateways fall short.

Together, they form a golden foundation for scalable, secure, and standards-aligned agent infrastructure. The session covers architecture patterns, lessons learned, and best practices from production AI deployments.
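As an illustrative fragment (assuming kgateway is installed and registers a GatewayClass named kgateway; the backend service is hypothetical), traffic into an agent can be modeled with the standard Gateway API resources:

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: agent-gw
    spec:
      gatewayClassName: kgateway        # assumed class registered by kgateway
      listeners:
        - name: http
          protocol: HTTP
          port: 80
    ---
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: agent-route
    spec:
      parentRefs:
        - name: agent-gw
      rules:
        - matches:
            - path:
                type: PathPrefix
                value: /a2a               # route A2A traffic to the agent backend
          backendRefs:
            - name: agent-service         # hypothetical agent Service
              port: 8080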

Isolated Intelligence: Running AI Workloads in Secure MicroVMs with k0s and KataContainers

AI platforms in regulated sectors like insurance demand not only performance and scalability, but governance, auditability, and trust built directly into the runtime.

This session explores how k0s, a single-binary Kubernetes distribution, combined with Kata Containers provides a minimal yet powerful foundation for governed AI workloads.
By embedding isolation, attestation, and policy enforcement at the cluster core, we demonstrate that governance is not an add-on; it's intrinsic.
Each model or inference workload runs inside a micro-VM, ensuring data isolation and verifiable execution while aligning with emerging CNCF AI Conformance principles.

Attendees will learn how to build reproducible, secure AI environments, from data center to edge, where governance starts at the infrastructure layer, not the application tier.
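A minimal sketch of the runtime wiring (assuming the Kata shim is installed on the nodes, e.g. via kata-deploy, and registered under the handler name kata; the image is hypothetical):

    apiVersion: node.k8s.io/v1
    kind: RuntimeClass
    metadata:
      name: kata
    handler: kata                        # containerd shim provided by Kata Containers
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: model-inference
    spec:
      runtimeClassName: kata             # this Pod boots inside its own micro-VM
      containers:
        - name: model
          image: registry.example.com/model-server:latest   # hypothetical model server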

Takeaway: A practical blueprint for secure, compliant, and observable AI in insurance.

From Edge to 5G Core: Deploying AI-Powered Cloud-Native Network Functions on Kubernetes

Telecom networks are evolving into intelligent, cloud-native platforms, but deploying AI close to subscribers remains challenging. This talk presents a practical blueprint for AI-powered Cloud-Native Network Functions (CNFs) running at the 5G edge using K8s and CNCF open-source tooling. We demonstrate how GPU-accelerated inference, event-driven data paths, and GitOps automation enable ultra-low-latency use cases like RAN optimization, anomaly detection, and subscriber analytics.

We share a reference architecture combining:
- k0s for carrier-grade orchestration
- Nephio for declarative CNF automation
- WASM/KataContainers for secure workload isolation

Attendees will learn operational insights for deploying and scaling AI-driven CNFs across distributed edge clusters, balancing reliability, sustainability, security, and performance.
This session delivers actionable steps for operators preparing for 6G-era intelligent networks.

Production-Grade AI Isolation: k0s + Kata Containers for Zero-Trust Infrastructure

AI platforms in regulated sectors like insurance demand not only performance and scalability, but governance, auditability, and trust built directly into the runtime.

This session explores how k0s, a single-binary Kubernetes distribution, combined with Kata Containers provides a minimal yet powerful foundation for governed AI workloads.
By embedding isolation, attestation, and policy enforcement at the cluster core, we demonstrate that governance is not an add-on; it's intrinsic.
Each model or inference workload runs inside a micro-VM, ensuring data isolation and verifiable execution while aligning with emerging CNCF AI Conformance principles.

Attendees will learn how to build reproducible, secure AI environments, from data center to edge, where governance starts at the infrastructure layer, not the application tier.

Takeaway: A practical blueprint for secure, compliant, and observable AI in insurance.

Portable AI by Design: Reproducible Workloads with k0s, KAITO, KitOps and ClusterAPI

Manufacturing is becoming data-driven, connected, and intelligent.

From visual defect detection to predictive maintenance, AI models now run on factory floors, inspection lines, and embedded edge devices.

Yet most of these systems are fragmented: each site runs different versions, lacks governance, and struggles to reproduce results when models or infrastructure change.

This session demonstrates how a portable, governed AI platform built with k0s, KAITO, KitOps, and ClusterAPI brings consistency and compliance to industrial AI.

Using k0s for lightweight Kubernetes at the edge, KAITO for model lifecycle automation, KitOps for signed model artifacts, and ClusterAPI for standardized cluster provisioning, we show how factories can securely deploy and update vision and sensor models anywhere: on-prem, in the cloud, or at the edge, with complete traceability and reproducibility.
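For instance, KitOps describes a signed model artifact with a Kitfile; a hedged sketch under recent KitOps conventions (the model and dataset paths are hypothetical) looks like:

    manifestVersion: "1.0"
    package:
      name: defect-detector
      version: 1.0.0
      description: Visual defect detection model for the inspection line
    model:
      name: defect-detector
      path: ./model.onnx                # hypothetical ONNX vision model
      framework: onnx
    datasets:
      - name: inspection-samples
        path: ./data/samples            # hypothetical evaluation set

Packing and pushing the artifact (e.g. kit pack and kit push) then gives every site the same verifiable model version.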

Outcome: A blueprint for governed, reproducible AI manufacturing at scale.

From Zero to AI Cluster: Instant AI platform and pgvector Provisioning with k0rdent

If you’ve ever tried setting up an AI platform with GPU acceleration and a vector database, you know it’s a complex, time-consuming process. From configuring GPU operators to provisioning PostgreSQL with pgvector, most teams spend days or weeks before running their first AI workload.

But what if you could go from zero to a fully operational, GPU-enabled AI cluster in just minutes? Enter k0rdent—an open platform that simplifies Kubernetes-based infrastructure and provides instant service templates for GPU operators, AI models, and pgvector-enabled PostgreSQL.

In this talk, we’ll walk through how k0rdent automates AI infrastructure provisioning across clusters. You’ll see how to:

- Instantly deploy NVIDIA or AMD GPU operators
- Provision pgvector-enabled PostgreSQL for AI search and retrieval
- Launch an inference service with just a few GitOps commits

By the end, you’ll understand how to transform traditional multi-day setups into a fast, repeatable process that empowers teams to focus on building AI solutions rather than managing infrastructure.
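As an illustrative, not verbatim, sketch of the template-driven flow (the template names below are placeholders and the exact schema varies by k0rdent release), a MultiClusterService can match clusters by label and roll out the GPU operator and a pgvector-enabled PostgreSQL in one shot:

    apiVersion: k0rdent.mirantis.com/v1alpha1
    kind: MultiClusterService
    metadata:
      name: ai-stack
    spec:
      clusterSelector:
        matchLabels:
          gpu: "true"                    # target only GPU-labelled clusters
      serviceSpec:
        services:
          - template: gpu-operator       # placeholder template name
            name: gpu-operator
            namespace: gpu-operator
          - template: postgres-pgvector  # placeholder pgvector-enabled PostgreSQL template
            name: vector-db
            namespace: data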

Scaling K8s Everywhere: Introducing k0rdent, Your Open-Source Platform Engineering Super Controller

In modern platform engineering, managing fleets of Kubernetes clusters across clouds, on-premises datacenters, and edge devices presents operational sprawl, inconsistent tooling, and lock-in challenges. k0rdent is the first fully open-source Distributed Container Management Environment (DCME) that transforms this complexity into a declarative, Kubernetes-native control plane. Platform architects can design and operate developer and workload platforms anywhere, at scale—with zero lock-in and 100% open source.

In this session, you’ll learn how k0rdent leverages Kubernetes standards (ClusterAPI, CRDs, GitOps) to provide a single pane of glass for multi-cluster lifecycle management, service composition, and infrastructure automation. We’ll dive into k0rdent’s modular architecture, walk through a live demo of provisioning clusters and deploying services across heterogeneous environments, and explore how the community can contribute to its rapidly growing ecosystem.

Security-First AI Platforms: Zero-Trust and Policy Enforcement with k0rdent

Security is often an afterthought in AI infrastructure, leading to data leaks, insecure GPU usage, and unmonitored model access. With AI workloads spanning clouds and clusters, enforcing policies consistently is a major challenge.

In this talk, we’ll demonstrate how k0rdent brings zero-trust principles and policy-based governance to AI platforms. Using Kyverno or OPA integrated via k0rdent’s service templates, we’ll showcase:

1. Automated Policy Deployment (Kyverno / OPA)
2. Workload Isolation
3. Multi-Tenant AI with Fine-Grained RBAC
4. Continuous Compliance and Auditing

Result:

1. Zero-trust communication, with no implicit trust between workloads.
2. Consistent security policies deployed across all clusters.
3. GPU usage locked down, preventing resource abuse (see the sketch after this list).
4. Multi-tenant-ready AI environments with strong RBAC.
5. Compliance status visible in real time, with automatic remediation.
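As one concrete illustration (the rule content is an example, not a prescribed baseline), a Kyverno policy can cap per-container GPU requests so a single tenant cannot monopolize accelerators:

    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: limit-gpu-requests
    spec:
      validationFailureAction: Enforce
      rules:
        - name: cap-gpu-per-container
          match:
            any:
              - resources:
                  kinds:
                    - Pod
          validate:
            message: "Containers may request at most one GPU."
            pattern:
              spec:
                containers:
                  - resources:
                      limits:
                        =(nvidia.com/gpu): "<=1"   # optional-field anchor with a cap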

Solving Distributed AI at the Edge with Deferred Inference Using k0s and Cloud GPU Acceleration

Performing sophisticated object detection on constrained edge devices may seem daunting, but with the right design, it becomes a powerful distributed AI solution. Using k0s, the lightweight CNCF-certified distribution, and a deferred inference pipeline powered by YOLOv8, this project tackles the challenges of capturing and processing video frames across heterogeneous environments.
Leveraging k0s’s minimal resource footprint and streamlined orchestration, combined with a cloud GPU inference service, our architecture offloads intensive workloads from edge devices. A Go-based frame capturer reliably transmits video frames over HTTPS to GPU instances for near real-time detection under bandwidth-constrained conditions. A web-based visualization layer then aggregates and displays inference results in real time.
Learn about implementing deferred inference pipelines with YOLOv8, orchestrating containerized workloads using k0s, optimizing GPU utilization for cost efficiency, and achieving low-latency edge processing. See how this architecture brings state-of-the-art computer vision to resource-limited scenarios, opening new possibilities for distributed AI deployments at scale.
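To make the shape of the pipeline concrete (the image name and the endpoint variable are hypothetical), the edge side reduces to a small Deployment that ships frames to the remote GPU service:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frame-capturer
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: frame-capturer
      template:
        metadata:
          labels:
            app: frame-capturer
        spec:
          containers:
            - name: capturer
              image: registry.example.com/frame-capturer:latest   # hypothetical Go capturer
              env:
                - name: INFERENCE_URL                    # remote YOLOv8 GPU endpoint
                  value: https://gpu.example.com/predict
              resources:
                limits:
                  memory: 64Mi                           # fits constrained edge hardware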

Solving Real-World Edge Challenges With k0s, NATS, and Raspberry Pi Clusters

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.
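A hedged sketch of the bootstrap side using k0sctl's cluster configuration (addresses and users are placeholders):

    apiVersion: k0sctl.k0sproject.io/v1beta1
    kind: Cluster
    metadata:
      name: reef-monitor
    spec:
      hosts:
        - role: controller
          ssh:
            address: 10.0.0.10           # placeholder control-plane node
            user: pi
            keyPath: ~/.ssh/id_rsa
        - role: worker
          ssh:
            address: 10.0.0.11           # placeholder Raspberry Pi worker
            user: pi
            keyPath: ~/.ssh/id_rsa

Running k0sctl apply --config k0sctl.yaml against this file brings the whole Pi cluster up in one pass.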

Taming the Hydra: Orchestrating Multi-Cloud DeFi with k0rdent

DeFi was built on the promise of decentralization, yet most protocols still rely heavily on centralized cloud infrastructure—creating single points of failure, censorship risk, and operational bottlenecks. This talk presents a radical shift: using k0rdent, an open-source control plane, to deploy DeFi nodes, AI agents, and oracles across a decentralized, multi-cloud, and edge-native infrastructure. Discover how to build fault-tolerant, censorship-resistant networks that span AWS, GCP, Azure, and bare metal—without vendor lock-in. We’ll showcase how GitOps, policy automation, and AI-assisted governance enable self-healing, compliant, and scalable multi-chain DeFi environments—bridging infrastructure and ideals to reach the next wave of global adoption.

Tiny Kubernetes, Big Impact: k0s for Edge Deployments

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.

This is a real project built with CNCF applications, based in Mauritius, with over 1200 Raspberry Pis deployed 1 to 2 kilometers off the coast in the ocean, running k0s and NATS.
I would be grateful for the opportunity to showcase this project, to demonstrate the impact of open-source software on large-scale deployments that are helping to monitor and rebuild the marine ecosystem.

Taming the Starship Kraken: Zero-Downtime Vertical Scaling with Kubernetes 1.33

Juggling CPU and memory allocations for stateful services used to feel like taming a zero-gravity kraken aboard a starship - one tentacle-slip and the whole mission derails. Kubernetes 1.33’s In-Place Pod Vertical Scaling transforms that chaos into a choreographed ballet: you patch your Pod’s resources and the kubelet morphs cgroups on the fly - no eviction, no restart. With the resize subresource and mutable resource fields, platform teams amp up CPU and RAM for heavy inference bursts or dial them back to reclaim idle capacity.

In our tests, we banished 100% of pod restarts for vertical changes, slashed over-provisioned memory by 30%, and achieved sub-500 ms resize operations under warp-speed load. One API call, zero disruptions - warm caches stay toasty like a dragon’s hoard, persistent connections hum like hyperspace drives, and SLOs glow green in the control panel. It’s resource orchestration without risk: a sleek starship crew, not a space circus.
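A minimal sketch of the mechanics (Pod name, image, and sizes are illustrative): the Pod declares per-resource resize policies, and a later patch against the resize subresource changes its allocation in place:

    apiVersion: v1
    kind: Pod
    metadata:
      name: stateful-cache
    spec:
      containers:
        - name: app
          image: registry.example.com/cache:latest   # hypothetical stateful service
          resizePolicy:
            - resourceName: cpu
              restartPolicy: NotRequired             # resize CPU without a restart
            - resourceName: memory
              restartPolicy: NotRequired
          resources:
            requests:
              cpu: "1"
              memory: 2Gi

    # Later, with a recent kubectl, resize in place:
    # kubectl patch pod stateful-cache --subresource resize \
    #   --patch '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"2"}}}]}}'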

Zero-Touch AI Infrastructure for R in Pharma: k0rdent in Action

Deploying AI and analytics platforms in Pharma shouldn’t take weeks of DevOps setup. This talk introduces how k0rdent, an open-source control plane by Mirantis, automates the provisioning of secure, scalable Kubernetes clusters—perfect for hosting Shiny apps, R/Stan models, and Python/LLM workflows in clinical and research environments. We’ll demo how small teams can spin up reproducible R environments, deploy interactive Shiny dashboards, and run AI/LLM-powered drug discovery workflows—all with minimal effort. Learn how k0rdent simplifies cloud-native infrastructure for Pharma data science, enabling faster experimentation, easier collaboration, and high-trust environments for regulated workflows. Ideal for biotechs, academic labs, and open science efforts looking to scale R use without full DevOps teams.

CNCF-hosted Co-located Events North America 2025 Sessionize Event

November 2025 Atlanta, Georgia, United States

KubeCon + CloudNativeCon Europe 2025 Sessionize Event

April 2025 London, United Kingdom
