Speaker

Prashant Ramhit

Mirantis - Snr DevOps & QA

Dubai, United Arab Emirates

Prashant is a seasoned technologist with over two decades of experience, specializing in cloud-native systems, DevOps, and platform engineering. Currently based in Mauritius, he began his career as a Linux System Administrator in the late 1990s and advanced into roles such as SRE, Golang developer, and later, Product Manager and Project Manager. Prashant has contributed to global-scale systems while working at renowned organizations like the BBC and Netflix. An MSc graduate from Portsmouth University, UK, he combines academic knowledge with practical expertise to design cutting-edge infrastructure solutions.

With extensive experience in leading cross-functional teams, managing complex projects, and aligning technical and business goals, Prashant thrives at the intersection of technology and strategy. Outside of work, he enjoys contributing to open-source communities and cultivating vegetables on his aquaponic farm, embracing the serene island life in Mauritius.

Area of Expertise

  • Information & Communications Technology

Topics

  • Kubernetes
  • Cloud Native & Kubernetes
  • Artificial Intelligence
  • Machine Learning/Artificial Intelligence
  • Platform Engineering
  • DevOps

Tiny Kubernetes, Big Impact: k0s for Edge Deployments

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.

This is a real project built with CNCF applications, based in Mauritius, with over 1,200 Raspberry Pi devices deployed one to two kilometers off the coast in the ocean, running k0s and NATS.
We would be grateful for the opportunity to showcase this project, to demonstrate the impact of open-source applications on large-scale deployments and how they are helping to monitor and rebuild the marine ecosystem.
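
To give a flavor of the data pipeline, here is a minimal sketch of the sensor-publishing side in Go using the nats.go client. The NATS URL, subject name, and readSalinity helper are illustrative assumptions, not the project’s actual code.

    package main

    import (
        "log"
        "time"

        "github.com/nats-io/nats.go"
    )

    // readSalinity is a hypothetical stand-in for a real sensor driver.
    func readSalinity() []byte {
        return []byte(`{"salinity_ppt": 35.1}`)
    }

    func main() {
        // Connect to a NATS server reachable from the Raspberry Pi cluster
        // (the URL is assumed for illustration).
        nc, err := nats.Connect("nats://control-plane.local:4222")
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        // Publish one reading per second on an illustrative subject;
        // downstream consumers aggregate readings per reef segment.
        for range time.Tick(time.Second) {
            if err := nc.Publish("reef.sensors.salinity", readSalinity()); err != nil {
                log.Println("publish failed:", err)
            }
        }
    }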

Scaling K8s Everywhere: Introducing k0rdent, Your Open Source Platform Engineering Super Controller

In modern platform engineering, managing fleets of Kubernetes clusters across clouds, on-premises datacenters, and edge devices creates operational sprawl, inconsistent tooling, and lock-in challenges. k0rdent is the first fully open-source Distributed Container Management Environment (DCME) that transforms this complexity into a declarative, Kubernetes-native control plane. Platform architects can design and operate developer and workload platforms anywhere, at scale, with zero lock-in and 100% open source.

In this session, you’ll learn how k0rdent leverages Kubernetes standards (Cluster API, CRDs, GitOps) to provide a single pane of glass for multi-cluster lifecycle management, service composition, and infrastructure automation. We’ll dive into k0rdent’s modular architecture, walk through a live demo of provisioning clusters and deploying services across heterogeneous environments, and explore how the community can contribute to its rapidly growing ecosystem.
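
To make the declarative model concrete, the sketch below submits a cluster object with client-go’s dynamic client. The ClusterDeployment kind, its group/version, and the template, credential, and config values are assumptions based on k0rdent’s published examples and may differ between releases; treat this as a sketch, not the definitive API.

    package main

    import (
        "context"
        "log"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
        "k8s.io/apimachinery/pkg/runtime/schema"
        "k8s.io/client-go/dynamic"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            log.Fatal(err)
        }
        dyn, err := dynamic.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }

        // GroupVersionResource for k0rdent cluster deployments (assumed).
        gvr := schema.GroupVersionResource{
            Group:    "k0rdent.mirantis.com",
            Version:  "v1alpha1",
            Resource: "clusterdeployments",
        }

        cluster := &unstructured.Unstructured{Object: map[string]interface{}{
            "apiVersion": "k0rdent.mirantis.com/v1alpha1",
            "kind":       "ClusterDeployment",
            "metadata": map[string]interface{}{
                "name":      "edge-eu-west",
                "namespace": "kcm-system",
            },
            "spec": map[string]interface{}{
                // Template and credential names are illustrative placeholders.
                "template":   "aws-standalone-cp",
                "credential": "aws-credential",
                "config": map[string]interface{}{
                    "region":        "eu-west-2",
                    "workersNumber": 3,
                },
            },
        }}

        if _, err := dyn.Resource(gvr).Namespace("kcm-system").Create(
            context.TODO(), cluster, metav1.CreateOptions{}); err != nil {
            log.Fatal(err)
        }
        log.Println("ClusterDeployment submitted; k0rdent reconciles it via Cluster API")
    }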

Solving Distributed AI at the Edge with Deferred Inference Using k0s and Cloud GPU Acceleration

Performing sophisticated object detection on constrained edge devices may seem daunting, but with the right design, it becomes a powerful distributed AI solution. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and a deferred inference pipeline powered by YOLOv8, this project tackles the challenges of capturing and processing video frames across heterogeneous environments.

Leveraging k0s’s minimal resource footprint and streamlined orchestration, combined with a cloud GPU inference service, our architecture offloads intensive workloads from edge devices. A Go-based frame capturer reliably transmits video frames over HTTPS to GPU instances for near real-time detection under bandwidth-constrained conditions. A web-based visualization layer then aggregates and displays inference results in real time.

Learn about implementing deferred inference pipelines with YOLOv8, orchestrating containerized workloads using k0s, optimizing GPU utilization for cost efficiency, and achieving low-latency edge processing. See how this architecture brings state-of-the-art computer vision to resource-limited scenarios, opening new possibilities for distributed AI deployments at scale.
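
A stripped-down version of that capture loop might look like the following Go sketch; the inference endpoint URL, the file-based frame source, and the retry behavior are illustrative assumptions rather than the project’s actual code.

    package main

    import (
        "bytes"
        "log"
        "net/http"
        "os"
        "time"
    )

    // inferURL is an illustrative endpoint for the cloud GPU inference service.
    const inferURL = "https://gpu-inference.example.com/v1/detect"

    func main() {
        client := &http.Client{Timeout: 10 * time.Second}

        for range time.Tick(500 * time.Millisecond) {
            // A real capturer would wrap a camera driver; reading a file
            // keeps this sketch self-contained.
            frame, err := os.ReadFile("/tmp/latest-frame.jpg")
            if err != nil {
                log.Println("capture failed:", err)
                continue
            }
            resp, err := client.Post(inferURL, "image/jpeg", bytes.NewReader(frame))
            if err != nil {
                // Deferred inference: on network failure, a frame could be
                // queued locally and retried instead of blocking capture.
                log.Println("upload failed:", err)
                continue
            }
            resp.Body.Close() // detections are read elsewhere by the visualization layer
        }
    }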

From Chaos to Control: Herding GPU-Hungry Dragons with Kubernetes DRA

Balancing AI training, real-time inference, and bursty batch jobs on a shared accelerator fleet used to feel like herding caffeinated dragons. Kubernetes 1.33’s Dynamic Resource Allocation (DRA) turns that chaos into choreography: Pods state exactly what accelerator slice they need, and the scheduler guarantees where it will run, long before a container starts. With the new partitionable-device, prioritized-list, and device-taint feature gates, platform teams carve up GPUs, FPGAs, or SmartNICs on demand: no nvidia-smi incantations, no “GPU not found” crash loops. On GCP we slashed idle GPU hours by 42%, shrinking datacenter spend while giving each tenant iron-clad isolation via namespace-scoped claims. Dev namespaces grab bite-size slices for rapid prototyping; prod jobs scale to full-fat allocations using the same YAML, with zero redeploys. Observability hooks keep SLO dashboards glowing. One control plane, two QoS tiers, dragons tamed.
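
As a rough illustration of the claim model, this Go sketch renders a DRA ResourceClaim requesting a single GPU device. The resource.k8s.io/v1beta1 API group and the device class name are assumptions that depend on the cluster version and the installed DRA driver.

    package main

    import (
        "fmt"

        resourcev1beta1 "k8s.io/api/resource/v1beta1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "sigs.k8s.io/yaml"
    )

    func main() {
        // A ResourceClaim asks the scheduler for an accelerator (or a slice
        // of one) before any container starts; Pods then reference the claim
        // in spec.resourceClaims.
        claim := resourcev1beta1.ResourceClaim{
            TypeMeta: metav1.TypeMeta{
                APIVersion: "resource.k8s.io/v1beta1",
                Kind:       "ResourceClaim",
            },
            ObjectMeta: metav1.ObjectMeta{Name: "proto-gpu-slice", Namespace: "dev"},
            Spec: resourcev1beta1.ResourceClaimSpec{
                Devices: resourcev1beta1.DeviceClaim{
                    Requests: []resourcev1beta1.DeviceRequest{{
                        Name:            "gpu",
                        DeviceClassName: "gpu.nvidia.com", // assumed device class
                    }},
                },
            },
        }

        out, err := yaml.Marshal(claim)
        if err != nil {
            panic(err)
        }
        fmt.Print(string(out))
    }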

Dynamic GPU Autoscaling: Leveraging KServe and NVIDIA DCGM for Cost-Efficient Scaling

Implementing dynamic GPU autoscaling for deferred inference may seem daunting, but with the right approach, it becomes a powerful way to boost performance while containing costs. By leveraging KServe or KEDA for serverless ML deployment and NVIDIA’s DCGM metrics, this system scales GPU resources in real time based on actual utilization rather than simple request counts. A custom metrics adapter feeds DCGM_FI_DEV_GPU_UTIL data into Kubernetes’ Horizontal Pod Autoscaler (HPA), ensuring GPU capacity matches computational needs.

Asynchronous prediction endpoints, coupled with scaling algorithms that factor in memory usage, compute load, and latency, deliver near-optimal resource allocation for complex workloads like object detection. This talk explores the technical steps behind utilization-based autoscaling with KServe or KEDA, including monitoring, alerting, and performance tuning.

Real-world benchmarks from production show up to 40% GPU cost savings without compromising inference speed or accuracy. Attendees will learn practical methods for bridging ML frameworks and infrastructure, making cloud GPU-accelerated ML more accessible and efficient in modern cloud-native environments.
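
As a hedged sketch of that wiring, the Go program below renders an HPA that scales a deployment on the DCGM utilization metric, assuming a custom-metrics adapter already exposes DCGM_FI_DEV_GPU_UTIL as a per-pod metric; the target deployment name, replica bounds, and the 70% threshold are illustrative.

    package main

    import (
        "fmt"

        autoscalingv2 "k8s.io/api/autoscaling/v2"
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "sigs.k8s.io/yaml"
    )

    func main() {
        minReplicas := int32(1)

        hpa := autoscalingv2.HorizontalPodAutoscaler{
            TypeMeta: metav1.TypeMeta{
                APIVersion: "autoscaling/v2",
                Kind:       "HorizontalPodAutoscaler",
            },
            ObjectMeta: metav1.ObjectMeta{Name: "yolo-inference-gpu"},
            Spec: autoscalingv2.HorizontalPodAutoscalerSpec{
                ScaleTargetRef: autoscalingv2.CrossVersionObjectReference{
                    APIVersion: "apps/v1",
                    Kind:       "Deployment",
                    Name:       "yolo-inference", // assumed target deployment
                },
                MinReplicas: &minReplicas,
                MaxReplicas: 8,
                Metrics: []autoscalingv2.MetricSpec{{
                    Type: autoscalingv2.PodsMetricSourceType,
                    Pods: &autoscalingv2.PodsMetricSource{
                        // Exposed via a custom-metrics adapter over DCGM exporter data.
                        Metric: autoscalingv2.MetricIdentifier{Name: "DCGM_FI_DEV_GPU_UTIL"},
                        Target: autoscalingv2.MetricTarget{
                            // Scale out when average GPU utilization passes 70%.
                            Type:         autoscalingv2.AverageValueMetricType,
                            AverageValue: resource.NewQuantity(70, resource.DecimalSI),
                        },
                    },
                }},
            },
        }

        out, err := yaml.Marshal(hpa)
        if err != nil {
            panic(err)
        }
        fmt.Print(string(out))
    }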

Solving Real-World Edge Challenges With k0s, NATS, and Raspberry Pi Clusters

Monitoring sea algae proliferation and coral growth in real time may seem daunting, but with the right tools, it becomes an exciting edge computing project. Using k0s, the lightweight CNCF-certified Kubernetes distribution, and NATS, the connective technology for edge computing, this project solved the challenges of data collection and processing in a distributed Raspberry Pi cluster.

Leveraging k0s’s minimal resource footprint and automated scaling, paired with NATS’s efficient messaging capabilities, the project enabled real-time sensor data collection and transmission under resource-constrained conditions. Dynamically bootstrapped Raspberry Pi clusters processed data locally while integrating with a central control plane.

Learn about dynamically bootstrapping Raspberry Pi clusters with k0s, managing distributed edge clusters, deploying NATS for scalable messaging, and scaling workloads based on environmental changes. See how k0s and NATS efficiently tackle real-world challenges.
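
Since this abstract revisits the same pipeline, here is the complementary consumer side as a minimal Go sketch: a synchronous NATS subscription aggregating sensor readings at the control plane. The URL and subject hierarchy are illustrative assumptions.

    package main

    import (
        "log"

        "github.com/nats-io/nats.go"
    )

    func main() {
        // Control-plane side of the pipeline: consume readings published
        // by the Raspberry Pi nodes (the URL is assumed for illustration).
        nc, err := nats.Connect("nats://control-plane.local:4222")
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        // A wildcard subscription picks up every sensor type under reef.sensors.
        sub, err := nc.SubscribeSync("reef.sensors.*")
        if err != nil {
            log.Fatal(err)
        }
        for {
            msg, err := sub.NextMsg(nats.DefaultTimeout)
            if err != nil {
                continue // timeout with no data; keep polling
            }
            log.Printf("%s: %s", msg.Subject, msg.Data)
        }
    }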

KubeCon + CloudNativeCon Europe 2025 Sessionize Event

April 2025 London, United Kingdom
