Speaker

Arya Soni

DevOps & SRE | Kubernetes & Multi-Cloud Architect (AWS/GCP) | Reduced Cloud Costs by 40% | Infrastructure as Code (Terraform) | CI/CD | MLOps

Gurugram, India

I’m a DevOps & SRE engineer and the founder of Logify360, an AI-first observability SaaS designed to simplify log analysis with smart, plain-English queries.
My background is in architecting high-availability Kubernetes systems (99.99% uptime at Games Pro) and driving FinOps via observability (40% cost reduction at Cashgrail). I love discussing the Prometheus/Grafana/Loki stack, but I’m currently obsessed with how GenAI can make observability data more accessible. Catch me to chat about K8s, cost optimisation, or the future of 'conversational' logging.

Area of Expertise

  • Information & Communications Technology

Topics

  • DevOps Transformation
  • DevOps Agile Methodology & Culture
  • Migrating to DevOps
  • Product Management
  • DevOps
  • Cloud & DevOps
  • DevOps & Automation
  • DevOpsCulture
  • DevOps Skills
  • DevOps Journey
  • Entrepreneurship
  • Entrepreneur
  • Cloud Native
  • Cloud Security
  • Cloud Computing
  • Cloud & Infrastructure
  • Cloud Technology
  • Cloud App Security
  • Cloud Native Infrastructure
  • Cloud Containers and Infrastructure
  • Cloud Automation
  • Cloud strategy
  • Database and Cloud
  • Cloud
  • Cloud Architecture

From 1TB to 1PB: Scaling GDPR-Native Pipelines with Argo Workflows

Scaling data processing from terabytes to petabytes is a technical challenge; doing it while strictly adhering to GDPR is an architectural one. In this session, I will share the evolution of our sovereign data platform, detailing how we transitioned from simple batch jobs to a complex, event-driven architecture that processes petabytes of data without leaving European soil. We will dive into the design of our "Compliance Car Wash", a mandatory ingestion pattern where every data stream is automatically scanned, masked, and anonymised by Argo Workflows before hitting persistent storage. Learn the hard-won lessons of scaling storage, optimising Argo for high throughput, and baking compliance directly into the infrastructure code, ensuring that as your data grows, your risk profile doesn't.
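
As a rough illustration of the scan-and-mask step described above, the following is a minimal Python sketch of what a single "wash" task might do to one record before it reaches persistent storage. It is not the platform's actual implementation: the function names, the `user_id` field, the email pattern, and the salt handling are all assumptions made for the example. Note also that a salted hash is pseudonymisation rather than full anonymisation.

```python
import hashlib
import re

# Hypothetical pattern: a deliberately simple email matcher for illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymise(value: str, salt: str) -> str:
    """Replace an identifier with a truncated, salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def wash_record(record: dict, salt: str, id_fields=("user_id",)) -> dict:
    """Return a copy with ID fields pseudonymised and emails masked in strings."""
    clean = {}
    for key, value in record.items():
        if key in id_fields and value is not None:
            clean[key] = pseudonymise(str(value), salt)
        elif isinstance(value, str):
            clean[key] = EMAIL_RE.sub("[EMAIL]", value)
        else:
            clean[key] = value
    return clean


if __name__ == "__main__":
    raw = {"user_id": 42, "note": "mail bob@example.com now", "n": 3}
    print(wash_record(raw, salt="demo-salt"))
```

In a pipeline of this kind, a function like this would presumably run inside a containerised workflow step, so that only the washed output is ever written to storage.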

Data Stays Here: Processing Petabytes with Sovereign Argo Pipelines

As organisations scale data processing to petabyte levels, reliance on hyperscaler-managed services often creates data residency risks. In this session, I will demonstrate how to architect high-scale data processing pipelines using Argo Workflows on sovereign infrastructure. We will cover architectural patterns for decoupling storage from compute, managing ephemeral processing nodes without vendor lock-in, and ensuring that sensitive data never leaves your sovereign boundary during transformation. Attendees will leave with a blueprint for building performant, compliant, and completely independent data platforms.

Bulletproof DNS: Achieving High Availability for Node-Local DNS with eBPF

While NodeLocal DNS is fantastic for reducing latency and solving those frustrating conntrack race conditions, ensuring it stays Highly Available (HA) is a challenge. Traditional approaches using `iptables` or `IPVS` often force us into fragile workarounds—requiring secondary nameservers, pod restarts, or external health checkers just to handle a failover.

In this talk, I will demonstrate a cleaner, more robust way to achieve HA using eBPF. We’ll dive into "Enhanced Service Redirection", utilising cgroup eBPF to transparently rewrite `kube-dns` traffic to the NodeLocal cache at the system call level.

Beyond the Monolith: Building Disaggregated LLM Serving Pipelines on K8s

The standard monolithic pattern for deploying LLMs on Kubernetes is hitting a breaking point. As context windows expand, the resource conflict between the compute-intensive "Prefill" phase and the memory-bound "Decode" phase destroys performance and dramatically inflates cloud costs. This session explores the architecture of prefill/decode disaggregation, a method that dismantles the single-pod model in favour of a specialised pipeline where prompt processing and token generation scale independently.

We will take a deep dive into implementing this architecture using LMCache to create a high-speed, shared KV store across your cluster. We will cover the engineering realities of designing separate node pools with multi-tier storage and of solving the network physics involved in moving gigabytes of context data between pods faster than an H100 can recompute it. Attendees will gain a blueprint for a persistence layer that achieves 5x throughput gains by reusing computation rather than repeating it.
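
To give a sense of the "gigabytes of context data" involved, here is a back-of-the-envelope Python sketch that estimates KV cache size and transfer time. The model shape (32 layers, 8 KV heads, head dimension 128, 2-byte values), the 32,768-token context, and the 100 Gbit/s link are illustrative assumptions, not figures from the session, and the transfer estimate ignores protocol overhead.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, dtype_bytes=2):
    """Size of the KV cache for one sequence.

    The leading factor of 2 accounts for one key tensor and one value
    tensor per layer.
    """
    return 2 * layers * kv_heads * head_dim * dtype_bytes * tokens


def transfer_seconds(num_bytes, link_gbit_per_s):
    """Idealised time to move num_bytes over a link of the given line rate."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return num_bytes / bytes_per_second


if __name__ == "__main__":
    # Hypothetical model shape and context length, chosen for illustration.
    size = kv_cache_bytes(tokens=32_768, layers=32, kv_heads=8, head_dim=128)
    print(f"KV cache: {size / 2**30:.1f} GiB")                   # KV cache: 4.0 GiB
    print(f"Transfer: {transfer_seconds(size, 100):.2f} s")      # Transfer: 0.34 s
```

Under these assumptions, a single long-context request carries about 4 GiB of KV state, which is why the comparison between transfer time and prefill recompute time drives the design.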

Navigating the Hybrid and Multi-Cloud Labyrinth with Kubernetes

This session takes an in-depth look at the challenges of deploying and operating Kubernetes across hybrid and multi-cloud environments. We will examine what it takes to integrate Kubernetes with several cloud platforms and on-premises data centres, covering cloud integration, unified administration, networking, data management, security, and cost optimisation. The goal is to give participants strategies for managing these environments cohesively, with an emphasis on standardised deployment and security procedures. We will address networking challenges such as cross-environment communication and service discovery, and discuss data management and stateful applications in distributed systems. The session also covers security and compliance concerns, including data privacy and identity management, as well as cost management approaches for allocating resources effectively across hybrid and multi-cloud systems. It is aimed at professionals looking for in-depth knowledge of Kubernetes management in complex cloud environments.

Monitoring Mastery: Advanced Strategies for Canary Deployments with Karpenter

In this session, we will take a deep dive into monitoring canary deployments on Karpenter-managed Kubernetes clusters. First, we will discuss how to use Prometheus to collect canary metrics and Grafana to visualise them. We’ll walk through simulating traffic and tracking canary metrics, such as error rates and traffic splits, for gold-standard comparisons. We will also cover creating tailored alerts that make the decision to revert a canary faster and more efficient, provided the right evaluation window is selected. On a slightly different note, we will look at Karpenter’s dynamic node management and its effects on monitoring, along with alternative strategies for maintaining consistency across node pools. To wrap up, we will showcase several real-life scenarios that show how these tools support data-driven decisions and improve both the deployment process and the reliability of the application.
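
The error-rate and traffic-split comparison mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 2x ratio threshold, the minimum-request guard, and the error-rate floor are hypothetical choices, and in practice the request and error counts would come from a metrics backend rather than literals.

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of failed requests; 0.0 when there is no traffic."""
    return errors / requests if requests else 0.0


def traffic_split(canary_requests: int, total_requests: int) -> float:
    """Share of all traffic that reached the canary."""
    return canary_requests / total_requests if total_requests else 0.0


def should_rollback(baseline, canary, max_ratio=2.0, min_requests=100,
                    rate_floor=0.001):
    """Decide whether a canary should be reverted.

    baseline and canary are (errors, requests) tuples measured over the
    same evaluation window. rate_floor keeps a near-zero baseline from
    making the relative threshold meaninglessly tight.
    """
    b_err, b_req = baseline
    c_err, c_req = canary
    if c_req < min_requests:
        return False  # not enough canary traffic to judge yet
    threshold = max(error_rate(b_err, b_req), rate_floor) * max_ratio
    return error_rate(c_err, c_req) > threshold


if __name__ == "__main__":
    baseline, canary = (10, 10_000), (5, 500)
    print(traffic_split(canary[1], baseline[1] + canary[1]))
    print(should_rollback(baseline, canary))
```

The minimum-request guard reflects the point about choosing the right time period: a window that is too short yields too few canary requests for the comparison to mean anything.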

Data Gravity and Kubernetes: Managing Large-Scale Data Ingest with Minimal Latency

Kubernetes environments face unique challenges from data gravity, particularly during large-scale data ingest across APIs. This presentation explores newer ways to overcome these challenges and reduce latency, such as local storage layer optimisations, edge computing integration, and network efficiencies. Participants will learn ways to reduce data transfer costs, increase data transfer rates, and improve data storage characteristics without sacrificing the system's scalability. Many of the examples are drawn from real situations, which will help the audience apply these techniques effectively in complex, real-life Kubernetes environments.

CNCF-hosted Co-located Events Europe 2026 Sessionize Event Upcoming

March 2026 Amsterdam, The Netherlands

devopsdays Atlanta 2025 Sessionize Event

April 2025 Atlanta, Georgia, United States

KubeCon + CloudNativeCon Europe 2025 Sessionize Event

April 2025 London, United Kingdom

KubeHuddle Toronto 2024 Sessionize Event

May 2024 Toronto, Canada
