Speaker

Amit Kumar

Senior Platform and DevOps Engineer

Chandigarh, India

I am a Senior Platform and DevOps Engineer with 5+ years of experience, specializing in Kubernetes, cloud-native architectures, and AI/ML infrastructure. I have implemented scalable, automated, and resilient systems using Terraform, ArgoCD, Crossplane, and Kubernetes. My expertise includes GitOps, CI/CD, AI/ML pipelines, and security best practices. Passionate about automation and cloud-native solutions, I focus on disaster recovery and AI/ML platform engineering.

Area of Expertise

  • Information & Communications Technology
  • Transports & Logistics

Topics

  • AI and Cybersecurity
  • Artificial Intelligence (AI)
  • AI & ML Solutions
  • Container and Kubernetes security

OpsAI: Incident Investigation, Reimagined with AI Agents

Every incident follows the same pattern. Alerts fire, you open four terminals, correlate logs with recent deployments, check cluster state, dig through git history, and slowly piece together what went wrong. The tools are good. The process is exhausting.

OpsAI is a multi-agent AI system we built to tackle this. It investigates incidents by pulling evidence from logs, Kubernetes state, and git repositories, then produces answers where every claim is tied to a real source. No hallucinated pod names. No invented timelines. Every assertion points back to a log line, a commit, or a cluster object.
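One way to picture citation as a hard constraint is a data model in which a claim cannot exist without evidence. The sketch below is illustrative only and is not OpsAI's actual code: the `SourceKind`, `Evidence`, `Claim`, and `render` names, and the format of the references, are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SourceKind(Enum):
    """The three kinds of sources named in the abstract."""
    LOG_LINE = "log_line"
    COMMIT = "commit"
    CLUSTER_OBJECT = "cluster_object"


@dataclass(frozen=True)
class Evidence:
    kind: SourceKind
    ref: str      # hypothetical reference format, e.g. a commit SHA or object name
    excerpt: str  # the raw text the claim relies on


@dataclass(frozen=True)
class Claim:
    text: str
    evidence: tuple[Evidence, ...]

    def __post_init__(self) -> None:
        # The constraint lives in the type: an uncited claim cannot be constructed.
        if not self.evidence:
            raise ValueError(f"uncited claim rejected: {self.text!r}")


def render(claims: list[Claim]) -> str:
    """Format each claim followed by its citations."""
    lines = []
    for claim in claims:
        cites = ", ".join(f"[{e.kind.value}:{e.ref}]" for e in claim.evidence)
        lines.append(f"{claim.text} {cites}")
    return "\n".join(lines)
```

Because the check runs at construction time, any agent that tries to emit an unsupported statement fails early instead of producing an answer that looks plausible but has no source behind it.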

This talk covers what we learned building it in production: why evidence citation has to be an architectural constraint and not an afterthought, how we structured git, Loki, and Kubernetes snapshots as complementary evidence layers, how multi-agent coordination works when sub-questions need different specialists, and why running AI inference workloads on Kubernetes is a different class of operational problem than most teams expect.

The goal is to share a concrete architecture pattern that others can apply, and be honest about where it breaks down.

Automating Disaster Recovery: A GitOps-Driven Approach to Resilient Infrastructure

Disaster recovery (DR) in cloud environments is often complex and manual. In this talk, we’ll demonstrate a fully automated GitOps-driven disaster recovery solution using Terraform, ArgoCD, and Crossplane. We will walk through:

- Infrastructure as Code (IaC): provisioning an Amazon EKS cluster in a secondary region and automatically bootstrapping ArgoCD.
- GitOps automation: continuously managing both infrastructure and applications.
- Crossplane for cloud resources: dynamically provisioning cloud resources from Kubernetes.
- Application resilience: using ArgoCD to deploy Helm charts for applications, ensuring a seamless recovery process.
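As a rough illustration of the last step, the sketch below builds an Argo CD `Application` manifest for a single Helm chart with automated sync enabled. It is a minimal example under stated assumptions, not the setup shown in the talk: the helper name `argocd_helm_application`, the `argocd` namespace, the `default` project, and the chart name, version, and repository URL are all placeholders.

```python
import json


def argocd_helm_application(name, repo_url, chart, version, namespace):
    """Build an Argo CD Application manifest that deploys one Helm chart."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes Argo CD is installed in the conventional "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,        # Helm repository (placeholder)
                "chart": chart,
                "targetRevision": version,  # chart version to deploy
            },
            "destination": {
                # In-cluster API server: the cluster Argo CD itself runs in.
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Automated sync lets Argo CD reconcile without manual steps.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    manifest = argocd_helm_application(
        name="payments",                        # hypothetical application
        repo_url="https://charts.example.com",  # hypothetical chart repository
        chart="payments",
        version="1.2.3",
        namespace="payments",
    )
    # JSON is accepted wherever Kubernetes accepts YAML manifests.
    print(json.dumps(manifest, indent=2))
```

Committing a manifest like this to the Git repository that the freshly bootstrapped ArgoCD watches is what would let applications come back in the secondary region without a manual deploy step.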

By the end of this session, attendees will understand how to orchestrate disaster recovery using Kubernetes-native tooling, eliminating manual failover processes while ensuring cloud resilience. Whether you’re working in regulated environments or just looking for a scalable DR strategy, this session will provide actionable insights and a reproducible framework.

CNCG Chandigarh Meetup (user group, Sessionize event)

May 2026, Chandigarh, India
