Speaker

Michael Forrester

Preparing Tomorrow's Innovators, Elevating the Average

Atlanta, Georgia, United States

Michael Forrester is a student, explorer, and educator at the boundary between humanity and technology. Over more than 25 years, he has gone from CTO to IC across operations, AI, cloud, and platform engineering, including time at AWS, ThoughtWorks, Red Hat, and Honeywell. His training has reached over a million engineers. He speaks at KubeCon and CNCF events on Claude Code, MCP, and AI safety for platform engineers. Tools don't transform organizations. People do.

Area of Expertise

  • Information & Communications Technology
  • Region & Country

Topics

  • AWS Certifications
  • CNCF Certifications
  • Education
  • AIOps
  • DevOps
  • Platform Engineering
  • High-Performing Organizations
  • Anthropic Claude
  • Claude Code
  • Claude
  • Anthropic

Your IDP Is 80% of the Agent Governance Layer You Need

If you've built an Internal Developer Platform with GitOps, admission control, runtime detection, and observability, you already have most of what governs AI agents in production. Kyverno validates manifests no matter who generated them. Falco detects unexpected runtime behavior no matter what triggered it. ArgoCD enforces Git-only delivery no matter who committed. The remaining 20% is the part the industry is racing to build: agent identity, LLM input/output sanitization, and MCP tool-call governance. This talk maps the existing CNCF platform engineering stack to the emerging agent governance problem, names the specific gaps, and points to the projects filling them.

The Spec Is the Skill: Why CLAUDE.md and agents.md Are the New Architecture Artifacts

AI coding agents read constraint documents before they write code. Claude Code reads CLAUDE.md. Block's Goose reads agents.md. Both are now Linux Foundation projects under the Agentic AI Foundation. I ran Kubernetes infrastructure builds with progressively detailed spec files and measured the difference. A prompt with no spec produced scattered output needing constant human intervention. A structured spec with build phases, dependency maps, and completion gates produced significantly better results. This talk introduces spec authorship as a cloud native engineering practice. These files belong in your repo alongside ADRs, Helm values, and Kyverno policies. Five real spec files are Apache 2.0 on GitHub.

From Kueue to Volcano: A First-Time Journey for GPU-Based Workloads

A camp-fire tale for K8s platform engineers tackling GPU workloads.

🪵 Honeymoon – Kueue powers our clusters. Single-GPU camera jobs sail through; dashboards glow green.

🔥 Plot Twist – LLM training lands. Each job wants eight GPUs together. Queues stall, messages erupt: “Why is nothing finishing?”

🔎 Detective Work – With plain kubectl and the default dashboard, we find Kueue granting half the GPUs, pods waiting forever, and 40% of GPU cards sitting idle.

🧬 Evolution, not Re-write – We drop in Volcano, turn on gang scheduling, keep the same Job YAML, and the backlog melts.

🎒 Take-away Kit – A quick checklist to know when to stay on Kueue and when to switch, copy-paste PromQL to spot starvation, and a GitHub lab to repeat the migration on a weekend rig.

No vendor tools—just Kubernetes, YAML, and the journey from “it kind of works” to “it really works.”

Build an IDP, Then Break It with an Agent: A Live Governance Stress Test

Two acts, one cluster. First, we build an Internal Developer Platform live using an AI coding agent: ArgoCD for GitOps, Kyverno for policy, Prometheus and Grafana for observability. Then we stress test it. We point the agent at the platform and watch what the governance stack catches and what slips through. Kyverno blocks non-compliant manifests regardless of who generated them. Falco detects unexpected runtime behavior. ArgoCD rejects anything that didn't flow through Git. But the agent prompt that smuggles bad commands past the admission webhook? The output that leaks data because nothing sanitized the response? Those are the gaps. Attendees build the platform, attack it, and leave with a concrete map of what their existing CNCF stack governs and where agent-specific tooling is still needed. We will also discuss kgateway's role in covering this layer, along with other suggestions for closing the gaps.

The Auditor Who Had Nothing Left To Ask: GitOps and Runtime Security for Sovereign Compliance

Let us tell you about the time an auditor absolutely destroyed us.

"Show me your access logs." Three systems, none talking to each other.

"Prove this deployment was approved." We Slacked around for 20 minutes while they watched.

"What happened at 2 AM last Tuesday?" No idea.

They gave us a second chance.

After two weeks digging through logs like archaeologists, we fixed it. Not with some expensive GRC platform, but with ArgoCD and Falco, two CNCF Graduated projects. We wrote rules that tag detections to NIS2, DORA, and SOC 2. We made Git our audit trail. We built workflows that captured evidence before anyone asked.

Next time the auditor showed up, they were slightly wowed. Most questions they asked, the system had already answered.

This talk is that story: the disaster, the fix, and a live demo where I break things on purpose so you can watch continuous compliance in action. You'll leave with working code, compliance-mapped Falco rules, and an architecture that worked for us.

6 Autoscalers in 6 Months: A Kubernetes Scaling Horror Story

It started innocently. "We should add autoscaling. This is crazy. We're doing manual scaling." Well...

Six months later, we had deployed HPA, Cluster Autoscaler, Karpenter, KEDA, VPA, AND Kueue. Each one solved a problem and created three more. This is that story: the dumb failures, the "why didn't we read the docs" moments, and the gotchas that only show up in production.
5 minutes. 6 autoscalers. A whole lot of regret. An object lesson in what not to do with autoscalers, even as we evolved into AI workloads.
I'll speed-run through each autoscaler in the order we actually adopted them: HPA + Cluster Autoscaler together on day one, Karpenter when CA was too slow, KEDA when CPU metrics failed us, VPA when we finally admitted our resource requests were fiction (because why not), and Kueue when AI training jobs started fighting each other.
Come laugh at our pain. Leave knowing which autoscaler you actually need, or at least which one to avoid, and which gotchas will bite you.

The Day Claude Code Deleted My Cluster: A Cautionary Tale About AI Guardrails

"You have full access to the pipeline. Do what you need to do."
Famous last words.
In this lightning talk, I'll share the hilarious (and horrifying) story of what happened when I gave Claude Code full pipeline access and stepped away for 30 seconds. When I came back, it had completely wrecked the Kubernetes cluster. Fewer than two troubleshooting sessions later, it had also wrecked almost every network card across the Linux systems.
This is a story about nondeterministic systems, the illusion of AI understanding, and why "the AI knows what it's doing" is the most dangerous phrase in modern DevOps. I'll share the actual troubleshooting spiral that escalated from "let me help" to "I've destroyed your cluster and systems," and the guardrails I now enforce religiously.
5 minutes. 20 slides. One hilarious, and probably very blameful, post-mortem with Claude Code afterwards. And one very expensive lesson about trusting AI agents with infrastructure access, even for a short while.
Come for the disaster. Stay for the wisdom.

The 90-Minute IDP: AI Ate My Implementation. Let's Build a Platform Together and Score What's Left.

I've already built this IDP end-to-end with Claude Code. ArgoCD, Kyverno, Falco, OpenTelemetry, Backstage — the whole stack, from an empty cluster to a production-grade platform. I know exactly where the AI crushes it, where it faceplants, and where it gets dangerously close to something brilliant before going sideways. It gets weird. It also produces something awesome.

Now I'm doing it again, live, in front of you.

In this workshop, I'll hand Claude Code a build spec with test-driven gates and let it rip through building a complete Internal Developer Platform in real time. We'll provide lightweight lab environments so you can build alongside me — just bring your own Claude Code access. Got your own cluster? Bring that too.
Here's what I've learned: AI didn't just speed up implementation. It ate most of it. The Terraform modules, the Helm values, the boilerplate YAML: Claude Code handles all of that faster than any human. But what's left is the hard stuff: integration between systems, sync wave ordering, architecture, business context, policy conflicts, and the judgment calls that turn a pile of tools into a platform. Every component (Kubernetes, ArgoCD, Prometheus) gets scored on a live scorecard so you can see exactly what AI ate and what it choked on. Oh, and we score not just the installation but also the integration and the platform usability.

The implementation layer is supposedly disappearing. Let's find out what replaces it.

AI Assisted Hands-On Learning - the Future of Education

In an era where Cloud Native skills are in high demand, traditional learning methods often fall short. This session explores the transformative power of AI-assisted hands-on learning in revolutionizing Cloud Native education. We'll delve into how interactive labs have enabled us at KodeKloud to effectively teach millions of students, equipping them with practical skills in Kubernetes, Docker, Ansible, and more. Discover how AI-driven interactive platforms are shaping the future of education by providing personalized, immersive, and scalable learning experiences. Join us to uncover best practices and innovative techniques that can elevate your approach to Cloud Native training, ensuring your learners are prepared for tomorrow's challenges.

KCD Texas 2026 Sessionize Event

May 2026 Austin, Texas, United States

devopsdays Atlanta 2026 Sessionize Event

April 2026 Atlanta, Georgia, United States

CNCF-hosted Co-located Events North America 2024 Sessionize Event

November 2024 Salt Lake City, Utah, United States

