Maximum-Security Clusters: Solving the ValidatingAdmissionPolicy Fail-Closed Dilemma

ValidatingAdmissionPolicy (VAP) gives us a declarative way to enforce security in Kubernetes. But when we tried to run our policies in fail-closed mode, we ran into a problem we didn’t expect.

VAP lets a policy reference external parameter objects, which is exactly what individual governance teams need. Exemption lists, approved registries, and namespace allow-lists all change frequently, and changing them shouldn’t require updating the policy definition itself.

This separation of concerns forces a difficult choice: what should happen when the referenced parameter object is missing?
The options are:
- Deny: block every workload, or
- Allow: skip policy validation.

Neither option is acceptable. Deny risks cluster-wide outages. Allow silently weakens security.
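The dilemma above can be sketched as a small, simplified model. In the Kubernetes API, the choice corresponds to the `parameterNotFoundAction` field on a binding's `paramRef`; everything else here (the `Binding` class, the `admit` function, the registry check, the object names) is a hypothetical illustration, not the API server's implementation, and it ignores the policy's own `failurePolicy`.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Binding:
    """Hypothetical stand-in for a policy binding that references one parameter object."""
    param_name: str
    on_param_missing: str  # "Deny" or "Allow", mirroring parameterNotFoundAction


def admit(
    binding: Binding,
    params_in_cluster: Mapping[str, dict],
    request: dict,
    validate: Callable[[dict, dict], bool],
) -> bool:
    """Return True if the request is admitted under this simplified model."""
    params = params_in_cluster.get(binding.param_name)
    if params is None:
        # The referenced object is absent: validation cannot run at all.
        # "Allow" admits everything unchecked; "Deny" rejects everything.
        return binding.on_param_missing == "Allow"
    return validate(request, params)


def image_from_approved_registry(request: dict, params: dict) -> bool:
    """Example rule: the image's registry host must be on the approved list."""
    registry = request["image"].split("/", 1)[0]
    return registry in params["registries"]
```

With the parameter object present, only the rule decides. With it absent, a `Deny` binding rejects even compliant workloads, while an `Allow` binding admits non-compliant ones without any check.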

To run safely in fail-closed mode, we had to guarantee that no referenced parameter object is ever absent. The real challenge wasn’t writing the policy; it was solving the lifecycle and distribution problem behind those references. Without that, fail-closed enforcement simply isn’t safe.

In this talk, we’ll share how we approached this challenge in a global enterprise Kubernetes platform. We’ll dive deep into the operator we built to continuously reconcile governance-owned parameter objects into each cluster, enabling fail-closed enforcement without risking widespread disruption.
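As a rough illustration of what continuously reconciling parameter objects into each cluster can mean, here is a minimal sketch. It is not the operator described in the talk: clusters are modeled as in-memory dictionaries, the names `plan` and `reconcile` are hypothetical, and the choice to never delete extra objects is an assumption made because absence is the failure mode being prevented.

```python
from typing import Dict, List, Tuple

ParamObjects = Dict[str, dict]  # object name -> object spec


def plan(desired: ParamObjects, actual: ParamObjects) -> List[Tuple[str, str]]:
    """List the (action, name) steps needed so that every desired object exists
    in the cluster with the desired spec. Extra objects are left untouched."""
    actions: List[Tuple[str, str]] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    return actions


def reconcile(
    desired: ParamObjects, clusters: Dict[str, ParamObjects]
) -> Dict[str, List[Tuple[str, str]]]:
    """Apply the plan to every cluster and report what was changed where."""
    report: Dict[str, List[Tuple[str, str]]] = {}
    for cluster_name, actual in clusters.items():
        actions = plan(desired, actual)
        for _action, name in actions:
            # Create and update both end with the desired spec in place.
            actual[name] = dict(desired[name])
        report[cluster_name] = actions
    return report
```

Running `reconcile` repeatedly is idempotent in this model: once every cluster holds the desired objects, later passes report no actions. A real operator would watch the Kubernetes API rather than compare dictionaries.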

More importantly, we’ll focus on the patterns behind the solution:

- Designing VAP policies with clean separation of concerns
- Managing parameter object lifecycle safely at scale
- Preventing outages caused by missing cross-resource references
- Extending VAP with templating and multi-object targeting

If you’re trying to move from “best-effort” policy enforcement to a reliable fail-closed security implementation with a distributed governance model, this session will give you practical patterns you can apply in any environment.

First public delivery: This talk has not been presented at any previous conference or event.

Session format: 30-minute session presentation with two speakers (both from ServiceNow, platform engineering team). We will split the talk between the problem space (VAP's paramRef fail-closed dilemma) and the solution (operator architecture and live demo).

Target audience: Intermediate to advanced Kubernetes practitioners: platform engineers, security engineers, and cluster operators who are evaluating or already implementing ValidatingAdmissionPolicy for policy enforcement. Familiarity with Kubernetes admission control concepts is assumed.

Production context: The patterns presented are running in production across 50+ clusters spanning multiple regions in a global enterprise Kubernetes platform. This is a real-world case study, not a proof of concept.

Live demo included: We will demonstrate the operator reconciling parameter objects and show how fail-closed enforcement behaves when references are present versus absent.

Open source: The operator discussed in this talk is being open-sourced ahead of the event. Independently of the operator, the talk focuses on reusable patterns that apply in any environment.

Suggested tracks: Security or Platform Engineering.

Felipe Alves

Senior Staff Software Engineer at ServiceNow

Athlone, Ireland
