Speaker

Carlos Sanchez

Principal Scientist at Adobe

A Coruña, Spain

Carlos Sanchez is a Principal Scientist at Adobe Experience Manager, specializing in software automation, from build tools to Continuous Delivery and Progressive Delivery. Involved in Open Source for over 15 years, he is the author of the Jenkins Kubernetes plugin and a member of the Apache Software Foundation, among other open source groups, and contributes to several projects such as Jenkins and Apache Maven.

Area of Expertise

  • Information & Communications Technology

Topics

  • Kubernetes
  • DevOps
  • Progressive Delivery

We Moved One Java Product to Kubernetes and This Is What We Learned

Join us to learn about the challenges we faced and the solutions we implemented while moving Adobe Experience Manager, an existing product built on top of many OSS projects, to a Cloud Native environment built around Kubernetes.

Moving to a Cloud Native architecture required changes in culture, processes and technologies. Some functionality was decomposed and partially reimplemented as Cloud Native services. The need to scale, both the application and the development organization, drove a microservice architecture supporting the existing app, with services that work together to provide the full product. Multi-tenancy requires considering isolation between tenants at multiple levels of the stack.

We will dig into specific details that require attention when migrating to a Cloud Native environment, such as resource management, decomposition of services and availability, also sharing where we made the wrong decisions.

Optimizing Resource Usage in Kubernetes

Moving to Kubernetes opens the door to a world of possibilities, given the variety of workloads that can be run and the flexibility it provides. However, this comes at a cost: managing the resources used by many applications and teams. Java applications can be especially challenging when running in containers.
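As an illustration of why containerized runtimes need care, here is a minimal sketch (assuming a Linux container, checking the cgroup v2 path and falling back to cgroup v1) that reads the memory limit the container is actually allowed to use; a runtime that ignores this limit is a common source of OOM kills:

```python
from pathlib import Path


def container_memory_limit_bytes():
    """Read the memory limit imposed on this container by the kernel cgroup.

    Checks the cgroup v2 file first, then falls back to cgroup v1.
    Returns None if no limit is set ("max") or the files are absent.
    """
    candidates = [
        Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
        Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
    ]
    for path in candidates:
        if path.exists():
            value = path.read_text().strip()
            if value != "max":
                return int(value)
    return None


if __name__ == "__main__":
    limit = container_memory_limit_bytes()
    print("container memory limit:", limit or "unlimited / not in a container")
```

Modern JVMs detect these limits themselves via container support, but sizing heaps and thread pools against them is still part of the work the talk describes.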

At Adobe Experience Manager we run our cloud service on more than 40 clusters. We make extensive use of standard Kubernetes capabilities to reduce resource usage and we have also built some solutions at several levels of the stack to improve it.

From autoscaling to workload hibernation, from automated resource requests to Kubernetes Jobs, we have experimented with and implemented several features that decrease our resource usage and lower the cost of running many Kubernetes clusters at scale, both at the workload resource level and by achieving higher-density clusters that reduce the number of clusters we need and their operating costs.
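As one hedged example of what automated resource requests can look like, here is a minimal sketch using the official Kubernetes Python client to patch a Deployment's requests and limits; the deployment name, namespace and values are hypothetical, not AEM's actual configuration:

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config; use config.load_incluster_config()
# instead when running inside a pod.
config.load_kube_config()

apps = client.AppsV1Api()

# Hypothetical values; a real system would derive them from observed usage.
patch = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "app",
                        "resources": {
                            "requests": {"cpu": "500m", "memory": "2Gi"},
                            "limits": {"memory": "4Gi"},
                        },
                    }
                ]
            }
        }
    }
}

# Strategic-merge patch of the Deployment triggers a rolling update
# with the new resource settings.
apps.patch_namespaced_deployment(name="example-app", namespace="example-ns", body=patch)
```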

Progressive Delivery in Kubernetes

Progressive Delivery makes it easier to adopt Continuous Delivery: using techniques like feature flags and canary deployments, new versions are deployed to a subset of users before being rolled out to everyone, and rolled back if key metrics are not met.
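For illustration, a minimal sketch of the percentage-based gating both techniques rely on: each user is hashed into a stable bucket, and the new code path is enabled only for users below the configured rollout percentage. The feature name, user id and percentage are hypothetical:

```python
import hashlib


def in_rollout(user_id: str, feature: str, percentage: float) -> bool:
    """Deterministically assign a user to a bucket in [0, 100) and
    enable the feature only for buckets below the rollout percentage."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10000 / 100.0  # stable value in [0, 100)
    return bucket < percentage


# Roll the hypothetical "new-renderer" feature out to 5% of users first.
enabled = in_rollout(user_id="customer-42", feature="new-renderer", percentage=5.0)
```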

For workloads running on Kubernetes it is very easy to adopt Progressive Delivery using Argo Rollouts. At Adobe Experience Manager we deploy over 10k customer services to Kubernetes, and changes can occur multiple times per day, both internal changes and code changes. A new feature can work fine for 99% of customers but still affect the other 1%, and detecting this from tests alone is costly.

We will show how to implement a Progressive Delivery pipeline with Argo Rollouts to improve the reliability of the service and prevent regressions. It protects the service and automates rollbacks to a previous version when needed, allowing faster delivery with more confidence.
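A minimal sketch of the decision such a pipeline automates: compare an error-rate metric for the canary against a threshold and abort the rollout if it is exceeded. Argo Rollouts expresses this declaratively through metric analysis against a provider such as Prometheus; the Prometheus URL, query and threshold below are hypothetical:

```python
import requests

PROMETHEUS_URL = "http://prometheus.example:9090"  # hypothetical endpoint
ERROR_RATE_QUERY = (
    'sum(rate(http_requests_total{deployment="canary",code=~"5.."}[5m]))'
    ' / sum(rate(http_requests_total{deployment="canary"}[5m]))'
)
MAX_ERROR_RATE = 0.01  # abort if more than 1% of canary requests fail


def canary_is_healthy() -> bool:
    """Query Prometheus for the canary error rate and compare it to the threshold."""
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": ERROR_RATE_QUERY},
        timeout=10,
    )
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    error_rate = float(result[0]["value"][1]) if result else 0.0
    return error_rate <= MAX_ERROR_RATE


if __name__ == "__main__":
    if canary_is_healthy():
        print("promote canary to the next step")
    else:
        print("abort rollout and roll back to the stable version")
```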

Debugging Envoy Tunnels

Envoy is a powerful proxy that can be used to connect services securely, with encryption and mutual authentication based on certificates. It can also connect services running in Kubernetes with external infrastructure, such as on-premise services and databases.

At Adobe Experience Manager Cloud Service we use Envoy to connect pods running in Kubernetes with customer-dedicated infrastructure, such as on-premise services and databases. This allows, for example, different pods to have their own dedicated egress IP instead of the cluster's, or connections from pods to multiple customer on-premise services over VPN.

We will show how we are using Envoy to connect pods with external infrastructure, and how we debug issues when the tunnels are not working as expected.
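As a hedged example of the kind of check this involves, here is a small script that queries Envoy's admin interface to see whether upstream hosts are healthy and whether connections are failing. The /ready, /clusters and /stats endpoints are standard admin endpoints; the admin address and port below are just a common convention and depend on the bootstrap configuration:

```python
import urllib.request

ADMIN = "http://127.0.0.1:9901"  # Envoy admin address; depends on bootstrap config


def fetch(path: str) -> str:
    """Fetch a plain-text page from the Envoy admin interface."""
    with urllib.request.urlopen(f"{ADMIN}{path}", timeout=5) as resp:
        return resp.read().decode()


# Overall readiness of the proxy.
print(fetch("/ready").strip())

# Upstream cluster membership and health status, one line per host.
for line in fetch("/clusters").splitlines():
    if "health_flags" in line:
        print(line)

# Connection failure counters often point at TLS or reachability problems.
for line in fetch("/stats").splitlines():
    if "upstream_cx_connect_fail" in line or "ssl.connection_error" in line:
        print(line)
```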
