Speaker

Sahil Gandhi

Software Engineer on Confluent Cloud Control Plane

Sahil graduated from UCLA in March 2020 with a Bachelor’s in Computer Science and Engineering and a Master’s in Computer Science, specializing in distributed and big data systems. Since then, he’s been one of the founding engineers on the Fleet Management team at Confluent, where the team builds systems to efficiently and safely operate on the entire Confluent Cloud fleet. Outside of work, Sahil can often be found outdoors: running, hiking, bicycling, or, if the weather is just right, exploring new roads on his motorcycle.

Managing a Large Fleet of Kafka Clusters on Heterogeneous Clouds with Safety and Efficiency

Confluent Cloud hosts thousands of our customers’ Kafka, KSQL, Connector, and Schema Registry clusters on heterogeneous clouds. Managing such a large fleet of clusters poses some key challenges: all cluster lifecycle management must be performed by us to reduce customer toil; new product, security, and data governance features must be shipped at a regular, speedy cadence; and customers require zero downtime or interruption for all operations.

In this talk, we will discuss the set of fleet management tools that we’ve created to safely and efficiently manage clusters, the key challenges we faced, and other observations we made along the way. Some takeaways include:
- Deploying all products (Kafka, KSQL, etc.) and infrastructure (networking, k8s, etc.) as individually updatable components
- Being able to pre-define rollout plans with canary support and having a Web UI portal to trigger, observe, and operate the rollout
- Configuring rich monitoring to validate clusters during rollouts
- Carefully orchestrating maintenance at the pod level, ensuring sufficient data replication and service availability
- Emitting ongoing progress events and notifications to end users

With this rich DevOps experience, operators can work on the entire fleet with confidence and efficiency, product teams can quickly ship features without impacting customer workloads, and customers can gain insight into maintenance management.
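
As a rough illustration of the pre-defined rollout plan with canary support mentioned in the takeaways, here is a minimal Python sketch. It is not Confluent's implementation: `RolloutPlan`, `run_rollout`, the `upgrade` and `is_healthy` callbacks, and the cluster IDs are all hypothetical names chosen for this example. The idea it shows is simply that canary clusters are upgraded first, each cluster is validated after its upgrade, and the rollout halts at the first failed validation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RolloutPlan:
    """Hypothetical plan: canary clusters first, then fixed-size batches."""
    canary: List[str]
    remaining: List[str]
    batch_size: int = 2

    def stages(self) -> List[List[str]]:
        # Split the non-canary clusters into batches of batch_size.
        batches = [
            self.remaining[i:i + self.batch_size]
            for i in range(0, len(self.remaining), self.batch_size)
        ]
        return [self.canary] + batches


def run_rollout(
    plan: RolloutPlan,
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> Tuple[List[str], Optional[str]]:
    """Upgrade stage by stage and stop at the first failed validation.

    Returns the clusters that were upgraded and validated, plus the ID of
    the cluster that failed validation (None if the rollout completed).
    """
    done: List[str] = []
    for stage in plan.stages():
        for cluster in stage:
            upgrade(cluster)
            if not is_healthy(cluster):
                return done, cluster  # halt: later stages are never touched
            done.append(cluster)
    return done, None
```

For instance, with `RolloutPlan(canary=["c1"], remaining=["c2", "c3", "c4"])`, a failed health check on `c3` stops the rollout before `c4` is touched. A real system would presumably replace the boolean health check with the richer monitoring the abstract describes and would pause for operator input rather than simply return.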
