Session

Cost cutting with constraints: Snorkel’s journey to a cost-efficient cluster supporting ML workloads

Though increasingly popular, both traditional and generative AI workloads remain a very particular use case for k8s. Often stateful and long-running, they are at odds with the k8s paradigm, in which stateless, interruptible workloads are first-class citizens.

Many practitioners, including our enterprise customers, turn to k8s for its autoscaling flexibility and cost optimization via bin-packing, which is now more important than ever. How do you run a lean cluster to service AI workloads while maintaining high throughput and minimizing pod interruptions?

This talk will showcase how our thinking has evolved over our journey using native k8s features to support Snorkel's AI workloads in a flexible and cost-efficient way.

In the hope of starting conversations with practitioners handling AI workloads, we are excited to share an architecture that works for us, drawn from our experience both optimizing our internal cluster and deploying in the field with commercial and government customers.

Sam Huang

Research Infrastructure, Cantina Labs


