Cracking the Data Locality Puzzle

Data transfer is slow, so data locality matters in AI and HPC. As workloads scale, deciding where and how to run data-heavy jobs in Kubernetes becomes critical, yet this area remains underexplored. In this session, the CNCF Batch Subproject shares findings from its work on data-locality-aware scheduling across clusters. Should we move compute to the data or the data to compute? What are the trade-offs in latency, cost, and efficiency?

We present methods to test potential policies: splitting jobs, exposing location-aware metadata from compute/storage, and basing scheduling on historical data and pricing. We share early discoveries from real-world tests across regions with limited bandwidth, storage, and power.

If your workloads are bottlenecked by data gravity -- or you’re chasing GPU efficiency across sites -- join us to explore emerging patterns for intelligent, cost-aware data placement in Kubernetes.

Abhishek Malvankar

Senior Software Engineer, Master Inventor at IBM Research
