Scaling Autonomous-Driving R&D: A 100 PB, 100 Billion-File Open-Source Elastic File System on K8s

Autonomous-driving R&D pipelines generate petabytes of multi-modal sensor, simulation, and annotation data each month. Teams therefore need a storage layer that scales with their fleets—without forcing them to rewrite trusted tools.
In this talk we present a cloud-agnostic, open-source elastic file system already running in production at multiple OEMs and robo-taxi startups:

- Single namespace beyond 100 PB / 100 billion files — backed by commodity object storage yet mounted as a local POSIX volume.

- Thousands of Kubernetes nodes (mixed GPU/CPU) share the same dataset for training, validation, and replay with zero duplication.

- Multi-protocol endpoints (POSIX, S3, HDFS, WebDAV) let CV, mapping, and simulation teams keep their existing workflows.

- Hybrid and multi-cloud deployment: runs on public cloud or on-prem with Ceph, which makes fully air-gapped SDV labs possible.

We will walk through the high-level design patterns (stateless clients, scale-out metadata, transparent tiering) and share measured outcomes: millions saved in storage cost, a 10× increase in single-volume capacity, and painless vendor portability. Attendees will leave with an opinionated reference diagram they can implement immediately.

Rui Su

Open-source advocate and co-founder of JuiceFS, a cloud-native distributed file system
