Session
Trends in Architecting AI Cloud Infrastructure at Scale: An I/O Perspective
While AI infrastructure optimization typically focuses on compute, storage I/O has become a hidden bottleneck that limits the performance and scalability of AI infrastructure.
Several factors create significant performance challenges: models and datasets are often too large for a single system; data ingestion and preprocessing can consume more power than training itself; and GPUs can spend as much as half their time stalled on I/O. Model checkpointing, essential for fault tolerance, introduces costly GPU idle time. Moreover, emerging applications such as RAG and vector databases require massive storage for continuous data ingestion and real-time retrieval.
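To make the checkpointing cost concrete (an illustrative sketch, not the speakers' method): a common mitigation is to snapshot state to host memory and persist it from a background thread, so GPUs block only for the fast device-to-host copy rather than the slow write to storage. The function names and structure below are assumptions, written in PyTorch-flavored Python.

```python
import threading
import torch

def _to_cpu(obj):
    # Recursively copy any tensors in a (possibly nested) state dict to host
    # memory, so the background writer never reads live GPU buffers that
    # training may mutate. (Hypothetical helper, for illustration only.)
    if torch.is_tensor(obj):
        return obj.detach().cpu().clone()
    if isinstance(obj, dict):
        return {k: _to_cpu(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(_to_cpu(v) for v in obj)
    return obj

def checkpoint_async(model, optimizer, path):
    # GPUs idle only during the device-to-host copy, not the storage write.
    state = {"model": _to_cpu(model.state_dict()),
             "optim": _to_cpu(optimizer.state_dict())}
    writer = threading.Thread(target=torch.save, args=(state, path), daemon=True)
    writer.start()
    return writer  # caller should join() before the next checkpoint or exit
```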
This session dissects the I/O patterns across the full AI data lifecycle, from ingestion and training to inference, sharing insights from production deployments. The speakers will demonstrate how open-source, tiered caching architectures can unlock performance in AI cloud infrastructure, bridging the gap between cloud storage and the demanding requirements of modern AI and LLM workloads.
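The tiered caching idea can also be sketched generically (a minimal sketch, not the speakers' open-source implementation; the `TieredCache` class, the `remote_client` parameter, and the eviction-free RAM policy are all hypothetical): reads are served from memory first, then local NVMe, and only fall through to cloud object storage, with each miss populating the faster tiers.

```python
import os

class TieredCache:
    """Read path: RAM -> local NVMe -> cloud object store (hypothetical client)."""

    def __init__(self, nvme_dir, remote_client, ram_capacity=1024):
        self.ram = {}                      # hot tier: in-process memory
        self.nvme_dir = nvme_dir           # warm tier: local NVMe directory
        self.remote = remote_client        # cold tier: S3-like client (assumed)
        self.ram_capacity = ram_capacity   # max objects kept in RAM (toy policy)

    def get(self, key: str) -> bytes:
        if key in self.ram:                # memory hit
            return self.ram[key]
        nvme_path = os.path.join(self.nvme_dir, key)
        if os.path.exists(nvme_path):      # local NVMe hit
            with open(nvme_path, "rb") as f:
                data = f.read()
        else:                              # miss: fetch from cloud storage
            data = self.remote.get(key)    # assumed method on the client
            with open(nvme_path, "wb") as f:
                f.write(data)              # populate the NVMe tier
        if len(self.ram) < self.ram_capacity:
            self.ram[key] = data           # populate the memory tier
        return data
```

The design choice this illustrates is that repeated epochs over the same training data, or repeated retrievals in RAG serving, hit local tiers instead of paying cloud-storage latency on every access.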