Speaker

Bin Fan

Founding Engineer, Alluxio

Bin Fan is the founding engineer and VP of Technology at Alluxio, Inc. Prior to Alluxio, he worked at Google building next-generation storage infrastructure. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, where his research focused on the design and implementation of distributed systems.

Trends in Architecting AI Cloud Infrastructure at Scale: An I/O Perspective

While AI infrastructure optimization typically focuses on compute, storage I/O has become a hidden bottleneck limiting the performance and scalability of the infrastructure.

Several factors create significant performance challenges: models and datasets are often too large for a single system, data ingestion and preprocessing can consume more power than training, and GPUs can waste half their time stalled on I/O. Model checkpointing, essential for fault tolerance, introduces costly GPU idle time. Moreover, emerging applications like RAG and vector databases require massive storage for continuous data ingestion and real-time retrieval.
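The checkpointing cost mentioned above is often hidden by writing asynchronously: the training loop stalls only for an in-memory snapshot, while the slow write to storage overlaps with the next step. The sketch below illustrates the idea with a plain dict of state and a fake slow write; all names are illustrative, not Alluxio or any framework's API.

```python
import copy
import threading
import time

def checkpoint_async(state, write_fn):
    """Snapshot `state` now, persist it in the background.

    The caller (the training loop) blocks only for the in-memory
    copy; the slow write to storage overlaps with later steps.
    """
    snapshot = copy.deepcopy(state)                # short stall
    t = threading.Thread(target=write_fn, args=(snapshot,))
    t.start()
    return t                                       # join() before exit

# Illustrative usage with a stand-in for a slow storage write.
saved = []
def slow_write(snapshot):
    time.sleep(0.05)                               # pretend storage I/O
    saved.append(snapshot)

state = {"step": 1, "weights": [0.1, 0.2]}
t = checkpoint_async(state, slow_write)
state["step"] = 2                                  # training continues immediately
t.join()
```

Because the snapshot is taken before the thread starts, later mutations to `state` do not corrupt the checkpoint being written.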

This session dissects the I/O patterns across the full AI data lifecycle, from ingestion and training to inference, sharing insights from production deployments. The speakers will demonstrate how open-source, tiered caching architectures can unlock performance in AI cloud infrastructure, bridging the gap between cloud storage and the demanding requirements of modern AI and LLM workloads.
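The tiered-caching idea can be shown with a minimal two-tier read path (a hypothetical sketch, not Alluxio's implementation): serve hot objects from a bounded in-memory tier, fall back to a slower backing store on a miss, and populate the cache on the way back.

```python
from collections import OrderedDict

class TieredReader:
    """Minimal two-tier read path: LRU memory cache over a slow store."""

    def __init__(self, backing_store, capacity=2):
        self.backing = backing_store        # e.g. a dict standing in for S3
        self.cache = OrderedDict()          # in-memory tier, LRU order
        self.capacity = capacity
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as recently used
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]           # slow path: object storage
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return value

store = {"a": b"blob-a", "b": b"blob-b", "c": b"blob-c"}
reader = TieredReader(store)
reader.read("a")
reader.read("a")    # second read is served from the memory tier
```

Real deployments add more tiers (memory, local NVMe, remote cache cluster), but the read path follows the same hit/miss/populate shape.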

How to Build a Cheap and Scalable Feature Store on S3 with 1000× Acceleration

Using Parquet on S3 as a lightweight feature store is becoming common. But querying petabyte-scale data lakes directly from cloud object storage remains painfully slow—often with latencies in the hundreds of milliseconds, and inconsistent performance at scale.

In this talk, we’ll walk through how to turn your S3-based Parquet data lake into a high-performance feature store—without rearchitecting your stack, rewriting data, or buying expensive hardware.

We present a system architecture co-designed with Alluxio, acting as a high-throughput, low-latency S3 proxy. This layer delivers sub-millisecond Time-to-First-Byte (TTFB)—comparable to Amazon S3 Express—while remaining fully compatible with existing S3 APIs. In production benchmarks, a 50-node Alluxio cluster achieves over 1 million S3 ops/sec—50× the throughput of S3 Express—at predictable latency and low cost.

To further optimize feature lookups and point queries, we introduce pluggable Parquet pre-processing inside the Alluxio proxy. This offloads index scans and row filtering from the query engine, enabling record-level lookups at 0.3 microseconds latency and 3,000 QPS per core—100× faster than traditional approaches.
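The offload can be illustrated with a toy point-lookup path (names here are hypothetical; the real proxy works against Parquet footers and column indexes): keep a sorted key column in memory and binary-search it, so the query engine receives a single record instead of scanning whole row groups.

```python
import bisect

class PointLookupIndex:
    """Toy in-proxy index: sorted keys plus rows for point queries.

    A stand-in for scanning a Parquet column index inside the proxy,
    so the query engine never reads full row groups for one record.
    """

    def __init__(self, rows, key_field):
        # rows: list of dicts, as if decoded from a Parquet row group
        self.rows = sorted(rows, key=lambda r: r[key_field])
        self.keys = [r[key_field] for r in self.rows]

    def lookup(self, key):
        i = bisect.bisect_left(self.keys, key)      # O(log n) index scan
        if i < len(self.keys) and self.keys[i] == key:
            return self.rows[i]
        return None

index = PointLookupIndex(
    [{"user_id": 7, "score": 0.9}, {"user_id": 3, "score": 0.4}],
    key_field="user_id",
)
index.lookup(3)    # returns only the matching feature row
```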

This talk is ideal for teams building ML platforms or feature stores on top of cloud-native storage who want speed without the spend.

A Case Study in API Cost of Running Analytics in the Cloud at Scale with an Open-Source Data Stack

The migration of data-intensive analytics applications to cloud-native environments promises enhanced scalability and flexibility, but it introduces complex cost models that challenge traditional optimization strategies. While on-premises setups focus primarily on speed, cloud deployments require a more nuanced approach that accounts for cloud storage operation costs, which can escalate rapidly in real-world scenarios.

In this presentation, Bin will analyze these challenges through a case study of Uber's large-scale analytics SQL platform deployed on HDFS and GCS. He will share findings on the unexpected cost implications of standard I/O optimizations, such as table scans, filters, and broadcast joins, when implemented in cloud environments. He will also argue for a paradigm shift in optimizing data-intensive applications for the cloud, advocating new I/O strategies that balance performance and cost while being tailored to the unique demands of cloud ecosystems.
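The per-request cost dynamic behind this talk can be made concrete with a back-of-the-envelope calculation (the price below is an illustrative list price for standard S3 GET requests, not a figure from the talk): a table scan issued as many small range reads pays far more in per-request charges than the same scan issued as fewer large reads.

```python
def scan_request_cost(table_bytes, read_size_bytes,
                      get_price_per_1k=0.0004):
    """API cost of scanning `table_bytes` with ranged GETs of a fixed size.

    Per-request pricing means the same bytes cost more when read in
    smaller chunks; data-transfer charges are ignored for simplicity.
    """
    requests = -(-table_bytes // read_size_bytes)     # ceiling division
    return requests / 1000 * get_price_per_1k

one_tib = 1 << 40
small = scan_request_cost(one_tib, 128 * 1024)        # 128 KiB reads
large = scan_request_cost(one_tib, 16 * 1024 * 1024)  # 16 MiB reads
# Same bytes scanned, but the small-read plan issues 128x more GETs,
# so its request cost is 128x higher.
```

This is why optimizations that were free on-premises, such as highly selective filters issuing many small reads, can become a dominant cost line in the cloud.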

Open Source Summit + AI_dev: Open Source GenAI & ML Summit Japan 2024

October 2024 Tokyo, Japan

Community Over Code NA 2024

October 2024 Denver, Colorado, United States
