Floe: A Policy-Based Table Maintenance System for Apache Iceberg

Every Iceberg table needs maintenance: compaction, snapshot expiration, orphan cleanup. The procedures exist and work well. The challenge is applying them consistently across a catalog with hundreds of tables, each with different ingestion patterns.

Teams typically start with scripts, graduate to Airflow DAGs, and eventually lose track of which job owns which table. The execution layer is not the problem. Orchestration is.

Floe is an open-source system that lets you define maintenance behavior as policies. A policy specifies a table pattern (e.g., analytics.streaming.*), operations, and schedules. Floe matches tables from your catalog and delegates execution to Spark or Trino.

This talk explains the policy model, demonstrates priority-based pattern matching, and includes a live demo against REST and Polaris catalogs. Hive Metastore and Nessie are also supported.
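To make the idea of priority-based pattern matching concrete, here is a minimal Python sketch. It is not Floe's actual API or policy schema: the `Policy` class, its field names, the cron-style schedule strings, the glob semantics, and the rule that the higher priority number wins are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy shape; field names are illustrative, not Floe's schema."""
    name: str
    table_pattern: str            # glob over "namespace.table", e.g. analytics.streaming.*
    operations: Tuple[str, ...]   # e.g. ("compaction", "snapshot_expiration")
    schedule: str                 # cron expression (assumed format)
    priority: int = 0             # assumed rule: higher number wins when several match


def resolve_policy(table: str, policies: List[Policy]) -> Optional[Policy]:
    """Return the highest-priority policy whose pattern matches the table, or None."""
    matches = [p for p in policies if fnmatchcase(table, p.table_pattern)]
    return max(matches, key=lambda p: p.priority, default=None)


# Illustrative policies, from a catch-all default to a narrow streaming rule.
POLICIES = [
    Policy("default", "*", ("snapshot_expiration",), "0 3 * * *", priority=0),
    Policy("analytics", "analytics.*",
           ("compaction", "snapshot_expiration"), "0 2 * * *", priority=10),
    Policy("streaming", "analytics.streaming.*",
           ("compaction", "snapshot_expiration", "orphan_cleanup"),
           "0 * * * *", priority=20),
]

if __name__ == "__main__":
    for table in ("analytics.streaming.events", "analytics.batch.orders", "raw.logs"):
        policy = resolve_policy(table, POLICIES)
        print(table, "->", policy.name if policy else None)
```

In this sketch a streaming table matches all three patterns, and the explicit priority decides which policy applies, so the result does not depend on the order in which policies are declared.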

Neelesh Salian

Software Engineer
