Kubernetes Meets Climate Data: Designing Analysis-Ready Data Cubes with Kubeflow and Zarr

The exponential growth of Earth Observation (EO) data has transformed how the climate community processes and analyzes it. In previous years, projects like the European Weather Cloud (EWC) and Data Proximate Compute (DPC) enabled researchers to bring compute closer to data across multi-cloud environments. The next challenge is making these massive datasets analysis-ready for analytics and ML/AI workloads.

But the biggest obstacle lies in the EO data itself: it is spread across archives, stored in multiple formats (HDF5, NetCDF, GRIB), and organized into processing levels (L1–L3) with varying spatial and temporal resolutions. This fragmentation makes efficient access, reprocessing, and analysis extremely difficult.

This session presents how we built a cloud-native data pipeline using Kubeflow, DuckDB, and Xarray/Zarr to transform EO data into analysis-ready, multi-resolution data cubes. Attendees will learn how Kubeflow orchestrates large-scale scientific data preparation and object-store-native access.
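
To make "multi-resolution data cube" concrete, here is a minimal sketch of the general idea, not the pipeline presented in the session. It assumes a cube laid out as (time, y, x) and builds coarser levels by 2x2 block averaging; the function names `coarsen_mean` and `build_pyramid` and the toy data are hypothetical, and plain NumPy stands in for Xarray/Zarr.

```python
import numpy as np


def coarsen_mean(cube, factor):
    """Block-average the two spatial axes of a (time, y, x) array by `factor`."""
    t, ny, nx = cube.shape
    if ny % factor or nx % factor:
        raise ValueError("spatial dimensions must be divisible by factor")
    # Split each spatial axis into (blocks, factor) and average within blocks.
    blocks = cube.reshape(t, ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(2, 4))


def build_pyramid(cube, levels):
    """Return `levels` arrays: full resolution first, each next one 2x coarser."""
    pyramid = [cube]
    for _ in range(levels - 1):
        pyramid.append(coarsen_mean(pyramid[-1], 2))
    return pyramid


# Hypothetical toy cube: 2 time steps on a 4x4 grid (illustrative data only).
cube = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
pyramid = build_pyramid(cube, levels=3)
shapes = [level.shape for level in pyramid]
```

In an Xarray/Zarr setting, the equivalent step would typically use `Dataset.coarsen(...).mean()` and write each level to its own Zarr group, so that a client can read only the resolution it needs from the object store.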

Armagan Karatosun

Cloud & Data Services Expert - EUMETSAT

Griesheim, Germany
