AI Ready Data with Apache Iceberg: Unifying, Controlling, and Optimizing Your Data for Effective AI
Title: AI Ready Data with Apache Iceberg: Unifying, Controlling, and Optimizing Your Data for Effective Artificial Intelligence
Target Audience:
Data engineers
Data scientists
Data architects
Technical leaders (CTOs, CIOs)
Anyone interested in improving data quality for AI/ML initiatives
Abstract
The effectiveness of Artificial Intelligence (AI) and Machine Learning (ML) models depends heavily on the quality and organization of the underlying data. "AI Ready Data with Apache Iceberg" addresses this challenge and shows how Apache Iceberg helps you unify, govern, and optimize your data so that it is ready for AI.
Key Takeaways:
The Data Lakehouse Advantage:
Explain how Apache Iceberg, combined with the lakehouse architecture, provides a unified platform for all types of data, breaking down silos and simplifying data management.
Git-Like Data Governance with Nessie:
Introduce Nessie and demonstrate how its Git-like functionality brings version control, branching, and collaboration to your data, enabling efficient experimentation and ensuring data reproducibility.
Data Contracts for Quality Assurance:
Discuss the concept of data contracts and how they can be used to define and enforce quality standards, ensuring that data meets the necessary criteria for AI/ML workloads.
Iceberg's Optimized Data Structures:
Highlight how Iceberg's optimized data layouts (e.g., columnar file formats such as Parquet, and hidden partitioning) improve query performance and resource utilization, leading to faster AI/ML model training and inference.
Real-World Use Cases:
Share examples of how organizations are using Iceberg, Nessie, and data contracts to build robust data pipelines, enhance data quality, and achieve tangible results with their AI initiatives.
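To make the data contract takeaway concrete, here is a minimal sketch in plain Python of what defining and enforcing a contract can look like. It is an illustration only, not material from the session and not an Iceberg or Nessie API: the `FieldRule` class, the `validate` function, and the `user_id`, `score`, and `label` fields are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FieldRule:
    """One field of a hypothetical data contract."""
    name: str
    dtype: type
    nullable: bool = False
    check: Optional[Callable[[Any], bool]] = None  # optional value-level rule


def validate(rows, contract):
    """Return a list of human-readable violations; empty means the rows conform."""
    violations = []
    for i, row in enumerate(rows):
        for rule in contract:
            value = row.get(rule.name)
            if value is None:
                if not rule.nullable:
                    violations.append(f"row {i}: {rule.name} is missing or null")
                continue
            if not isinstance(value, rule.dtype):
                violations.append(
                    f"row {i}: {rule.name} expected {rule.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
                continue
            if rule.check is not None and not rule.check(value):
                violations.append(f"row {i}: {rule.name} failed its value check")
    return violations


# Hypothetical contract for a feature table feeding an ML model.
CONTRACT = [
    FieldRule("user_id", int),
    FieldRule("score", float, check=lambda v: 0.0 <= v <= 1.0),
    FieldRule("label", str, nullable=True),
]

rows = [
    {"user_id": 1, "score": 0.5, "label": "a"},
    {"user_id": None, "score": 1.7},
]
violations = validate(rows, CONTRACT)
for v in violations:
    print(v)
```

In a pipeline, a check like this would typically run before new data is published to consumers, so that rows violating the contract never reach training or inference jobs.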
Andrew Madson
Dremio | Data Science, AI, and Analytics Evangelist
New City, New York, United States