Session
What Got Us Here Won't Get Us There: The Future of Data Infrastructure and AI Agents
AI doesn't have a model problem. It has a data problem. Models are rapidly commoditizing. The real differentiator is the quality and quantity of data flowing into them, yet most enterprise data estates remain fragmented, inflexible, and riddled with conflicting systems. This isn't an accident. Decades of storage evolution, from mainframes to data warehouses to cloud platforms, have left organizations with layers of incompatible infrastructure never designed to work together, let alone serve the demands of modern AI.
This talk traces how we got here and makes the case for where we need to go: Open Data Infrastructure. More than a technology stack, Open Data Infrastructure is a design principle that prioritizes flexibility, interoperability, and data ownership while building explicitly for tomorrow's AI workloads. By consolidating fragmented systems around open standards (Apache Iceberg for lakehouse storage, interoperable catalogs and query engines, dbt for transformation, APIs for ingestion, and Apache Arrow via ADBC for high-performance data access) organizations can replace brittle, vendor-locked architectures with composable, future-proof foundations.
The stakes are rising. AI agents need to interact with trustworthy, structured data at a pace and scale that legacy infrastructure cannot support. Open Data Infrastructure provides the substrate that grounds agents in reality, enabling organizations to confidently delegate business outcomes to autonomous systems. If you're a data leader, data engineer, or AI engineer planning your next infrastructure investment, this session will reframe how you think about the relationship between your data platform and your AI ambitions.
Key Takeaways
1. Models are commodities. Data infrastructure is the differentiator. As foundation models converge in capability and cost, competitive advantage shifts to the organizations that can deliver high-quality data at scale. Investing in better models without fixing the underlying data platform yields diminishing returns.
2. Enterprise data fragmentation is a historical inevitability, not a failure of execution. Each era of data storage solved the problems of its time while creating the silos, incompatibilities, and rigidity that plague organizations today. Understanding this history is essential to breaking the cycle.
3. Open Data Infrastructure is a design principle, not a product. Flexibility, interoperability, and ownership are the pillars. Open standards like Apache Iceberg, dbt, Apache Arrow (ADBC), and API-driven ingestion allow organizations to build composable architectures that avoid vendor lock-in and adapt as requirements evolve.
4. AI agents raise the bar for data infrastructure by orders of magnitude. Agents operate autonomously, at machine speed, across large volumes of data. They require structured, trustworthy, and accessible data to deliver reliable business outcomes. Infrastructure that barely supports human-driven analytics will not survive this shift.
5. The organizations that invest in open, composable data foundations now will be the ones that successfully deploy agentic AI at scale. Open Data Infrastructure is not a future aspiration. It is the prerequisite for trusting autonomous systems to act on your behalf.
Andrew Madson
Head of Developer Relations at Fivetran | Author of "Apache Polaris: The Definitive Guide"; currently writing "AI-Ready Data" for Wiley and "Data Transformation" for O'Reilly
Paris, France