AI Agents as Data Engineers: What Actually Happens When You Let Them Loose

AI coding agents are getting serious attention as productivity tools for data teams. But what happens when you give one a real data engineering task with a live audience watching?

We ran that experiment.

We tasked two leading AI agents, Claude Code and OpenAI Codex, with building a real-time market data pipeline from scratch: connect to a live crypto feed, ingest order book data into a time-series database, materialize aggregates at multiple intervals, and ship a working Grafana dashboard with OHLC, VWAP, and Bollinger Bands.

The goal: zero to running pipeline in under two minutes.

Both agents could write the code. The failures came at the operational layer: managing background processes, sequencing setup steps, and knowing when a dependency was truly healthy before moving forward.

One agent repeatedly deployed an empty dashboard while ingestion was still starting up. Rephrasing the instructions never fixed it.

The real lesson had nothing to do with prompting.

By moving a data-presence check into the deploy script itself, the agent physically could not complete the task in the wrong order. Architecture enforced what instructions never could.
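As a rough illustration of that idea, here is a minimal Python sketch of a deploy step that refuses to run until ingestion has produced rows. It is not the script used in the talk. The names `wait_for_data`, `deploy_dashboard`, and `DataNotReadyError` are hypothetical, and `count_rows` and `deploy` are placeholder callables standing in for whatever database query and dashboard upload a real pipeline would use.

```python
import time
from typing import Callable


class DataNotReadyError(RuntimeError):
    """Raised when ingestion has not produced enough rows before the deadline."""


def wait_for_data(
    count_rows: Callable[[], int],
    min_rows: int = 1,
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll count_rows() until it reports at least min_rows, or time out."""
    deadline = clock() + timeout_s
    while True:
        rows = count_rows()
        if rows >= min_rows:
            return rows
        if clock() >= deadline:
            raise DataNotReadyError(
                f"only {rows} rows after {timeout_s}s; refusing to deploy"
            )
        sleep(poll_interval_s)


def deploy_dashboard(
    count_rows: Callable[[], int],
    deploy: Callable[[], None],
    **wait_kwargs,
) -> int:
    """Deploy only after the data-presence check passes.

    The check lives inside the deploy step, so a caller (human or agent)
    cannot reach deploy() while the table is still empty.
    """
    rows = wait_for_data(count_rows, **wait_kwargs)
    deploy()
    return rows
```

The design choice is that the ordering is enforced by the function itself: there is no separate "check first" step for a caller to skip or run out of order, and a timeout surfaces as an error instead of an empty dashboard.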

This talk shares an honest, experience-based account of where current AI coding agents succeed and fail at data engineering tasks, and a practical framework for designing pipelines that work reliably with agentic tooling by designing around its limitations rather than in spite of them.

Javier Ramirez

Developer (and Agent) Advocate at QuestDB. Fan of open source, developer communities, and data/ML. All-around happy person. He/him.

Madrid, Spain
