Thibauld Croonenborghs
Data Architect at AE
Brugge, Belgium
Thibauld Croonenborghs (29/06/1989) started his studies at the Belgian Royal Military Academy and worked as an army officer for several years while simultaneously studying computer science.
He started as a software engineer at TomTom working on map data, and has since transitioned into a data engineer and Python/Azure enthusiast.
Boost your Python testing with testcontainers
Writing tests that mimic production environments can be challenging, especially when dealing with dependencies like databases, message brokers, and external services.
We will go through the following topics:
- What are testcontainers?
- Test environment setup
- Writing some test scenarios
- Integration with pytest and unittest
- Demo + some practical examples
From developer to AI-orchestrator: Building a data platform with AI-assisted engineering
AI-assisted coding is rapidly changing how software is written, but what does that mean for data engineers building modern data platforms?
In this session, we explore how AI agents can accelerate development while maintaining architectural control and code quality. Using a minimal data platform setup as a foundation, we demonstrate how AI can move beyond simple autocomplete and become a structured development assistant.
In this session, you will learn:
• What AI-assisted coding really means in a data engineering context
• The tools and setup required to get started
• How to improve AI agents using structured prompts, skills, and context engineering
• The role of MCP and plugins in enabling tool-aware AI workflows
• Common pitfalls, limitations, and how to avoid over-reliance on AI
The session concludes with a live demo where we enhance a minimal dbt project using an AI agent to add tests, refactor models, and extend functionality in real time.
Attendees will gain practical insight into how AI can increase productivity in data platform development — while understanding where human expertise remains essential.
From chaos to clarity: Understanding and using Lakehouses
The emergence of lakehouses marks a significant evolution in data architecture, offering a unified platform that brings together the scalability and flexibility of data lakes with the performance and management features of data warehouses. But what exactly are lakehouses, and why should you care?
In this talk, we'll demystify the lakehouse concept: what it is, how it works, and where it fits in your data stack. We'll explore the benefits of adopting a lakehouse architecture, common use cases, and how it enables more efficient data workflows. You'll also see how to interact with lakehouses using popular open-source technologies, with a demo to illustrate practical usage and integration.
From Apache Spark to Delta Lake: A practical introduction to scalable data engineering
This session will introduce you to the fundamentals of Apache Spark, the powerful distributed computing engine for big data processing. We’ll cover how it works, where it's commonly used, and when it’s the right choice for your data challenges. From there, we'll explore the lakehouse paradigm using Delta Lake and how it seamlessly integrates with Spark to enhance reliability and performance.
Through real-world examples and live demos, you'll see how Spark and Delta Lake work together to support both streaming and batch workloads in a unified architecture. We'll also dive deeper into the lakehouse approach—a modern architecture that combines the openness and flexibility of data lakes with the reliability and performance of data warehouses.
Demystifying Spark Profile Optimizations in Microsoft Fabric
Optimizing Spark workloads in Microsoft Fabric can be a complex endeavor, given the array of available configurations and techniques. Terms like V-Order, Z-Order, and considerations for read-heavy versus write-heavy profiles often add to the confusion. In this session, we'll bring clarity to these concepts, offering a structured overview of each optimization strategy and how it impacts performance at the level of an individual Delta Lake table.
Through practical examples and performance considerations, attendees will gain insights into how to fit these optimizations into their data architecture, whether dealing with read-intensive analytics or write-heavy data ingestion processes. By the end of this talk, you'll be equipped with the knowledge to make informed decisions, turning the complexity of Spark optimizations in Microsoft Fabric into a strategic advantage.
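As a rough illustration of how these knobs surface in practice, the fragment below shows where the two optimizations are applied in a Fabric Spark session. It is a non-runnable configuration sketch, not an endorsement of either setting: it assumes an active Spark session named `spark`, a Delta table named `my_table`, and a column `customer_id`; the V-Order property name matches current Fabric documentation but has varied across runtime versions.

```python
# Config sketch only; assumes a Fabric Spark session `spark` and a Delta table `my_table`.
# Write-heavy profile: skip V-Order's write-time sorting/compression to speed up ingestion.
spark.conf.set("spark.sql.parquet.vorder.enabled", "false")

# Read-heavy profile: co-locate related rows on disk to prune files at query time.
spark.sql("OPTIMIZE my_table ZORDER BY (customer_id)")
```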
Single-node technologies vs. Spark in Microsoft Fabric: Choosing the right tool for the job
In the early days of big data, distributed computing frameworks like Apache Spark and Hadoop became the de facto standards for processing massive datasets. Their ability to handle distributed computing made them essential for tackling large-scale data challenges. However, in today’s diverse data landscape, not all datasets qualify as "big data." For many use cases, single-node processing tools like Polars and DuckDB are proving to be compelling alternatives, offering exceptional performance, simplicity, and lower overhead compared to distributed frameworks.
Microsoft Fabric introduces a unique opportunity to leverage both worlds if necessary. By enabling Python notebooks within its ecosystem, Fabric allows you to build and execute pipelines using these modern single-node technologies. This flexibility ensures you can choose the most efficient tool for your specific workloads.
In this session, we will:
- Examine the evolution of data processing, contrasting distributed frameworks like Spark with single-node solutions.
- Explore how technologies like Polars and DuckDB operate, their strengths, and how they compare to Spark in performance and scalability.
- Evaluate use cases to determine which approach (distributed or single-node) fits best.