Session

Extreme data processing and HPC with Azure Batch

While tools like Apache Spark, RDBMS, Pandas, Polars, and DuckDB handle most data processing needs, some workloads simply don’t fit these technologies. Unstructured data, such as voice files needing transcription or IoT images requiring analysis, often falls outside their scope. Likewise, semi-structured and structured data can become cumbersome when intensive ML or AI model inference is required, or when dealing with countless small files.

In this session, we’ll explore just how simple an end-to-end solution can be using Azure Batch. You’ll learn:

* How to provision and update Azure Batch services with Infrastructure as Code (IaC).
* How to define Jobs and Tasks in any code editor (e.g., VS Code).
* How to set up continuous integration and delivery (CI/CD).
* How to offload results into Microsoft Fabric and other services.

By the end of this session, you’ll see that when traditional technologies like Spark reach their limits, Azure Batch offers a flexible alternative that is easy to implement, language-agnostic, and fully compatible with Git-based workflows.

Christian Henrik Reich

Cloud data architect

Copenhagen, Denmark
