How to optimize Azure Synapse pipelines using SQL database meta data tables

Azure Synapse pipelines make it easy to ingest data from a wide range of sources and to orchestrate complex data workflows.

I am accustomed to using Synapse pipelines for exactly these purposes. I am also used to working across multiple deployment environments (DEV, UAT and PROD).

One challenge I stumbled upon with this kind of setup was the effort and time it took to update the list of tables to be ingested and the Power BI datasets to be refreshed.

For instance, to ingest data from an additional table, I would need to update the relevant parameter in one of the Synapse pipelines in the DEV environment. The change would then be submitted in a pull request, released to UAT and tested there. Only after a few weeks would the new data start being ingested in PROD.

Doesn't this sound too complex for what it really is?

If you agree, then tag along and I will show you how to use a few metadata tables built in a SQL database to optimize Synapse pipelines.

In my talk, I will focus on two examples:
1. using metadata tables to optimize the ingestion of data tables (a minimal sketch of this idea follows below)
2. using metadata tables to optimize the automatic refresh of Power BI datasets from Synapse pipelines
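To make the idea concrete before the walk-through, here is a minimal sketch of what such metadata tables might look like. This is only an illustration under my own assumptions; all schema, table and column names (meta.IngestionTable, meta.PowerBIDataset, IsEnabled and so on) are hypothetical and not necessarily the exact design shown in the session. A Synapse pipeline can read the first table with a Lookup activity and loop over the rows with a ForEach activity, so adding a table to the ingestion run becomes a single INSERT instead of a pipeline change and a full release cycle.

```sql
-- Hypothetical metadata table driving ingestion (names are illustrative).
-- A Synapse Lookup activity reads the enabled rows; a ForEach activity
-- then runs one copy per row.
CREATE TABLE meta.IngestionTable (
    SourceSchema NVARCHAR(128) NOT NULL,
    SourceTable  NVARCHAR(128) NOT NULL,
    IsEnabled    BIT           NOT NULL DEFAULT 1,  -- toggle without a code release
    CONSTRAINT PK_IngestionTable PRIMARY KEY (SourceSchema, SourceTable)
);

-- Hypothetical metadata table driving Power BI dataset refreshes; the
-- workspace and dataset IDs are what the Power BI REST refresh call needs.
CREATE TABLE meta.PowerBIDataset (
    WorkspaceId UNIQUEIDENTIFIER NOT NULL,
    DatasetId   UNIQUEIDENTIFIER NOT NULL,
    IsEnabled   BIT              NOT NULL DEFAULT 1,
    CONSTRAINT PK_PowerBIDataset PRIMARY KEY (WorkspaceId, DatasetId)
);

-- Bringing a new table into the ingestion run is then a single INSERT,
-- with no pipeline edit, pull request, or release:
INSERT INTO meta.IngestionTable (SourceSchema, SourceTable)
VALUES (N'sales', N'Orders');

-- The pipeline's Lookup activity would run a query along these lines:
SELECT SourceSchema, SourceTable
FROM meta.IngestionTable
WHERE IsEnabled = 1;
```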

Note: the session includes a walk-through of the solution and assumes familiarity with Azure Synapse pipelines or Azure Data Factory.

Ivanna Jurkiv Ditlevsen

Data Engineer

Copenhagen, Denmark
