Mike Diehl
Director of Data Engineering and Business Intelligence
Winnipeg, Canada
Mike Diehl is the Director of Data Engineering and Business Intelligence at Improving Winnipeg (formerly Imaginet), a Microsoft partner based in Winnipeg, Manitoba, Canada, with Improving offices across Canada, the United States, and South America. Mike has over 25 years of experience with Microsoft database technologies and is an expert in Agile Analytics and Microsoft Azure DevOps.
Mike has spoken at many regional, national, and international SQL events over his long career, including the PASS Summit in Seattle, Washington; SQLBits in London, UK; various SQL Saturday events across Canada and the USA; and other development conferences in Canada.
Topics
Microsoft Fabric Pipelines - Metadata-driven ETL patterns
Do you have dozens or even hundreds of dataflows and tasks in your data pipelines in Microsoft Fabric? Especially if you've developed SSIS packages and now you're trying pipelines in Fabric, you will likely benefit from using a metadata-driven approach, which can save you significant ETL development effort and drastically reduce the complexity of your Fabric pipelines.
This session shows how to build pipelines in Fabric that are driven by metadata in JSON files in OneLake. Using these patterns, your ETL development velocity can be significantly increased.
Once you've built a pipeline to handle one table load, you can easily load dozens or hundreds of tables in your pipeline just by adding the metadata in OneLake.
I'll show you the patterns I use for managing dependencies (load these tables before those tables), handling incremental loads (load only the data that has changed since the last run), and parameterizing various options.
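To make that concrete, here is a minimal sketch of what one table's metadata entry in OneLake might look like; the layout and field names are illustrative, not the exact schema used in the session:

    {
      "sourceSchema": "Sales",
      "sourceTable": "Orders",
      "targetTable": "sales_orders",
      "loadGroup": 1,
      "dependsOn": ["sales_customers"],
      "isIncremental": true,
      "watermarkColumn": "ModifiedDate",
      "lastWatermarkValue": "2024-01-01T00:00:00Z"
    }

A pipeline can read entries like this with a Lookup activity and fan out over them with a ForEach activity, so the same copy logic is reused for every table.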
If you learn these patterns and implement them yourself, you can spend far less time coding ETL processes and increase your productivity greatly.
Using SQL Database Projects in SQL Server Data Tools
Do you know how many orphaned objects are in your databases? Or how many references there are to columns, views, tables, or stored procedures that were removed or renamed long ago and forgotten? Are you reluctant to change the name of a table or column because you don't know all the places it could be referenced?
Tracking the design changes of a SQL database over time and across multiple environments (dev, test, and prod) can be difficult. And if you have multiple team members who can make changes, then coordinating those changes is also challenging.
A database project in SQL Server Data Tools (SSDT, or Visual Studio) defines the schema of a database using the practice of "Desired State Configuration". This means that all the scripts in the database project are CREATE statements, and the scripts as a whole define the desired state of the database. Using this project source code, we can deploy to any target database and the deployment process will apply all the schema changes necessary to bring that target to the desired schema.
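For example (the table and column names are hypothetical), a table script in the project states only the shape you want, never an ALTER:

    -- dbo\Tables\Customer.sql in the database project
    CREATE TABLE dbo.Customer
    (
        CustomerId   INT           NOT NULL CONSTRAINT PK_Customer PRIMARY KEY,
        CustomerName NVARCHAR(200) NOT NULL,
        CreatedDate  DATETIME2     NOT NULL CONSTRAINT DF_Customer_CreatedDate DEFAULT SYSUTCDATETIME()
    );

At deployment time the tooling compares this desired state with the target database and generates whatever ALTER statements are needed to close the gap.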
In this session I will show the important features of database projects and the development practices necessary to ensure that your deployments are as fast as possible and retain the existing data that may be present in the target database.
I'll show the value of strong schema checking and refactoring support for object renaming, and how to implement default security, automated data patches, and support for referencing external objects, such as cross-database table references.
Finally, I'll show how you can use a Git repository in Azure DevOps with code reviews and project compilation checks as part of every check-in.
If you use database projects in SSDT, you can be much more confident in the quality of your SQL code and in the reliability of your database deployments.
Untangling the Spaghetti: Patterns and Anti-Patterns in SQL Development
When you see examples of spaghetti code, you wish it came with a warning: "Abandon all hope, ye who enter here!"
Spaghetti code is a term used to describe code that is tangled and difficult to trace and maintain. No one wants to update spaghetti code because it is hard to tell what it does, how it is used, who uses it, and what the impact of changing it might be.
Frequently I encounter SQL servers and databases in enterprises that contain spaghetti code - usually in the form of three- and four-part object references (cross-database and cross-server references), or databases that lack clear, well-defined interfaces using schema names and database roles.
Three- and four-part name references in your SQL are a big "code smell" (and I don't mean a nice smell)! They are an indication that you have bad coupling and poor cohesion; coupling and cohesion are coding principles that are well-known by software developers and that database developers *should* also practice.
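As a purely illustrative example (the object names are hypothetical), compare a query that hard-codes a four-part reference with one that hides the external dependency behind a local synonym, just one of several possible decoupling techniques:

    -- Tight coupling: the server and database names are baked into the query
    SELECT o.OrderId, c.CustomerName
    FROM dbo.Orders AS o
    JOIN CrmServer01.CRM.dbo.Customers AS c ON c.CustomerId = o.CustomerId;

    -- Looser coupling: the external reference is declared once, in one place
    CREATE SYNONYM dbo.Customers FOR CrmServer01.CRM.dbo.Customers;

    SELECT o.OrderId, c.CustomerName
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId;

Now only the synonym has to change when the CRM data moves, and it is easy to find every local object that depends on it.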
In this session I will show the most grievous bad practices, what sorts of problems and restrictions they cause, and how the code can be rewritten to achieve much better cohesion and looser coupling across all the data in an enterprise.
Any DBA or developer who uses more than two-part table or view names in their queries or stored procedures should attend!
How to support SQL Unit Testing with tSQLt, SSDT, and Azure DevOps
Do you use database projects in SQL Server Data Tools (SSDT)? Perhaps you also want to test your database code using a unit test framework?
One of the popular unit testing frameworks for SQL is tSQLt, which lets you write unit tests as SQL stored procedures that verify the correctness of your code. As a framework, it installs a number of supporting objects in a target database, but you don't want to deploy those testing objects to your production database - and you probably do want to keep your unit tests in your source control repository. How can you do this?
This session shows how to use database projects in SQL Server Data Tools (SSDT, or Visual Studio) with Git repos in Azure DevOps to write unit tests using the tSQLt framework. You'll learn how to structure your solution in SSDT to keep the target database, the framework code, and the unit tests separate. You'll also see how to create an Azure DevOps build pipeline that deploys and executes the unit tests on each check-in, so that all unit tests must pass before the code changes are accepted.
(This session only covers coding unit tests very briefly - it focuses on how to organize your unit testing code and how to deploy and run the unit tests in Azure DevOps.)
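For context, a tSQLt test is just a stored procedure in a test class schema; a minimal, hypothetical example (the table and function under test are made up) looks like this:

    -- Create a schema that tSQLt recognizes as a test class
    EXEC tSQLt.NewTestClass 'OrderTests';
    GO
    CREATE PROCEDURE OrderTests.[test GetOrderTotal sums the line amounts]
    AS
    BEGIN
        -- Arrange: swap the real table for an empty fake and insert known rows
        EXEC tSQLt.FakeTable 'dbo.OrderLines';
        INSERT INTO dbo.OrderLines (OrderId, Amount) VALUES (1, 10.00), (1, 15.00);

        -- Act
        DECLARE @actual MONEY = dbo.GetOrderTotal(1);

        -- Assert
        DECLARE @expected MONEY = 25.00;
        EXEC tSQLt.AssertEquals @expected, @actual;
    END;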
If you want to add unit tests to your database project in SSDT, come to this session to learn the best way to do it.
Azure DevOps Release Pipelines for SQL Databases, Azure Data Factories, and Analysis Models
Is your team still manually deploying databases, data factories, and analytical models (SQL Server/Azure Analysis Services databases)? Do you want to have well-defined, separate environments for development, testing, and production but your team can't consistently deploy the right code to the right place at the right time?
Azure DevOps features such as git repos and Azure Pipelines can consistently and reliably deploy your source code to target environments such as testing, pre-production, or production, on demand or automatically when code has been checked in.
Having an automated process to deploy these objects increases the velocity of an Agile team, improves the quality of deployments, and reduces their risk by making them repeatable and consistent.
This session will show how to deploy SQL Databases, Data Factories, and Analysis Services data models to Azure using Azure DevOps release pipelines.
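To give a flavour of what this looks like, here is a rough sketch of a YAML pipeline step that publishes a dacpac (built from an SSDT database project) to an Azure SQL database; the service connection, server, and file names are placeholders, and the authentication inputs are omitted:

    steps:
      - task: SqlAzureDacpacDeployment@1
        displayName: Deploy database to the test environment
        inputs:
          azureSubscription: 'my-azure-service-connection'   # placeholder
          ServerName: 'myserver.database.windows.net'        # placeholder
          DatabaseName: 'MyDatabase'                         # placeholder
          deployType: 'DacpacTask'
          DacpacFile: '$(Pipeline.Workspace)/drop/MyDatabase.dacpac'
          # authentication inputs (SQL login or service principal) omitted for brevity

Data factories and Analysis Services models can be deployed from the same pipeline with their own tasks or scripts, so every environment gets the same repeatable steps.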
Azure Data Factory Metadata-Driven ETL Patterns
Do you have dozens or even hundreds of datasets and tasks in your pipelines in Azure Data Factory (ADF)? Especially if you've developed SSIS packages and now you're trying Azure Data Factory, you will likely benefit from using a metadata-driven approach, which can save you significant ETL development effort and drastically reduce the complexity of your Azure Data Factories.
This session shows how to build pipelines in Azure Data Factory that are driven by data in a table. Using these patterns, your ETL development velocity can be significantly increased.
Once you've built a pipeline to handle one table load, you can easily load dozens or hundreds of tables in your pipeline just by updating the metadata in one table.
I'll show you the patterns I use for managing dependencies (load these tables before those tables), handling incremental loads (load only the data that has changed since the last run), and parameterizing various options.
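For illustration only (the table and column names are hypothetical), the metadata table that drives all of this can be quite small:

    CREATE SCHEMA etl;
    GO
    CREATE TABLE etl.TableLoadConfig
    (
        ConfigId           INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_TableLoadConfig PRIMARY KEY,
        SourceSchemaName   SYSNAME   NOT NULL,
        SourceTableName    SYSNAME   NOT NULL,
        TargetTableName    SYSNAME   NOT NULL,
        LoadGroup          INT       NOT NULL,   -- lower groups are loaded before higher ones
        IsIncremental      BIT       NOT NULL,
        WatermarkColumn    SYSNAME   NULL,       -- column used to detect changed rows
        LastWatermarkValue DATETIME2 NULL,       -- updated after each successful load
        IsEnabled          BIT       NOT NULL CONSTRAINT DF_TableLoadConfig_IsEnabled DEFAULT (1)
    );

In the data factory, a Lookup activity reads the enabled rows and a ForEach activity runs the same parameterized copy logic for each one.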
If you learn these patterns and implement them yourself, you can spend far less time coding ETL processes and increase your productivity greatly.
DevOps for Databricks: Bundle up!
How do you deploy your Databricks notebooks and workflows to different environments for testing and production use?
The Databricks command-line interface now supports deploying notebooks, workflows, and other workspace objects from source files, using YAML definitions known as Asset Bundles.
This session will show the basics of creating bundle files and using Azure DevOps to deploy workflows and notebooks into Azure Databricks.
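As a rough sketch (the bundle name, workspace hosts, notebook path, and job definition are placeholders, and cluster settings are omitted), a minimal databricks.yml might look something like this:

    bundle:
      name: nightly_etl

    targets:
      dev:
        mode: development
        workspace:
          host: https://adb-1111111111111111.1.azuredatabricks.net   # placeholder
      prod:
        mode: production
        workspace:
          host: https://adb-2222222222222222.2.azuredatabricks.net   # placeholder

    resources:
      jobs:
        nightly_etl_job:
          name: nightly_etl_job
          tasks:
            - task_key: load_tables
              notebook_task:
                notebook_path: ./notebooks/load_tables.py
              # cluster settings omitted for brevity

A DevOps pipeline can then call the CLI - for example, databricks bundle validate followed by databricks bundle deploy -t dev - to push these definitions into the target workspace.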