Erwin de Kreuk
Data Platform MVP | Lead Data and AI | Public Speaker | InSpark | Innovate to Accelerate
Erwin de Kreuk is a passionate and very experienced Microsoft Solution Architect.
He works as a Principal Consultant / Lead Data and AI for InSpark in the Netherlands and speaks at national and international data community events. He has been awarded the Data Platform MVP title.
He has been working in the world of data on the Microsoft Platform for the last 14 years, and in the last 6 years he has shifted his focus to the Azure Platform.
Resolving complex customer cases and technical issues is part of his day-to-day work. In addition to this work, he is a member of the Technology Board within InSpark and leads a team of highly experienced Data Experts in the field of the Microsoft Data Platform.
He is eager to help customers get the most added value out of their complex analytics environments, with a strong focus on solutions in the Azure Cloud (Platform as a Service).
As a Technology Board member, he is always investigating the latest (vs newest) possibilities and opportunities and sharing his enthusiasm with his colleagues, the community, and customers. He is one of the main stakeholders for the InSpark Solution (Managed) Oxygen, a Modern Data platform Estate as-a-service.
Area of Expertise
Securing Azure Synapse: Best Practices for Data Protection, Compliance, and Performance
Discover the essential steps to build a secure Azure Synapse Solution in today's data-driven world. This session provides comprehensive knowledge and practical guidance on implementing robust security measures, including the Cloud Adoption Framework (CAF), Well-Architected Framework (WAF), Data Exfiltration Protection, (Managed) Private Endpoints, and secure connections.
During this session, you will:
• Learn how to implement a secure and compliant Azure Synapse Solution within the Cloud Adoption Framework (CAF) and its core components.
• Delve into the five pillars of the Well-Architected Framework, understanding how to apply them to Azure Synapse Solution for enhanced security, reliability, performance, and cost optimization.
• Gain techniques to implement data exfiltration protection measures, including access controls, data classification, and auditing, safeguarding your sensitive data from unauthorized extraction.
• Discover the benefits of (Managed) Private Endpoints and learn how to establish secure connections between your Azure Synapse workspace and data sources.
• Learn various methods to secure connections, including Azure Virtual Network (VNet) service endpoints, Azure Private Link, and SSL/TLS encryption, all necessary for building a secure Azure Synapse Solution.
• Experience a mix of slides, demos, and hands-on exercises throughout the session.
By the end of the session, you will have a solid understanding of how to build a secure Azure Synapse solution, integrating CAF, the Well-Architected Framework, data exfiltration protection, (managed) private endpoints, and secure connections. You will be equipped with actionable insights to ensure the security and compliance of your Azure Synapse workloads, effectively protecting your data.
Note: This session assumes a basic understanding of Azure Synapse Analytics and cloud services.
How to govern the Microsoft Intelligent Data Platform
So you have heard about the Microsoft Intelligent Data Platform, which includes Azure Synapse Analytics, Power BI, and Microsoft Purview, and have started gaining your first experience with it?
Then it is time we talk about the importance of data governance, data classification, and data labeling to maintain data security and compliance.
In this full-day workshop, you will learn how to protect your sensitive data in reports and dashboards using techniques like sensitivity labels in Azure Synapse Analytics or labels, policies, and rules in Power BI. We will also walk through the steps required to extend your Power BI lineage with lineage from your sources with the help of Purview and Synapse Analytics.
In addition, we will cover setting up access controls and permissions to ensure that only authorized users can access sensitive data.
We will have a good mix of slides, demos, and hands-on exercises, allowing you to apply what you have learned using the Microsoft Intelligent Data Platform!
Make your Azure Synapse Analytics a stronghold and win the workshop!
Want to learn how to build a secure by design Azure Synapse Solution? Join us for an action-packed day!
The goal of today’s workshop is to provide guidance on building a secure and cost-effective data adventure and on making the technologies work together seamlessly and securely.
The workshop is led by the **Quiz Master**, so gather your team to follow the workshop.
Building a secure by design Azure Synapse Analytics adventure is not something we roll out by default; we have to define our strategy well in advance.
In the morning we will start with the first adventure. This is the design adventure, in which the teams will use the different security design principles from the Well-Architected Framework (WAF).
The next adventure is the deployment. We should work carefully here: some configuration settings can only be set at creation time and are irreversible, so making the right decisions is very important. During the first part of this adventure, we will learn how to configure, build, and secure an Azure Synapse Analytics Solution.
Data Exfiltration Protection, (Managed) Private Endpoints, and securing connections are topics in this adventure.
In the afternoon we will finalize the deployment adventure, with a strong focus on how to manage access control before we start the last adventure.
Once the solution is built and the Synapse Workspace is ready for use, the teams will look at how they can build and transform the Pipelines in Azure Synapse Analytics in a safe way with the help of Azure Key Vault and by applying policies. Policies ensure that we can enforce certain configuration settings.
At the end of the day, everyone will know exactly how to build their Azure Synapse Analytics Solution and what building secure by design adventures will do to their costs.
With the final quiz we will see which team has won the workshop.
This workshop is suitable for a mix of data engineers, data scientists, and cloud engineers, each with their own security strengths and weaknesses.
Provision users and groups from Azure Active Directory to Azure Databricks
This session will cover provisioning users and groups from Azure Active Directory (AAD) to Azure Databricks using System for Cross-domain Identity Management (SCIM).
The session will include an overview of SCIM and its integration with Azure Databricks, as well as a walkthrough of the steps to provision users and groups using SCIM. Topics such as user and group mappings, SCIM configuration, and user and group management options will also be discussed.
We will also discuss the different options for managing user and group identities in Azure Databricks, including how to handle user and group provisioning, deprovisioning, and updates.
By the end of this session, attendees will have a comprehensive understanding of how to provision users and groups from AAD to Azure Databricks using SCIM, and how to manage user and group identities in a scalable and secure manner in the cloud.
Extending Power BI governance with Microsoft Purview
In this deep dive session, we will explore how Power BI and Microsoft Purview can work together to provide a comprehensive data governance and analytics solution. We will start by discussing the key features of each platform and how they complement each other, and also make clear where they differ.
Next, we will dive into real-world examples of how Power BI and Purview can be used together to gain insights from data. This will include using Purview to discover, classify, and catalog data sources. Before you can scan your Power BI tenant, the scans have to be configured, so we will show you how to set them up within your own tenant and in a cross-tenant situation.
We will also discuss best practices and the do's and don'ts for integrating Power BI and Purview into your organization's data governance strategy, including considerations for data security and compliance.
You will learn how you can extend your Power BI lineage with lineage from your sources with the help of Purview.
By the end of this session, you will have a clear understanding of how Power BI and Purview can be used together to drive data-driven decision making in your organization.
Designing and managing a cost-effective data platform in Azure Synapse Analytics
Azure Synapse Analytics is a powerful data platform, but it can also be expensive if you don't know what you're doing. In this session, we will go through the different components of Azure Synapse Analytics and discuss how to design a cost-effective data platform.
We will cover topics such as choosing the right pricing tier, optimizing data storage and processing, and leveraging built-in cost management features. We will also discuss how to optimize your data platform for cost efficiency by using features such as serverless compute, pre-purchased compute, and reserved capacity.
You will leave with a better understanding of how to design and manage an Azure Synapse Analytics platform that is cost-efficient and meets your organization's needs.
Topics we will cover:
- Pricing Models for compute Resources in Azure Synapse Analytics
- Storage Types and Tiers
- Pricing Models
- Optimizing Cost with Resource-Scaling
- Using serverless compute and reserved capacity options for cost savings
Building a Metadata driven Synapse Analytics ELT Framework
A metadata-driven ELT framework in Azure Synapse Analytics is a way of organizing, optimizing, and managing data pipelines that uses metadata to define and control the flow of data from source to destination. This can be useful for organizations that have a large number of data pipelines and want more control over how data is processed and moved between systems.
In this session, we will discuss the benefits of using a metadata-driven approach to managing data pipelines in Azure Synapse Analytics. Our discussion will include practical examples and best practices for implementing a metadata-driven ELT framework in your organization based on the Medallion Lakehouse architecture. We will also provide you with code samples and walk you through how to get started with implementing this framework in your own work.
This session is ideal for data engineers and other technical professionals who want to learn how to optimize their data pipelines in Azure Synapse Analytics. By the end of the session, you will have a clear understanding of how to use a metadata-driven approach to manage and maintain data pipelines, enabling better control and visibility over your data processes.
Data Governance with Microsoft Purview - Ask the Experts
In this open session, it's your chance to ask our panel of 3 MVPs about data governance for your business using Microsoft Purview.
Managing Data Access at Scale with Purview Data Policies
Microsoft Purview's Data Policies app is a powerful tool for managing access to data systems across your organization's entire data estate. In this session, we will explore the benefits of using data policies to manage access to data, and how they can help you to streamline and scale access provisioning.
The big advantage of these policies is that you do not have to apply RBAC roles and that you get a single pane of glass: a cloud-based experience that enables at-scale provisioning of access to data.
Are you a data consumer? Then the self-service access policy is an easy way to request access to data while browsing or searching for data.
Are you a data producer? Then access policies will help you to easily create and publish access to data sources.
Are you a DBA or Developer? Then DevOps policies are a simple, central, cloud-based experience that allows you to provision access at scale to DBAs and other DevOps users.
In this session, I will explain and show you how to create and publish data policies and DevOps policies, and how to set up a self-service access workflow.
Create an Azure Synapse Lake Database without writing code
So I don't have to write any code to build my facts and dimensions? Yes, you read that correctly.
Within Azure Synapse Analytics, a new tool is available: the map data tool. The map data tool allows you to easily map your data from a source into the target tables in the Synapse Lake Database.
Map Data is a guided experience where you can generate a mapping data flow without having to start from a blank canvas. Once you have created the mappings then you can easily generate a scalable mapping data flow in a Synapse Pipeline.
After you have published the Synapse Pipelines, you can run these Pipelines and then visualize your generated data model in Power BI. Sounds great, doesn't it?
I will show you how the map data tool works and how to visualize the data in Power BI afterwards in a step-by-step demo-based session. After this session you will have the knowledge to build and visualize your first Synapse Lake Database.
Lifecycle Management for Azure Synapse Analytics
Building a secure data platform by design is very important these days. How do we ensure that we keep our InfoSec happy and that our policies do not fail?
Connection strings, usernames, and passwords need to be stored as secrets in Azure Key Vault.
• How can we apply the secrets in Azure Synapse?
• How do we deploy Synapse Pipelines or code in Azure DevOps to Test, Acceptance and Production environments?
• Can this be set up dynamically?
During this session I will walk you through some design decisions and answer the questions above.
You will learn how to build and validate your Synapse Workspace in Azure DevOps, how to secure your connection strings and finally deploy your code and pipelines (CI/CD).
Basic knowledge of Azure Synapse and Azure DevOps is helpful for following this session.
By the end of the session, you're ready to implement the deployment in your projects and to make your InfoSec happy.
Streamline Data Governance with Microsoft Purview
In this session, I will give you a comprehensive overview of Microsoft Purview, a unified data governance solution designed to help organizations manage their data more effectively.
Data governance is critical to any organization's data strategy as it ensures data quality, security, and compliance. However, it can be a complex process, especially when managing large volumes of data across various systems and platforms.
During this session, we will be discussing the key features of Microsoft Purview, including data discovery and classification, data cataloging and management, and data lineage and mapping. We will also explore how Purview integrates with other Azure services such as Azure Synapse Analytics, Azure Data Factory, and Azure Databricks to enable end-to-end data governance.
This session is perfect for data professionals, architects, and CDOs who want to learn more about how Purview can help them overcome their data governance challenges.
By the end of the session, you will have a clear understanding of how Purview can streamline data governance and enhance data quality, which will ultimately lead to better decision-making and collaboration within your organization.
Microsoft Purview: what does this mean to me as an organization?
Microsoft Purview brings together data governance from Microsoft Data and AI, along with compliance and risk management from Microsoft Security, and is now complemented with many other solutions.
But what's in it for me as an organization?
• Which solutions does Microsoft Purview actually include?
• Which solutions can be easily deployed in my organization?
• Which portals can I use now and for which solution?
These are the questions we will answer during this session. At the end of the session, you will know what Microsoft Purview could mean for your organization.
How to use and create Data Lineage in Microsoft Purview?
The use of data lineage is a hot topic for many organizations.
Many organizations struggle with answers to the following questions:
• I want to adjust a measure, but where do I have to adjust it and where does the data come from?
• What will be the effect on my data if I rename this column in the source?
• Can I get a visual overview of my data estate, including how the data has been transformed?
In this session, we will explore the concept of data lineage and how it can be used in Microsoft Purview to provide a comprehensive understanding of data flows and data transformation across various data sources. We will cover how to create and visualize data lineage in Purview, including the use of Purview scanners and connectors and the integration with Azure Synapse Analytics. We will also look at how to use custom lineage components with Apache Atlas for unsupported data sources.
The session is intended for data professionals, such as data architects, data engineers, data analysts, and data scientists, who are interested in using Microsoft Purview for data governance and management. The session would be beneficial for those who are already familiar with basic concepts of data management and would like to learn more about using data lineage in Purview.
Get control of your Azure Synapse environment, define your access control the right way today!
Azure Synapse Analytics is Microsoft's analytical engine that brings together data integration, enterprise data warehousing, and big data analytics.
As we now take a more holistic approach, more types of user groups will use the platform, which makes setting up an authorization matrix in advance all the more important. The following topics will be covered during this session:
• What Azure AD roles do we need to deploy an Azure Synapse Workspace?
• How can we simplify access control by using security groups that are aligned with people's job roles?
• How do we handle different user personas in Azure Synapse Analytics? For example, what is a Data Scientist or Data Engineer allowed to do and what not?
What access control settings do we need to have to:
• Store code in Azure DevOps
• Debug a pipeline
• Run a Notebook
During this session I would like to take you through some practical examples of how you can set up these roles for your Azure Synapse environment in order to get in control of it.
Scale your SQL Pool dynamically in Azure Synapse
In this lightning talk I explain how you can scale your SQL Pool in Azure Synapse Analytics up or down using a Synapse Pipeline: an easy way to save some cost in your analytics environment.
Getting started with building your Azure Synapse environment
In this session, we will explore how to create an Azure Synapse environment, step by step. Azure Synapse Analytics is a cloud-based analytics service that brings together big data and data warehousing. It offers a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs.
During this session, I will cover the pre-requisites necessary for creating an Azure Synapse environment. You will be guided through the steps to create an Azure Synapse Workspace and to create Synapse SQL, Serverless and Spark pools for data processing and ingestion. We'll also explore the various methods for loading data into Synapse.
Finally, I will demonstrate how you can easily query your data.
By the end of this session, you will have a clear understanding of how to create an Azure Synapse environment, load data into it, and query the data efficiently.
DATA:Scotland 2023 Upcoming
Data Platform Next Step Upcoming