Erwin de Kreuk
Data Platform MVP | Lead Data and AI | Public Speaker | InSpark | Innovate to Accelerate
Erwin de Kreuk is a passionate and very experienced Microsoft Solution Architect.
He works as a Principal Consultant / Lead Data and AI for InSpark in the Netherlands and speaks at national and international data community events. He has been recognized as a Data Platform MVP.
He has worked in the world of data on the Microsoft platform for the last 14 years, and over the last 6 years he has shifted his focus to the Azure platform.
Resolving complex customer cases and technical issues is part of his day-to-day work. In addition to this work, he is a member of the Technology Board within InSpark and leads a team of highly experienced data experts in the field of the Microsoft Data Platform.
He is eager to help customers get the most added value out of their complex analytics environments, with a strong focus on solutions in the Azure cloud (Platform as a Service).
As a Technology Board member, he is always investigating the latest possibilities and opportunities and sharing his enthusiasm with his colleagues, the community and customers. He is one of the main stakeholders for the InSpark solution (Managed) Oxygen, a Modern Data Platform Estate as-a-service.
Area of Expertise
Microsoft Fabric is an all-in-one analytics solution that enables you to build and manage lakehouses, data warehouses, and data integration pipelines with ease and efficiency. In this session, you will learn how to use the medallion architecture design to organize and transform your data across bronze, silver, and gold layers of a lakehouse for optimized analytics. You will also learn how to connect to your lakehouse using SQL endpoints and Power BI, and how to ensure the security and governance of your data. By the end of this session, you will have a solid understanding of the benefits and best practices of using the medallion architecture in Microsoft Fabric.
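As a rough illustration of the bronze, silver, and gold layering described above, here is a minimal sketch in plain Python. It assumes hypothetical order records and invented function names (`to_silver`, `to_gold`); it does not use any Fabric or Spark API, and a real lakehouse would typically implement these steps in notebooks or pipelines over Delta tables.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in the bronze layer (assumption).
bronze = [
    {"order_id": "1", "country": " nl ", "amount": "10.50"},
    {"order_id": "2", "country": "AT", "amount": "7.25"},
    {"order_id": "2", "country": "AT", "amount": "7.25"},  # duplicate row
    {"order_id": "3", "country": "NL", "amount": "oops"},  # invalid amount
]


def to_silver(rows):
    """Validate, type, and de-duplicate bronze rows."""
    seen_ids = set()
    silver_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        if row["order_id"] in seen_ids:
            continue  # drop duplicates on the business key
        seen_ids.add(row["order_id"])
        silver_rows.append(
            {
                "order_id": int(row["order_id"]),
                "country": row["country"].strip().upper(),
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(rows):
    """Aggregate silver rows into a reporting-ready total per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

The point of the layering is that each step has one job: bronze keeps the data as it arrived, silver enforces quality rules, and gold holds the shape that reports consume.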
Leveraging Microsoft Fabric with Existing Azure Data Services: Integration and Migration Strategies
"Embark on a high-flying data governance adventure with Wolfgang, an Austrian data governance expert, and Erwin, a Dutch data stewardship virtuoso. In this aviation-themed session, participants will explore the dynamic landscape of data governance within Microsoft Fabric and Microsoft Purview.
Tailored for beginners, this session will cover the foundational principles of data governance, providing a solid framework for effective implementation. Attendees will be taken on an interactive tour, leveraging demo movies to gain practical insights into harnessing the capabilities of Microsoft Fabric and Purview.
Wolfgang and Erwin will guide participants through strategies for seamless data governance operations, ensuring compliance and security while optimizing workflows. Special attention will be given to the complexities of international data governance, enabling participants to navigate global data landscapes with confidence.
Whether you're a novice or seasoned professional, this 100-minute journey promises a turbulence-free introduction to data governance. Fasten your seat belts for a session packed with actionable insights, empowering you to chart a clear course for data governance success."
Explore the power of Microsoft Fabric in an intensive and informative workshop led by three expert speakers. Together, they will take you on a deep dive into the extensive range of possibilities that Fabric has to offer.
The day begins with a brief introduction, after which the workshop kicks into high gear. From setting up a Medallion Architecture with integrated Data Pipelines and Notebooks to harnessing dataflows, you'll gain a comprehensive understanding.
Discover how to enrich your data with Azure OpenAI, and then learn how to visualize this information in Power BI and how to implement proper governance.
We wrap up the day by setting alerts on your data with the new kid in town, Data Activator.
Experience with Data Factory, Data Engineering, Data Science, Data Warehouse, Power BI, and Data Activator.
Basic familiarity with Fabric is recommended.
Demonstrations will be provided throughout the journey.
This program offers a comprehensive learning experience for attendees interested in Microsoft Fabric and its applications in data management and analytics. The combination of theoretical overviews and practical demonstrations ensures that participants leave with a thorough understanding of the subject matter. It's a unique and engaging way to learn about these technologies while in transit.
In today's fast-paced business landscape, real-time analytics have become indispensable for organizations that want to stay ahead of the competition. By providing instant access to up-to-date information and enabling data-driven decision-making, real-time analytics are proving critical in a variety of industries and scenarios, from manufacturing operations to cybersecurity and beyond. Microsoft Fabric offers a comprehensive set of tools and services that facilitate the development of robust real-time analytics capabilities.
Join this session and dive into the world of real-time analytics with Microsoft Fabric. Learn how to build an Eventstream through a step-by-step approach, capturing and processing data as it happens for reporting and decision-making purposes. Understand the key features and functionality of Microsoft Fabric, including real-time data processing, advanced analytics, flexible visualizations, and custom alerts. Learn how real-time analytics can revolutionize IoT analytics, telemetry data analytics, human and system log investigation, and more.
Whether you are a data engineer, a data analyst or a business decision maker, this session will provide valuable insights.
In this session, we delve deeper into OneLake, a crucial component of Microsoft Fabric that serves as a data lake-as-a-service solution. OneLake enables organizations to avoid data silos and centrally store and manage data without the need to build or maintain a data lake themselves. It functions as a data storage platform, much like OneDrive does for files.
During this session, we explore how OneLake works and why it is a true game changer. We discuss the various capabilities of OneLake, including out-of-the-box governance features such as data lineage, data protection, certification, and catalog integration. These features facilitate streamlined data management and enhanced compliance.
Furthermore, we examine the integration of OneLake with other services, such as Power BI. Discover how a sensitivity label applied to a OneLake file is automatically applied to related Power BI datasets, ensuring consistent security and compliance.
Whether you're a data engineer, data scientist, or analyst, this session provides valuable insights into how OneLake can help centralize and manage data while leveraging the scalability, security, and advanced capabilities of Microsoft Fabric. Get ready to explore the possibilities of OneLake and understand why it is a critical component of the modern data landscape.
I look forward to welcoming you to this engaging session and showing you how OneLake can make a difference in your organization.
Discover the essential steps to build a secure Azure Synapse Solution in today's data-driven world. This session provides comprehensive knowledge and practical guidance on implementing robust security measures, including the Cloud Adoption Framework (CAF), Well-Architected Framework (WAF), Data Exfiltration Protection, (Managed) Private Endpoints, and secure connections.
During this session, you will:
• Learn how to implement a secure and compliant Azure Synapse Solution within the Cloud Adoption Framework (CAF) and its core components.
• Delve into the five pillars of the Well-Architected Framework, understanding how to apply them to Azure Synapse Solution for enhanced security, reliability, performance, and cost optimization.
• Gain techniques to implement data exfiltration protection measures, including access controls, data classification, and auditing, safeguarding your sensitive data from unauthorized extraction.
• Discover the benefits of (Managed) Private Endpoints and learn how to establish secure connections between your Azure Synapse workspace and data sources.
• Learn various methods to secure connections, including Azure Virtual Network (VNet) service endpoints, Azure Private Link, and SSL/TLS encryption, necessary for building a secure Azure Synapse Solution.
• Experience a mix of slides, demos, and hands-on exercises throughout the session.
By the end of the session, you will have a solid understanding of how to build a secure Azure Synapse solution, integrating CAF, the Well-Architected Framework, data exfiltration protection, (managed) private endpoints, and secure connections. You will be equipped with actionable insights to ensure the security and compliance of your Azure Synapse workloads, effectively protecting your data.
Note: This session assumes a basic understanding of Azure Synapse Analytics and cloud services.
So you have heard about the Microsoft Intelligent Data Platform, which includes Azure Synapse Analytics, Power BI, and Microsoft Purview, and have started gaining your first experience with it?
Then it is time we talk about the importance of data governance, data classification, and data labeling to maintain data security and compliance.
In this full-day workshop, you will learn how to protect your sensitive data in reports and dashboards using techniques like sensitivity labels in Azure Synapse Analytics or labels, policies, and rules in Power BI. We will also walk through the steps required to extend your Power BI lineage with lineage from your sources with the help of Purview and Synapse Analytics.
In addition, we will cover setting up access controls and permissions to ensure that only authorized users can access sensitive data.
We will have a good mix of slides, demos and hands-on exercises, allowing you to apply what you have learned using the Microsoft Intelligent Data Platform!
Want to learn how to build a secure by design Azure Synapse Solution? Join us for an action-packed day!
The goal of today’s workshop is to provide guidance on building a secure and cost-effective data adventure and on making the technologies work together seamlessly and securely.
The workshop is led by the **Quiz Master**, so gather your team to follow the workshop.
Building a secure-by-design Azure Synapse Analytics adventure is not something we roll out by default; we have to define our strategy well in advance.
In the morning we will start with the first adventure. This is the design adventure, in which the teams will use the different security design principles from the Well-Architected Framework (WAF).
The next adventure is the deployment. We have to work carefully here: some configuration settings can only be chosen at creation time and are irreversible, so making the right decisions is very important. During the first part of this adventure, we will learn how to configure, build and secure an Azure Synapse Analytics solution.
Data Exfiltration Protection, (Managed) Private Endpoints and securing connections are among the settings covered in this adventure.
In the afternoon we will finalize the deployment adventure, with a strong focus on how to manage access control before we start the last adventure.
The solution is now built and the Synapse workspace is ready for use. The teams will look at how they can build and transform pipelines in Azure Synapse Analytics in a safe way with the help of Azure Key Vault and by applying policies. Policies ensure that we can enforce certain configuration settings.
At the end of the day, everyone will know exactly how to build their Azure Synapse Analytics solution and what building secure by design will do to their costs.
With the final quiz we will see which team has won the workshop.
This workshop is suitable for a mix of data engineers, data scientists and cloud engineers, each with their own security strengths and weaknesses.
This session will cover provisioning users and groups from Azure Active Directory (AAD) to Azure Databricks using System for Cross-domain Identity Management (SCIM).
The session will include an overview of SCIM and its integration with Azure Databricks, as well as a walkthrough of the steps to provision users and groups using SCIM. Topics such as user and group mappings, SCIM configuration, and user and group management options will also be discussed.
We will also discuss the different options for managing user and group identities in Azure Databricks, including how to handle user and group provisioning, deprovisioning, and updates.
By the end of this session, attendees will have a comprehensive understanding of how to provision users and groups from AAD to Azure Databricks using SCIM, and how to manage user and group identities in a scalable and secure manner in the cloud.
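To give a feel for what provisioning and deprovisioning look like at the protocol level, here is a minimal sketch that builds SCIM 2.0 request bodies using the standard schema URNs from RFC 7643 and RFC 7644. The helper names and the example user are assumptions for illustration; the endpoint URL, authentication, and the exact attribute set accepted by Azure Databricks are deliberately left out and should be checked against the product documentation.

```python
import json

# Standard SCIM 2.0 schema identifiers (RFC 7643 / RFC 7644).
USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def scim_create_user(user_name, display_name):
    """Body for creating a user (hypothetical helper, sent with an HTTP POST)."""
    return {
        "schemas": [USER_SCHEMA],
        "userName": user_name,
        "displayName": display_name,
        "active": True,
    }


def scim_deactivate_user():
    """Body for deprovisioning a user by setting active to false (HTTP PATCH)."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }


# The example identity below is invented for illustration.
create_body = scim_create_user("jane.doe@example.com", "Jane Doe")
patch_body = scim_deactivate_user()
print(json.dumps(create_body, indent=2))
print(json.dumps(patch_body, indent=2))
```

In practice the identity provider generates and sends these requests for you; seeing the payloads mainly helps when troubleshooting attribute mappings.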
In this deep dive session, we will explore how Power BI and Microsoft Purview can work together to provide a comprehensive data governance and analytics solution. We will start by discussing the key features of each platform and how they complement each other, and also make clear where they differ.
Next, we will dive into real-world examples of how Power BI and Purview can be used together to gain insights from data. This will include using Purview to discover, classify, and catalog data sources. Before you can scan your Power BI tenant, the scans have to be configured, so we will show you how to set them up, both within your tenant and in a cross-tenant situation.
We will also discuss best practices and the do's and don'ts for integrating Power BI and Purview into your organization's data governance strategy, including considerations for data security and compliance.
You will learn how you can extend your Power BI Lineage with Lineage from your sources with the help of Purview.
By the end of this session, you will have a clear understanding of how Power BI and Purview can be used together to drive data-driven decision making in your organization.
Azure Synapse Analytics is a powerful data platform, but it can also be expensive if you don't know what you're doing. In this session, we will go through the different components of Azure Synapse Analytics and discuss how to design a cost-effective data platform.
We will cover topics such as choosing the right pricing tier, optimizing data storage and processing, and leveraging built-in cost management features. We will also discuss how to optimize your data platform for cost efficiency by using features such as serverless compute, pre purchase compute and reserved capacity.
You will leave with a better understanding of how to design and manage your Azure Synapse Analytics platform for cost efficiency, so that it meets your organization's needs.
Topics we will cover:
- Pricing Models for compute Resources in Azure Synapse Analytics
- Storage Types and Tiers
- Pricing Models
- Optimizing Cost with Resource-Scaling
- Using serverless compute and reserved capacity options for cost savings
A metadata-driven ELT framework in Azure Synapse Analytics or in Microsoft Fabric is a way of organizing, optimizing and managing data pipelines that involves using metadata to define and control the flow of data from source to destination. This can be useful for organizations that have a large number of data pipelines and want to have more control over how data is processed and moved between systems.
In this session, we will discuss the benefits of using a metadata-driven approach to managing data pipelines. Our discussion will include practical examples and best practices for implementing a metadata-driven ELT framework in your organization based on the Medallion Lakehouse architecture. We will also provide you with code samples and walk you through how to get started with implementing this framework in your own work.
This session is ideal for data engineers and other technical professionals who want to learn how to optimize their data pipelines in Azure Synapse Analytics or Fabric. By the end of the session, you will have a clear understanding of how to use a metadata-driven approach to manage and maintain data pipelines, enabling better control and visibility over your data processes.
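The core idea of a metadata-driven framework can be sketched in a few lines: the pipeline logic is generic, and a metadata table decides what gets loaded and how. The example below is a simplified illustration in plain Python; the `PipelineEntry` fields, the table names, and the `build_plan` function are all assumptions, and in Synapse or Fabric the metadata would usually live in a control table that a parameterized pipeline iterates over.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineEntry:
    """One row of a hypothetical control table describing a load."""
    source: str
    target: str
    load_type: str  # "full" or "incremental"
    enabled: bool = True


# Hypothetical metadata; in practice this would be read from a control table.
METADATA = [
    PipelineEntry("erp.customers", "bronze.customers", "full"),
    PipelineEntry("erp.orders", "bronze.orders", "incremental"),
    PipelineEntry("erp.legacy", "bronze.legacy", "full", enabled=False),
]


def build_plan(entries):
    """Turn metadata rows into the list of load actions to execute."""
    plan = []
    for entry in entries:
        if not entry.enabled:
            continue  # disabled loads are skipped without touching any code
        if entry.load_type == "full":
            plan.append(f"OVERWRITE {entry.target} FROM {entry.source}")
        elif entry.load_type == "incremental":
            plan.append(f"APPEND {entry.target} FROM {entry.source}")
        else:
            raise ValueError(f"Unknown load type: {entry.load_type}")
    return plan


plan = build_plan(METADATA)
for step in plan:
    print(step)
```

Adding a new source then means adding a metadata row rather than building a new pipeline, which is where the control and visibility benefits come from.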
In this open session, it's your chance to ask our panel of 3 MVPs about data governance for your business using Azure Purview.
Microsoft Purview's Data Policies app is a powerful tool for managing access to data systems across your organization's entire data estate. In this session, we will explore the benefits of using data policies to manage access to data, and how they can help you to streamline and scale access provisioning.
The big advantage of these policies is that you do not have to apply RBAC roles and that you have a single pane of glass: a cloud-based experience that enables at-scale provisioning of access to data.
Are you a data consumer? Then the self-service access policy is an easy way to request access to data while browsing or searching for data.
Are you a data producer? Then access policies will help you to easily create and publish access to data sources.
Are you a DBA or Developer? Then DevOps policies are a simple, central, cloud-based experience that allows you to provision access at scale to DBAs and other DevOps users.
In this session, I will explain and show you how to create and publish data policies and DevOps policies, and how to set up a self-service access workflow.
So, I don't have to write any code to build my facts and dimensions? Yes, you read that correctly.
Within Azure Synapse Analytics, a new tool is available: the Map Data tool. The Map Data tool allows you to easily map your data from a source into the target tables in the Synapse Lake Database.
Map Data is a guided experience where you can generate a mapping data flow without having to start from a blank canvas. Once you have created the mappings then you can easily generate a scalable mapping data flow in a Synapse Pipeline.
After you have published the Synapse Pipelines, you can run these pipelines and then visualize your generated data model in Power BI. Sounds great, doesn't it?
I will show you how the map data tool works and how to visualize the data in Power BI afterwards in a step-by-step demo-based session. After this session you will have the knowledge to build and visualize your first Synapse Lake Database.
In this session, I will give you a comprehensive overview of Microsoft Purview, a unified data governance solution designed to help organizations manage their data more effectively.
Data governance is critical to any organization's data strategy as it ensures data quality, security, and compliance. However, it can be a complex process, especially when managing large volumes of data across various systems and platforms.
During this session, we will be discussing the key features of Microsoft Purview, including data discovery and classification, data cataloging and management, and data lineage and mapping. We will also explore how Purview integrates with other Azure services such as Azure Synapse Analytics, Azure Data Factory, and Azure Databricks to enable end-to-end data governance.
This session is perfect for data professionals, architects, and CDOs who want to learn more about how Purview can help them overcome their data governance challenges.
By the end of the session, you will have a clear understanding of how Purview can streamline data governance and enhance data quality, which will ultimately lead to better decision-making and collaboration within your organization.
Microsoft Purview brings together data governance from Microsoft Data and AI, along with compliance and risk management from Microsoft Security, and is now complemented with many other solutions.
But what's in it for me as an organization?
• Which solutions does Microsoft Purview actually include?
• Which solutions can be easily deployed in my organization?
• Which portals can I use now and for which solution?
A fair series of questions that we will answer during this session. At the end of the session, you will know what Microsoft Purview could mean for your organization.
The use of data Lineage is a hot topic for many organizations.
Many organizations struggle with answers to the following questions:
• I want to adjust a measure, but where do I have to adjust it and where does the data come from?
• What will be the effect on my data if I rename this column in the source?
• Can I get a visual overview of my data estate, including how the data has been transformed?
In this session, we will explore the concept of data lineage and how it can be used in Microsoft Purview to provide a comprehensive understanding of data flows and data transformations across various data sources. We will cover how to create and visualize data lineage in Purview, including the use of Purview scanners and connectors and the integration with Azure Synapse Analytics, and how to use custom lineage components with Apache Atlas for unsupported data sources.
The session is intended for data professionals, such as data architects, data engineers, data analysts, and data scientists, who are interested in using Microsoft Purview for data governance and management. The session is beneficial for those who are already familiar with basic concepts of data management and would like to learn more about using data lineage in Purview.
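The question "what will be the effect if I rename this column in the source?" is, conceptually, a graph traversal over lineage edges. The sketch below illustrates that idea with a hand-written dictionary and a breadth-first search; the asset names and the `downstream` function are invented for illustration, and this is not the Purview or Apache Atlas API, which expose lineage through their own catalog services.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets derived from it.
LINEAGE = {
    "source.sales.amount": ["lake.sales_clean.amount"],
    "lake.sales_clean.amount": ["dw.fact_sales.amount"],
    "dw.fact_sales.amount": ["report.revenue.total"],
    "source.sales.region": ["lake.sales_clean.region"],
}


def downstream(asset, edges):
    """Return every asset affected by a change to `asset`, in breadth-first order."""
    visited = {asset}
    affected = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in visited:
                visited.add(child)
                affected.append(child)
                queue.append(child)
    return affected


impact = downstream("source.sales.amount", LINEAGE)
print(impact)
```

A lineage tool answers the same question visually; the value lies in having the edges captured automatically instead of maintained by hand.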
Azure Synapse Analytics is Microsoft's analytical engine that brings together data integration, enterprise data warehousing, and big data analytics.
As we now take a more holistic approach, more types of user groups will use the platform, and the more important it becomes to set up an authorization matrix in advance. The following topics will be covered during this session:
• What Azure AD roles do we need to deploy an Azure Synapse Workspace?
• How can we simplify access control by using security groups that are aligned with people's job roles?
• How do we handle different user personas in Azure Synapse Analytics? For example, what is a Data Scientist or Data Engineer allowed to do and what not?
What access control settings do we need to have to:
• Store code in Azure DevOps
• Debug a pipeline
• Run a Notebook
During this session I would like to take you through some practical examples of how you can set up these roles for your Azure Synapse workspace in order to get in control of your environment.
In this lightning talk, I explain how you can scale your SQL pool in Azure Synapse Analytics up or down using a Synapse pipeline: an easy way to save some cost in your analytics environment.
In this session, we will explore how to create an Azure Synapse environment, step by step. Azure Synapse Analytics is a cloud-based analytics service that brings together big data and data warehousing. It offers a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs.
During this session, I will cover the pre-requisites necessary for creating an Azure Synapse environment. You will be guided through the steps to create an Azure Synapse Workspace and to create Synapse SQL, Serverless and Spark pools for data processing and ingestion. We'll also explore the various methods for loading data into Synapse.
Finally, I will demonstrate how you can easily query your data.
By the end of this session, you will have a clear understanding of how to create an Azure Synapse environment, load data into it, and query the data efficiently.
Data Community Austria Day 2024 Upcoming