Most Active Speaker

Paul Andrew

Data Platform MVP, Avanade Centre of Excellence Technical Architect specialising in data platform solutions built on Microsoft Azure.

Derby, United Kingdom

Avanade Centre of Excellence (CoE) Technical Architect specialising in data platform solutions built on Microsoft Azure.
Data engineering competencies include Azure Synapse Analytics, Data Factory, Data Lake, Databricks, Stream Analytics, Event Hub, IoT Hub, Functions, Automation, Logic Apps and of course the complete SQL Server business intelligence stack.
15+ years’ experience working within healthcare, retail, manufacturing, and gaming verticals delivering analytics through the definition of industry leading design patterns and technical architectures.
STEM ambassador and very active member of the data platform community delivering training and technical sessions at conferences both nationally and internationally.
Father, husband, swimmer, cyclist, runner, blood donor, geek, Lego and Star Wars fan!

Area of Expertise

  • Information & Communications Technology

Topics

  • Azure Data Platform
  • data mesh
  • Big Data
  • Analytics and Big Data
  • Azure Data Factory
  • Azure SQL Database
  • data engineering
  • Azure Data & AI
  • Azure Data Lake
  • Data Platform
  • Data Warehousing
  • Microsoft Data Platform
  • Data Analytics
  • All things data
  • Data Visualization
  • Databases

From Theory to Practice - Building a Data Mesh Architecture in Azure

The principles of a data mesh architecture have been around for a while now, but we still don’t have a clear way to deliver such a platform in Azure. Are the concepts so abstract that it’s hard to translate the principles into real-world requirements, and maybe even harder to think about what technology you might need to deploy within your Azure tenant?

In this session, we’ll explore options for building scalable data products in Azure, following Data Mesh architecture principles and turning the theory into practice. What data storage technology should be used? Does it matter? What endpoints should be exposed for the products across the overall mesh? And what resource(s) should sit at the centre of the Data Mesh? Answers to all these questions and more as we turn the theory of a Data Mesh architecture into practice, including how to dissect the planes of the Data Mesh using Azure concepts.

Azure Data Integration Pipelines - Advanced Design and Delivery

In this full-day training session, we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources (Data Factory & Synapse Pipelines). Within a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for various workloads, with the goal being actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, we will explore how to create rich, dynamic, metadata-driven pipelines and apply these orchestration resources in production, using scaled-out architecture design patterns, best practice, data mesh principles, and the latest open-source frameworks. We will take a deep dive into the resources, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise-grade deployments. All this and more in a complete set of learning modules with hands-on labs; we will take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns (based on real-world experience).

A Data Engineer's Guide to Every Azure Data Platform Resource

Maintaining a functional set of knowledge on the breadth and depth of Azure Data Platform resources is hard. There are now so many different ways to execute many different data processing workloads on many different flavours of compute and storage. What should we use, and when? "It depends" is the common answer! However, in this full day of training, help is at hand. We will cover the A-Z of (data engineering focused) Azure Data Resources. Yes, it depends, but we'll go deeper and learn what it depends on. From the perspective of an experienced solution architect and based on real-world implementations, we'll address what to use, when to use it, why, and how, including tips and tricks for the deployment of resources into production along the way. To support this understanding, we'll cover a set of use case driven scenarios and the various resources/architecture patterns used to implement them.

The Evolution of Data Platform Architectures in Azure - Lambda, Kappa, Delta, Data Mesh

How have advancements in highly scalable cloud technology influenced the design principles we apply when building data platform solutions? Are we designing for just speed and batch layers, or do we want more from our platforms, and who says these patterns must be delivered exclusively? Let’s disrupt the theory and consider the practical application of all things. Can we now utilise Azure technology to build architectures that cater for lambda, kappa, and delta concepts in a complete stack of services? And should we be considering a solution that offers all these principles in a nirvana of data insight perfection? In this session we’ll explore the answers to all these questions and more in a thought-provoking, argument-generating look at the challenges every data platform architect faces.

Creating a Metadata Driven Orchestration Framework Using Azure Data Integration Pipelines

Azure Data Factory and Synapse Integration Pipelines are the undisputed PaaS resources within the Microsoft Cloud for orchestrating data workloads. With 100+ Linked Service connections and a flexible array of both control flow and data flow Activities, there isn't much these pipelines can’t do as a wrapper over our data platform solutions. That said, the service may still require the support of other Azure resources for the purposes of logging, monitoring, compute, and storage. In this session we’ll focus on exactly that point and explore the problem faced when structuring many integration pipelines in a highly scaled architecture.
Once coupled with other resources, we’ll look at one possible solution to this problem of pipeline organisation: creating a dynamic, flexible, metadata-driven processing framework that complements our existing solution pipelines. Furthermore, we will explore how to bootstrap multiple orchestrators (across tenants if needed), design for cost with nearly free Consumption Plans, and deliver an operational abstraction over all our processing pipelines.
Finally, we'll explore delivering this framework within an enterprise and consider an architect’s perspective on a wider platform of ingestion/transformation workloads with multiple batches and execution stages.
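As a hedged illustration of the core idea only (this is not the framework from the session; the `stage` and `pipeline_name` fields are hypothetical), a metadata-driven orchestrator boils down to reading worker pipeline names from metadata, grouping them into execution stages, and triggering each stage in order:

```python
from collections import defaultdict

# Hypothetical metadata rows; in practice these would live in a database
# table read by the orchestration pipeline at runtime.
pipeline_metadata = [
    {"stage": 1, "pipeline_name": "Ingest Sales"},
    {"stage": 1, "pipeline_name": "Ingest Customers"},
    {"stage": 2, "pipeline_name": "Transform Dimensions"},
    {"stage": 3, "pipeline_name": "Serve Reporting Layer"},
]

def build_execution_plan(metadata):
    """Group worker pipelines by execution stage, ordered by stage number."""
    stages = defaultdict(list)
    for row in metadata:
        stages[row["stage"]].append(row["pipeline_name"])
    return [stages[s] for s in sorted(stages)]

def run(metadata, execute):
    """Run each stage sequentially; pipelines within a stage may run in parallel."""
    for stage_pipelines in build_execution_plan(metadata):
        for name in stage_pipelines:
            execute(name)

# In a real deployment, 'execute' would call the Data Factory / Synapse
# REST API to trigger a pipeline run rather than print.
run(pipeline_metadata, execute=lambda name: print(f"Triggering {name}"))
```

The point of the sketch is the separation of concerns: adding a new worker pipeline becomes a metadata insert, not a change to the orchestrator itself.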

Building an Azure Data Analytics Platform End-to-End

The resources on offer in Azure are constantly changing, which means as data professionals we need to constantly change too, updating knowledge and learning new skills. No longer can we rely on products matured over a decade to deliver all our solution requirements. Today, data platform architectures designed in Azure with best intentions and known design patterns can go out of date within months. That said, is there now a set of core components we can utilise in the Microsoft cloud to ingest, curate, and deliver insights from our data? When does ETL become ELT? When is IaaS better than PaaS? Do we need to consider scaling up or scaling out? And should we start making cost the primary factor for choosing certain technologies? In this session we'll explore the answers to all these questions and more from an architect’s viewpoint. Based on real-world experience, let’s think about just how far the breadth of our knowledge now needs to reach when starting from nothing and building a complete Microsoft Azure Data Analytics solution.

Implementing a Data Mesh Architecture in Azure

The principles of a data mesh architecture have been around for a while now, but we still don’t have a clear way to deliver such a solution in Azure. Are the concepts so abstract that it’s hard to translate the principles into real-world requirements, and maybe even harder to think about what technology you might need to deploy in your Azure resource groups? In this session, we’ll explore options for building an Azure data platform, following Data Mesh principles. What data storage technology should be used? What endpoints should be exposed for mesh interfacing, and what resource(s) should sit at the centre of the Data Mesh? Answers to all these questions and more as we turn the theory of a Data Mesh architecture into practice.

Azure Synapse Analytics - The Technology vs The Workspace Abstraction

For those that have been using Azure data platform resources for a while, the unified Synapse Analytics Workspace experience makes a lot of sense. However, for those that are new to Azure, translating the technology requirements to a given use case can be hard. In this short, sharp session, we’ll look at what each of the Synapse Analytics tools can do for our data workloads. We’ll decrypt the workspace experience into simple compute and storage components, regardless of how you choose to ‘develop’ or ‘integrate’ your data. Let’s remove the pretty UI abstraction and ask: what is the technology I’m working with underneath?

An Introduction to Delta Lakes and Delta Lake-Houses

Once upon a time, there was a data warehouse, and it lived happily as a set of tables within our relational database management system (RDBMS) called Microsoft SQL Server. The data warehouse had three children known as extract, transform, and load. One day a blue/azure coloured cloud appeared overhead, and it started to rain. The data warehouse got wet and was never the same again! Or was it? Spoiler alert: the data warehouse is the same, still happy, and well; it just evolved and moved from its RDBMS home to a new home in the cloud. The end!
In this session, we'll look at the evolution of the data warehouse and understand how we can now deliver the same data engineering concepts for our solutions on the Microsoft Azure cloud platform using the open-source Delta.io standard. We'll introduce the standard (originally developed by Databricks) and then explore the implications it has for our next-generation cloud data warehouse.
The original data warehouse tables remain, but now they are delivered using the cloud-native Delta Lake technology with distributed storage/compute as standard. Delta.io gives us those much-needed ACID properties over our data lakes, meaning our data warehouse understanding can move to the cloud and is made easier within Azure. The data warehouse just grew up and became a Delta Lake-House.
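As a minimal, hedged sketch of what the Delta.io standard looks like in practice (this assumes a Spark environment with the Delta Lake packages configured; the table path and column names are hypothetical placeholders, not from the session):

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

# Assumes a Spark session already configured with the Delta Lake extensions.
spark = SparkSession.builder.getOrCreate()

# Write a dataframe to the lake as a Delta table - ACID and versioned as standard.
updates = spark.createDataFrame(
    [(1, "Alice"), (2, "Bob")], ["customer_id", "name"]
)
updates.write.format("delta").mode("overwrite").save("/lake/curated/dim_customer")

# Upsert transactionally - the classic warehouse MERGE, now over a data lake.
target = DeltaTable.forPath(spark, "/lake/curated/dim_customer")
(target.alias("t")
 .merge(updates.alias("s"), "t.customer_id = s.customer_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

# Time travel: read the table as it was at an earlier version.
v0 = (spark.read.format("delta")
      .option("versionAsOf", 0)
      .load("/lake/curated/dim_customer"))
```

The transaction log behind these calls is what gives the lake its warehouse-like guarantees.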

Azure Data Integration Pipelines - A Complete Introduction

Azure Data Factory, along with other Integration Pipeline technologies, is now a core resource for any data platform solution, offering critical control flow and data flow capabilities. In this session we’ll take an end-to-end look at our Azure-based data pipeline tools when orchestrating highly scalable cloud-native services. In this complete introduction session, we will cover the basics of Azure Data Factory and Azure Synapse Analytics Pipelines. What do we need to build cloud ETL/ELT workloads? What’s the integration runtime? Do we have an SSIS-equivalent cloud data flow engine? Can we easily lift and shift existing SSIS packages into the cloud? The answers to all these questions and more. Come to this session knowing nothing about Azure Data Integration Pipelines and leave with enough knowledge to start building pipelines tomorrow.
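For readers who have never seen one, a pipeline in Data Factory or Synapse is ultimately authored as a JSON definition. The following is a hedged, minimal sketch only (the pipeline, dataset, and parameter names here are hypothetical placeholders, not from the session): one parameterised Copy activity moving data from a SQL source dataset to a Parquet dataset in the lake.

```json
{
  "name": "CopySalesToLake",
  "properties": {
    "parameters": {
      "SourceTable": { "type": "string", "defaultValue": "dbo.Sales" }
    },
    "activities": [
      {
        "name": "CopySourceToRaw",
        "type": "Copy",
        "inputs": [
          { "referenceName": "SqlSourceDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "LakeRawParquetDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "ParquetSink" }
        }
      }
    ]
  }
}
```

In practice you rarely hand-author this JSON; the portal's designer generates it, but seeing the shape helps when parameterising pipelines or wiring up CI/CD.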

An Introduction to Azure Synapse Analytics - What is it? Why use it? And how?

The Microsoft abstraction machine is at it again with this latest veneer over what we had come to understand as the ‘modern data warehouse’. Or is it?! When creating an Azure PaaS data platform/analytics solution we would typically use a set of core Azure services: Data Factory, Data Lake, Databricks, and maybe SQL Data Warehouse. Now, with the latest round of enhancements from the MPP team and others, it seems in the third generation of the Azure SQLDW offering we can access all these core services as a bundle. We might even call it a Kappa architecture! Ok, so what? Well, this is a reasonable starting point in our understanding of what Azure Synapse Analytics is, but it is also far from the whole story. In this session we will go deeper into the evolution of our SQLDW to complete our knowledge of why Synapse Analytics is a game changer for various data platform architectures. We’ll discover what Synapse has to offer with its Data Virtualisation layer, flexible storage, and variety of compute engines. A simple veneer of things this new resource is not. In this introduction to Synapse we will cover the what, the why and, importantly, the how for this emerging bundle of exciting technology. Finally, we’ll touch on Microsoft’s latest thinking for an HTAP environment with direct links into our transactional data stores.

Implementing Azure Data Integration Pipelines in Production

Within a typical Azure data platform solution for any enterprise-grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Synapse Analytics or Data Factory. In this session I'll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practice, and the latest metadata-driven frameworks. We will take a deeper dive into the service, considering how to build custom activities and dynamic pipelines, and think about hierarchical design patterns for enterprise-grade deployments. All this and more in a series of short stories (based on real-world experience) as I take you through how to implement data integration pipelines in production.

Azure Data Integration Pipelines – The Fundamentals to Level 300

In this full day of training, we’ll start with the very basics, learning how to build and orchestrate common pipeline activities. You will learn how to build out Azure control flow and data flow components as dynamic processing pipelines using Azure Data Factory and Azure Synapse Analytics. We’ll start by covering the fundamentals within the resources and together build a set of pipelines that ingest data from local source systems, transform and serve it to potential consumers. Through a set of 12 carefully constructed learning modules we will take an end-to-end look at our Azure integration pipeline tools as part of highly scalable cloud native architectures, dealing with triggering, monitoring, dynamic pipeline content as well as CI/CD practices. Start the day knowing nothing about Azure Data Integration pipelines and leave with the knowledge, slides, labs, demos, and code to apply these resources in your role as a data professional. Everything delivered will be use case orientated and grounded in real world experience.

dataMinds Connect 2022

October 2022 Mechelen, Belgium

Data Relay 2022

October 2022

DATA BASH '22

September 2022

DATA:Scotland 2022

September 2022 Glasgow, United Kingdom

SQLBits 2022

March 2022 London, United Kingdom

dataMinds Connect 2020 (Virtual Edition)

October 2020 Mechelen, Belgium

Data Relay 2019

October 2019

DATA:Scotland 2019

September 2019 Glasgow, United Kingdom

DataGrillen 2019

June 2019 Lingen, Germany

Global Azure Bootcamp 2019

April 2019 Birmingham, United Kingdom

Intelligent Cloud Conference 2019

April 2019 Copenhagen, Denmark

Global Azure Boot camp - Birmingham UK

April 2018 Birmingham, United Kingdom
