
Most Active Speaker

Paul Andrew

Co-Founder & CTO of Cloud Formations | Microsoft MVP

Derby, United Kingdom


Paul (AKA @mrpaulandrew) is the Co-Founder & CTO of Cloud Formations, a specialist data consultancy based in the UK. With nearly 20 years’ experience designing and delivering Microsoft data architectures, Paul leads a passionate team of engineers, supporting businesses small and large with scalable cloud platforms that deliver business value through data insights. Over the years, Paul has covered the breadth and depth of design patterns and industry-leading concepts, including Lambda, Kappa, Delta Lake, Data Mesh and Data Fabric.

Paul is also a Microsoft Data Platform MVP, director for the Data Relay community conference, East Midlands user group leader, book author and mentor. In addition to the day job(s), Paul is a father of three, husband, foodie, runner, blood donor, geek, and Lego and Star Wars fan! Lastly, Paul confesses to enjoying a Rammstein playlist when given half a chance to do some coding for a customer project.

Awards

Area of Expertise

  • Information & Communications Technology

Topics

  • Azure Data Platform
  • data mesh
  • Big Data
  • Analytics and Big Data
  • Azure Data Factory
  • Azure SQL Database
  • data engineering
  • Azure Data & AI
  • Azure Data Lake
  • Data Platform
  • Data Warehousing
  • Microsoft Data Platform
  • Data Analytics
  • All things data
  • Data Visualization
  • Databases

Deciphering Data Architectures full-day workshop

This pre-conference workshop will begin by defining 'big data' and clarifying various data architecture concepts to establish a solid foundation of understanding before delving into specific data architectures. Topics to be covered include relational data warehouses, data lakes, data marts, data virtualization, and the differences between ETL and ELT. James will then explore and compare the architectures of the Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh in considerable detail, highlighting their advantages and disadvantages. While these concepts may seem appealing in theory, James will address potential concerns to consider before implementation. This workshop aims to demystify these complex topics, offering ample opportunity for questions. The content is derived from James's book "Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh."

A Data Engineer's Guide to Platform Governance in Microsoft Fabric

Capacities, Workspaces, Domains, DevOps & more!

Immerse yourself in the intricacies of Microsoft Fabric with a specialized training day tailored entirely to Data Engineers. While Thursday, February 6th, offered a glimpse into various Fabric topics, our upcoming training day on Friday, February 7th is your opportunity to delve deeper into the platform.

Your trainers are experts who have been working in the subject area for years and who will guide you through unlocking the full potential of Fabric through the lens of a Data Engineer. They will focus on the practical aspects, based on their experience of collaborating with customers worldwide.

We will cover core topics, including:

Administering Fabric: The breadth of Fabric Experiences means there is a huge amount of platform configuration we could do beyond the defaults. The admin portal requires a map just to navigate the array of switches and levers. Let’s explore the impact of these together.

Capacity Management: Provisioning the right amount of compute for our workloads is tricky, especially when that compute can burst beyond the initial sizing. Next, we need to get the business to sign off on those theoretical costs. Lastly, we need to figure out how to allocate this to users, workspaces, analytics solutions and environments. Is one capacity enough? Let’s address these points together to arrive at some rational/pragmatic values.

Workspace Organisation and Configuration: Preventing environment sprawl and providing structure for business users less familiar with development practices, including the assignment of compute, storage and the ability to build out Fabric items across the experiences.

Item Source Control, Environments and Deployment: Understanding how to support continuous integration and delivery across Fabric Workspaces, with alignment to a medallion taxonomy where applicable. Covering different approaches to change management.

Structuring Domains and Sub-domains: Considering data mesh principles, how should we apply this thinking to the technical capabilities of Microsoft Fabric, including the organisation/manifestation of our business data products?

Data Governance Practices: Once an analytics solution has been built in Fabric, how do you allow business users to explore and interact with the outputs? Data cataloguing, lineage and item endorsement need to be considered alongside culture and adoption for it to truly fly.

Administering Microsoft Fabric: Capacities, Workspaces, and Domains

In this session, we will explore the administration of Microsoft Fabric, with a focus on the organisation and management of data storage/compute through the configuration of capacities, domains, data products, environments and workspaces. We will discuss the application of data mesh and data fabric concepts in the context of Microsoft Fabric capabilities, including organising data products for effective delivery to business users.

Additionally, we will demonstrate how to use domains and separate workspaces to serve reports to business users, providing them with the information they need to make informed decisions, aligning industry governance standards to Microsoft Fabric features and access controls. Join us to learn how to effectively administer Microsoft Fabric beyond the simple Workspaces inherited from Power BI.

Microsoft Data Integration Pipelines – The Fundamentals to Level 300

In this full day of training, we will start with the very basics of data integration before building up the fundamental skills and artifacts needed to orchestrate your data platform end-to-end. We will do this using Azure Data Factory pipelines, not exclusively, but as the basis for our learning journey. The maturity of the Data Factory resource offers a set of good foundations to understand the technical capabilities needed for orchestration, before applying our knowledge to other tools such as Microsoft Fabric and Azure Synapse Analytics.

Through a blend of theory, demonstration, and practical labs, we will explore the components needed to orchestrate cloud-based data processing workloads from source system ingestion through to data model delivery, focusing on the control flow plane, with supplementary options for building out low code data flow packages available as part of our integration pipelines.

Start the day knowing nothing about data integration pipelines, or as an experienced SQL Server SSIS developer, and leave with the knowledge and resources to apply these skills to your role as a data professional working with Microsoft cloud native tools.

An Engineer's Guide To Real-Time Data Handling And Analytics

The velocity of data is getting faster across many industries, fuelled by the business demand to gain insights and value from sources in near real-time. These insights allow decision makers to pivot and ultimately stay ahead of the competition. Furthermore, the growth of the internet of things and ‘smart’ devices now means the volume of that high velocity data has exploded. Meeting this demand requires new concepts and new designs for data/solution architects, with high throughput ingestion endpoints and query stream tools that can perform aggregations ‘on the fly’.

In this course, we will address the above head on, discussing and designing architectures that can scale and burst for high throughput events, and querying using both SQL and KQL to blend stream and batch data feeds for downstream reporting.

As a platform, we’ll use Azure Event Hubs and Azure Stream Analytics to ingest and handle that initial data stream, before applying the same patterns to other resources in Microsoft Fabric and Azure Data Explorer. Throughout, we’ll separate the patterns to apply as an architect from the tooling available for delivery.
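As an illustration of the ‘on the fly’ window aggregations described above, here is a minimal tumbling-window sketch in plain Python. The event data and window size are invented for the example; in the session itself the same idea is expressed in SQL/KQL against a live stream rather than a finite list:

```python
from collections import defaultdict

def tumbling_window(events, window_seconds):
    """Group (timestamp, value) events into fixed, non-overlapping windows
    and aggregate each one -- the same idea as a tumbling window in a
    Stream Analytics or KQL query, shown here over a finite batch."""
    windows = defaultdict(list)
    for ts, value in events:
        window_start = ts - (ts % window_seconds)  # align to window boundary
        windows[window_start].append(value)
    # emit one aggregate row per window, ordered by window start
    return {start: sum(vals) for start, vals in sorted(windows.items())}

events = [(1, 10), (3, 5), (7, 2), (11, 8)]
print(tumbling_window(events, 5))  # windows [0,5), [5,10), [10,15)
```

The same shape scales from this toy batch to a streaming engine: only the window assignment and the aggregate function change.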

How Can Microsoft Fabric Have an Impact on Your Business?

All this talk about Data-Ware-Lake-Delta-Beach-House-Lakes (or some combination of that) and Data, Yarn, Fabric integration, everything has got a bit… Meshy! Yes, my friends. The beat of the technology drum is certainly relentless. And with limitless cloud scale and huge innovations from the biggest brains, two years, it seems, has become the benchmark for tools to live and die by. Reach three years and you almost have a mature product. That said, Microsoft Fabric, the latest offering from the global software giant, is no exception. But what does this mean for the real world? For the data analysts, engineers and scientists that need to continue answering everyday problems to inform business decisions. In this session we will firmly ignore the hype and focus on the reality, with the pragmatic view of an experienced architect. The problem of gaining insights from our data hasn’t changed. So, what does this mean if implemented using Microsoft Fabric? What, why and how is the tooling going to change our daily deliverables in the short, medium and long term? Join me for these answers and more as we explore the impact of Microsoft Fabric-Server, erm, Power. Resource. Thing!

From Theory to Practice - Building a Data Mesh Architecture in Azure

The principles of a data mesh architecture have been around for a while now, but we still don’t have a clear way to deliver such a platform in Azure. Are the concepts so abstract that it’s hard to translate the principles into real world requirements, and maybe even harder to think about what technology you might need to deploy within your Azure tenant?

In this session, we’ll explore options for building scalable data products in Azure, following Data Mesh architecture principles, turning the theory into practice. What data storage technology should be used? Does it matter? What endpoints should be exposed for the products across the overall mesh? And what resource(s) should sit at the centre of the Data Mesh? Answers to all these questions and more, including how to dissect the planes of the Data Mesh using Azure concepts.
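To make the idea of a data product and its exposed endpoints concrete, a minimal descriptor might look like the following Python sketch. All names, storage types and addresses are hypothetical, invented purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A minimal descriptor for one node in the mesh: who owns it,
    where it lives, and which endpoints it exposes to consumers."""
    name: str
    domain: str
    storage: str                      # e.g. "delta", "sql", "adls"
    endpoints: dict = field(default_factory=dict)

    def expose(self, kind, address):
        # register an interface for mesh consumers (SQL endpoint,
        # REST API, file path, ...)
        self.endpoints[kind] = address

# a hypothetical sales product owned by a commercial domain team
sales = DataProduct("sales-orders", "commercial", "delta")
sales.expose("sql", "sales-sql.example.net")
sales.expose("files", "abfss://products@example.dfs.core.windows.net/sales")
print(sorted(sales.endpoints))  # ['files', 'sql']
```

In a real mesh this descriptor would live in a catalogue at the centre of the platform, not in code, but the shape of the contract is the same.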

Azure Data Integration Pipelines - Advanced Design and Delivery

In this full day of training, we'll quickly cover the fundamentals of data integration pipelines before going much deeper into our Azure resources (Data Factory & Synapse Pipelines). Within a typical Azure data platform solution for any enterprise grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for various workloads, with the goal being actionable data insight. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Azure Synapse Analytics or Azure Data Factory. In this session, we will explore how to create rich, dynamic, metadata driven pipelines and apply these orchestration resources in production, using scaled out architecture design patterns, best practices, data mesh principles, and the latest open source frameworks. We will take a deep dive into the resources, considering how to build custom activities and complex pipelines, and think about hierarchical design patterns for enterprise grade deployments. All this and more in a complete set of learning modules with hands-on labs; we will take you through how to implement data integration pipelines in production and deliver advanced orchestration patterns (based on real world experience).

A Data Engineer's Guide to Every Azure Data Platform Resource

Maintaining a functional set of knowledge on the breadth and depth of Azure Data Platform resources is hard. There are now so many different ways to execute many different data processing workloads on many different flavours of compute and storage. What should we use, and when? 'It depends' is the common answer! However, in this full day of training, help is at hand. We will cover the A-Z of (data engineering focused) Azure Data Resources; yes, it depends, but we'll go deeper and learn what it depends on. From the perspective of an experienced solution architect and based on real world implementations, we'll address what to use, when to use it, why and how, including tips and tricks for the deployment of resources into production along the way. To support this understanding we'll cover a set of use case driven scenarios and the various resources/architecture patterns used to implement them.

The Evolution of Data Platform Architectures in Azure - Lambda, Kappa, Delta, Data Mesh

How have advancements in highly scalable cloud technology influenced the design principles we apply when building data platform solutions? Are we designing for just speed and batch layers, or do we want more from our platforms, and who says these patterns must be delivered exclusively? Let’s disrupt the theory and consider the practical application of all things. Can we now utilise Azure technology to build architectures that cater for lambda, kappa and delta concepts in a complete stack of services? And should we be considering a solution that offers all these principles in a nirvana of data insight perfection? In this session we’ll explore the answers to all these questions and more in a thought-provoking, argument-generating look at the challenges every data platform architect faces.

Creating a Metadata Driven Orchestration Framework Using Azure Data Integration Pipelines

Azure Data Factory and Synapse Integration Pipelines are the undisputed PaaS resources within the Microsoft Cloud for orchestrating data workloads. With 100+ Linked Service connections and a flexible array of both control flow and data flow Activities, there isn't much these pipelines can’t do as a wrapper over our data platform solutions. That said, the service may still require the support of other Azure resources for the purposes of logging, monitoring, compute and storage. In this session we’ll focus on exactly that point and explore the problem faced when structuring many integration pipelines in a highly scaled architecture.
Once coupled with other resources, we’ll look at one possible solution to this problem of pipeline organisation to create a dynamic, flexible, metadata driven processing framework that complements our existing solution pipelines. Furthermore, we will explore how to bootstrap multiple orchestrators (across tenants if needed), design for cost with nearly free Consumption Plans and deliver an operational abstraction over all our processing pipelines.
Finally, we'll explore delivering this framework within an enterprise and consider an architect’s perspective on a wider platform of ingestion/transformation workloads with multiple batches and execution stages.
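The batches and execution stages mentioned above can be sketched as follows: a hypothetical metadata table is grouped into sequential stages, with the pipelines inside each stage being candidates for parallel execution. Real frameworks of this kind drive ADF/Synapse pipelines from a metadata database rather than a Python list; everything here is invented for illustration:

```python
# Hypothetical metadata rows: (execution_stage, pipeline_name, enabled)
PIPELINE_METADATA = [
    (1, "ingest-sales", True),
    (1, "ingest-hr", True),
    (2, "transform-sales", True),
    (2, "transform-hr", False),   # disabled via metadata, not a code change
    (3, "serve-model", True),
]

def build_batches(metadata):
    """Group enabled pipelines by execution stage -- stages run in order,
    pipelines within a stage may run in parallel."""
    stages = {}
    for stage, name, enabled in metadata:
        if enabled:
            stages.setdefault(stage, []).append(name)
    return [stages[s] for s in sorted(stages)]

def run(batches, execute):
    for batch in batches:          # sequential stages
        for pipeline in batch:     # parallel in a real orchestrator
            execute(pipeline)

ran = []
run(build_batches(PIPELINE_METADATA), ran.append)
print(ran)  # ['ingest-sales', 'ingest-hr', 'transform-sales', 'serve-model']
```

The key property is that adding, disabling or reordering a workload is a metadata change, not a pipeline redeployment.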

Building an Azure Data Analytics Platform End-to-End

The resources on offer in Azure are constantly changing, which means as data professionals we need to constantly change too. Updating knowledge and learning new skills. No longer can we rely on products matured over a decade to deliver all our solution requirements. Today, data platform architectures designed in Azure with best intentions and known design patterns can go out of date within months. That said, is there now a set of core components we can utilise in the Microsoft cloud to ingest, curate and deliver insights from our data? When does ETL become ELT? When is IaaS better than PaaS? Do we need to consider scaling up or scaling out? And should we start making cost the primary factor for choosing certain technologies? In this session we'll explore the answers to all these questions and more from an architect’s viewpoint. Based on real world experience let’s think about just how far the breadth of our knowledge now needs to reach when starting from nothing and building a complete Microsoft Azure Data Analytics solution.

Implementing a Data Mesh Architecture in Azure

The principles of a data mesh architecture have been around for a while now, but we still don’t have a clear way to deliver such a solution in Azure. Are the concepts so abstract that it’s hard to translate the principles into real world requirements, and maybe even harder to think about what technology you might need to deploy in your Azure resource groups? In this session, we’ll explore options for building an Azure data platform, following Data Mesh principles. What data storage technology should be used? What endpoints should be exposed for mesh interfacing, and what resource(s) should sit at the centre of the Data Mesh? Answers to all these questions and more as we turn the theory of a Data Mesh architecture into practice.

Azure Synapse Analytics - The Technology vs The Workspace Abstraction

For those that have been using Azure data platform resources for a while, the unified Synapse Analytics Workspace experience makes a lot of sense. However, for those that are new to Azure, translating the technology requirements to a given use case can be hard. In this short, sharp session, we’ll look at what each of the Synapse Analytics tools can do for our data workloads. We’ll decrypt the workspace experience into simple compute and storage components, regardless of how you choose to ‘develop’ or ‘integrate’ your data. Let’s remove the pretty UI abstraction and ask: what is the technology I’m working with underneath?

An Introduction to Delta Lakes and Delta Lake-Houses

Once upon a time, there was a data warehouse and it lived happily as a set of tables within our relational database management system (RDBMS) called Microsoft SQL Server. The data warehouse had three children known as extract, transform, and load. One day a blue/azure coloured cloud appeared overhead, and it started to rain. The data warehouse got wet and was never the same again! Or was it? Spoiler alert, the data warehouse is the same, still happy, and well, it just evolved and moved from its RDBMS home to a new home in the cloud. The end!
In this session, we'll look at the evolution of the data warehouse and understand how we can now deliver the same data engineering concepts for our solutions on the Microsoft Azure cloud platform using the open-source Delta.io standard. We'll introduce the standard (originally developed by Databricks) and then explore the implications it has for our next-generation cloud data warehouse.
The original data warehouse set of tables remain, but now they are delivered using the cloud-native Delta Lake technology with distributed storage/compute as standard. Delta.io gives us those much-needed ACID properties over our data lakes meaning our data warehouse understanding can move to the cloud and is made easier within Azure. The data warehouse just grew up and became a Delta Lake-House.
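The ACID guarantee that Delta.io layers over a data lake comes from its ordered transaction log (the `_delta_log` folder of JSON commit files). A toy Python sketch of that replay idea follows; the in-memory list stands in for the log files, and the parquet file names are invented for illustration:

```python
import json

class TinyDeltaLog:
    """Toy illustration of the Delta transaction-log idea: table state is
    the replay of an ordered list of atomic, append-only JSON commits.
    Readers see all of a commit or none of it -- the 'A' in ACID."""
    def __init__(self):
        self.commits = []            # stands in for 000000.json, 000001.json, ...

    def commit(self, actions):
        # one atomic commit: a list of add/remove file actions
        self.commits.append(json.dumps(actions))

    def active_files(self):
        files = set()
        for raw in self.commits:     # replay the log in order
            for action in json.loads(raw):
                if action["op"] == "add":
                    files.add(action["path"])
                elif action["op"] == "remove":
                    files.discard(action["path"])
        return sorted(files)

log = TinyDeltaLog()
log.commit([{"op": "add", "path": "part-0.parquet"}])
log.commit([{"op": "remove", "path": "part-0.parquet"},
            {"op": "add", "path": "part-1.parquet"}])   # an atomic rewrite
print(log.active_files())  # ['part-1.parquet']
```

The real protocol adds schema, statistics and optimistic concurrency on top, but the replay-the-log principle is exactly what lets a data lake behave like warehouse tables.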

Azure Data Integration Pipelines - A Complete Introduction

Azure Data Factory, along with other Integration Pipeline technologies, is now a core resource for any data platform solution, offering critical control flow and data flow capabilities. In this session we’ll take an end-to-end look at our Azure based data pipeline tools when orchestrating highly scalable cloud native services. In this complete introduction session, we will cover the basics of Azure Data Factory and Azure Synapse Analytics Pipelines. What do we need to build cloud ETL/ELT workloads? What’s the integration runtime? Do we have an SSIS equivalent cloud data flow engine? Can we easily lift and shift existing SSIS packages into the cloud? The answers to all these questions and more. Come to this session knowing nothing about Azure Data Integration Pipelines and leave with enough knowledge to start building pipelines tomorrow.

An Introduction to Azure Synapse Analytics - What is it? Why use it? And how?

The Microsoft abstraction machine is at it again with this latest veneer over what we had come to understand as the ‘modern data warehouse’. Or is it?! When creating an Azure PaaS data platform/analytics solution we would typically use a set of core Azure services; Data Factory, Data Lake, Databricks and maybe SQL Data Warehouse. Now with the latest round of enhancements from the MPP team and others, it seems in the third generation of the Azure SQLDW offering we can access all these core services as a bundle. We might even call it a Kappa architecture! Ok, so what? Well, this is a reasonable starting point in our understanding of what Azure Synapse Analytics is, but it is also far from the whole story. In this session we will go deeper into the evolution of our SQLDW to complete our knowledge on why Synapse Analytics is a game changer for various data platform architectures. We’ll discover what Synapse has to offer with its Data Virtualisation layer, flexible storage, and variety of compute engines. A simple veneer of things, this new resource is not. In this introduction to Synapse we will cover the what, the why and, importantly, the how for this emerging bundle of exciting technology. Finally, we’ll touch on Microsoft’s latest thinking for a HTAP environment with direct links into our transactional data stores.

Implementing Azure Data Integration Pipelines in Production

Within a typical Azure data platform solution for any enterprise grade data analytics or data science workload, an umbrella resource is needed to trigger, monitor, and handle the control flow for transforming datasets. Those requirements are met by deploying Azure Data Integration pipelines, delivered using Synapse Analytics or Data Factory. In this session I'll show you how to create rich, dynamic data pipelines and apply these orchestration resources in production, using scaled architecture design patterns, best practices and the latest metadata driven frameworks. We will take a deeper dive into the service, considering how to build custom activities and dynamic pipelines, and think about hierarchical design patterns for enterprise grade deployments. All this and more in a series of short stories (based on real world experience) as I take you through how to implement data integration pipelines in production.

Azure Data Integration Pipelines – The Fundamentals to Level 300

In this full day of training, we’ll start with the very basics, learning how to build and orchestrate common pipeline activities. You will learn how to build out Azure control flow and data flow components as dynamic processing pipelines using Azure Data Factory and Azure Synapse Analytics. We’ll start by covering the fundamentals within the resources and together build a set of pipelines that ingest data from local source systems, transform and serve it to potential consumers. Through a set of 12 carefully constructed learning modules we will take an end-to-end look at our Azure integration pipeline tools as part of highly scalable cloud native architectures, dealing with triggering, monitoring, dynamic pipeline content as well as CI/CD practices. Start the day knowing nothing about Azure Data Integration pipelines and leave with the knowledge, slides, labs, demos, and code to apply these resources in your role as a data professional. Everything delivered will be use case orientated and grounded in real world experience.

Fast-Track Your Data Platform: How to Build a Metadata Driven Lakehouse

In today’s data-driven world, fast and efficient data platform delivery is crucial for staying ahead of the competition. Join me for a dynamic session that demonstrates how to build a metadata-driven Lakehouse with Microsoft cloud native technologies, using your preferred compute and storage resources: Azure Data Factory, Azure Databricks, Azure Synapse Analytics or Microsoft Fabric.

Discover how to simplify and overcome common obstacles such as fragmented data ingestion, change data capture, and orchestration scalability using proven best practices. Learn how to leverage automation, open-standards, and seamless cloud integration to accelerate time-to-insight with minimal technical debt. This session is perfect for techies and data leaders alike, seeking to streamline their cloud data platform delivery while maintaining cost control and operational resilience. In summary, unlock the potential to build a Lakehouse in a day by using an open-source metadata driven product accelerator.
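One of the obstacles mentioned above, change data capture, is commonly handled in metadata-driven platforms with a stored high watermark per source table. A minimal sketch of that pattern follows; the row shape, column names and timestamps are all invented for the example:

```python
def incremental_load(source_rows, watermark):
    """Pull only rows changed since the stored watermark, then advance it --
    the common metadata-driven pattern for incremental ingestion keyed on
    a last-modified column."""
    new_rows = [r for r in source_rows if r["modified"] > watermark]
    # advance the watermark to the newest change seen (or keep it if none)
    new_watermark = max((r["modified"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# hypothetical source table with a last-modified timestamp column
source = [
    {"id": 1, "modified": 100},
    {"id": 2, "modified": 205},
    {"id": 3, "modified": 310},
]
rows, wm = incremental_load(source, watermark=200)
print(len(rows), wm)  # 2 310
```

In practice the watermark lives in the platform's metadata store and is only committed once the load succeeds, so a failed run simply replays the same delta.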


Data Saturday Gothenburg 2023 Sessionize Event

August 2023 Göteborg, Sweden

Data Platform Next Step 2023 Sessionize Event

June 2023 Billund, Denmark

Techorama 2023 Belgium Sessionize Event

May 2023 Antwerpen, Belgium

SQLDay 2023 Sessionize Event

May 2023 Wrocław, Poland

dataMinds Connect 2022 Sessionize Event

October 2022 Mechelen, Belgium

Data Relay 2022 Sessionize Event

October 2022

Future Data Driven Summit 2022 Sessionize Event

September 2022

DATA BASH '22 Sessionize Event

September 2022

DATA:Scotland 2022 Sessionize Event

September 2022 Glasgow, United Kingdom

SQLBits 2022 Sessionize Event

March 2022 London, United Kingdom

Virtual Scottish Summit 2021 Sessionize Event

February 2021

dataMinds Connect 2020 (Virtual Edition) Sessionize Event

October 2020 Mechelen, Belgium

Data Platform Discovery Day Europe Sessionize Event

April 2020

Data Relay 2019 Sessionize Event

October 2019

DATA:Scotland 2019 Sessionize Event

September 2019 Glasgow, United Kingdom

DataGrillen 2019 Sessionize Event

June 2019 Lingen, Germany

Global Azure Bootcamp 2019 Sessionize Event

April 2019 Birmingham, United Kingdom

Intelligent Cloud Conference 2019 Sessionize Event

April 2019 Copenhagen, Denmark

Global Azure Boot camp - Birmingham UK Sessionize Event

April 2018 Birmingham, United Kingdom

