Barney Lawrence
Consultant focusing on data analytics and engineering on the Microsoft platform
Chesterfield, United Kingdom
Actions
Barney Lawrence has over a decade's experience helping people make the most of their data on the Microsoft data platform.
Having moved from the guy who knows Excel, to Data Analyst, to Accidental BI Developer and finally to Actual BI Developer, he currently works as a Senior Consultant for Simpson Associates focussing on local government and health data.
Links
Area of Expertise
Topics
Splinking in a Winter Wonderland
Everyone knows, in his home full of ice. Santa lists children both naughty and nice.
But the lists aren't exclusive, most children are both. And the scripts made to pair them were as slow as a sloth.
So Santa decided a plot he would hatch. He'd ask for some new code to make records match.
He instructed his elves "Make it fast, make it cheap". "Fix it by Christmas" he said with a leap.
So the wisest of elves, he had a big think, he decided on Python and a package named splink!
It pairs up the records, with methods statistic, to match them all up through routes probabilistic!
The process is faster than the one of Saint Nicks'. We can run it on Synapse or perhaps Databricks.
And so in this session we'll tell you their tale. How the elves learnt to link records, not fail.
They know what is what, they can tell who is who, and after our story perhaps you can too!
What's What and Who's Who? Cross System Record Linkage with Fabric
Finding records representing the same real-world entity either in multiple sources or duplicated within a single system is a common pain point when seeking to build a single definitive view of the truth of your data across. Variances in format, changes over time, and data entry errors mean connecting our Tom, Dick, and Harries in one system to our Thom, Dirk, and Harriets in another can a big challenge.
This session introduces one approach to solving this problem using Fabric notebooks to run Splink, an open-source Python library developed by the Ministry of Justice in the UK to implement probabilistic record linkage.
This session will introduce you to the basics of deterministic, fuzzy, and probabilistic matching of records and then dive in with an example notebook to create and apply a match an example data set and how to analyse the results.
By the time we're done you'll be ready to take what we've learned back home with you and start applying it to your own data.
How Do I Transform Thee? Let Me Count the Ways (in Fabric).
There are more tools and approaches available to transform your data in Microsoft Fabric than you can shake a spear at. From low and no code through to multiple full code-based options, sometimes themselves with multiple syntaxes. Choosing the wrong one could lead to a comedy of errors, so how do you make sure you end up with your data as you like it?
In this session we will take the common task of building a table of person data for a data model and implement it multiple times with a different approach each time including using SQL, Data Flows, Data Wrangler, PySpark and, simply asking Copilot to do it for us.
By the end of the session, you will have an overview of the data transformation tools Fabric has in its belt, which ones suit which tasks and audiences best along with the strengths and weaknesses of each. This will help you to choose the right tool for the right job without causing a tempest.
The Metadata Driven Data Warehouse
Tired of writing the same old Insert/Update/Delete statements over and over? Is your pipelines folder overflowing? Need to get your new data warehouse up and running as fast as possible? Simply need to figure out what the contents of a table looks like?
Metadata (data about data) comes to the rescue and in this session we’ll show you how.
We’ll walk through the process of building a data warehouse using meta data every step of the way. From analysing what’s in source systems through driving your pipelines and merging new data into the data warehouse all the way up to creating measures in Power BI.
By the end of the session you’ll be ready to start finding new ways to work with your data through the magic of metadata.
An Introduction to Window Functions
For anyone with a basic understanding of SQL or beyond, window functions are a key feature to open your querying options and take your code to the next level.
Most functions are like a holiday in Vegas, what goes on in a row stays in a row, but window functions throw the curtains wide open, making the rows in your queries aware of what happens around them.
In this session we'll start with the basics of syntax and simple examples such as sequencing rows and aggregating over different levels, then move on to more advanced uses including running totals, common query patterns and tricks Finally we'll consider performance impacts for better and worse.
We'll use SQL Server for our examples, but content will be adaptable to other dialects of SQL.
A DISTINCT Disadvantage
DISTINCT - The keyword that sweeps away your duplicate data worries or a performance time bomb waiting to happen?
This short session will look at what problems DISTINCT may cause, when to use it, when not to use it and the alternative approaches you can take that will perform better.
An Introduction to Querying XML With T-SQL
XML is everywhere, it's probably already in your database and sooner or later you're going to need to get at the data inside it.
This session will, in as short a time as possible, show you the key methods needed to start taking data out of a piece of XML and into your queries.
DAX Explained Through Dance, Memes and Dad Jokes
Moving beyond the basics of DAX can as tough as learning to tap-dance without falling in the sink. Several key concepts such as row and filter contexts that you will need to understand must be grasped before more complex expressions can be confidently written.
This session will help anyone who is looking to move their formulae up to the next level (or looking to support others in doing the same) by providing novel approaches to understanding these important concepts.
Together we will explore filter contexts with punchlines, learn how row contexts are like a dance, and discover what CALCULATE has in common with Oprah Winfrey.
T-SQL - Beyond the Basics
This session is aimed at anyone with basic level T-SQL experience or above looking to improve their code; especially when writing long, complex queries.
We will cover topics including getting the most out of Management Studio, Common Table Expressions, the APPLY operator, Window Functions, methods of pivoting and unpivoting data and more.
Through this you will expand your tool-belt of techniques to help you produce T-SQL code that's not just faster to run but also easier to read and faster to write.
SQLBits 2024 - General Sessions Sessionize Event
Festive Tech Calendar 2023 Sessionize Event
New Stars of Data #6 Sessionize Event
Data Relay 2023 Sessionize Event
SQLBits 2023 - General Sessions Sessionize Event
Data Relay 2022 Sessionize Event
PASS Data Community Summit 2021 Sessionize Event
DataMinutes #1 Sessionize Event
Data Céilí 2021 Sessionize Event
Data Relay 2019 Sessionize Event
Barney Lawrence
Consultant focusing on data analytics and engineering on the Microsoft platform
Chesterfield, United Kingdom
Links
Actions
Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.
Jump to top