Data processing with Databricks SQL

Using SQL for data transformation is a powerful way to empower an analytics team to create its own optimized data model. However, relying on SQL often comes with tradeoffs: limited functionality, hard-to-maintain stored procedures, and skipped best practices such as version control and data tests. While Azure Databricks is best known as a platform for running Apache Spark on big data workloads, its support for SQL workloads has grown over the last few years. Attend this session to hear how Azure Databricks supports SQL for data transformation jobs as a core part of your Lakehouse.

In this session, we will cover three options for using Azure Databricks with SQL syntax to create Delta tables.
1. Databricks Workflows with Streaming Tables, Materialized Views, and SQL scripts
2. dbt: an open-source framework for applying engineering best practices to SQL-based data transformations
3. SQLMesh: an open-core product for building high-quality, high-performance data pipelines
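As a taste of the first option, here is a minimal sketch of Databricks SQL DDL for a Streaming Table and a Materialized View (the table names, columns, and source path are illustrative, not from the session):

```sql
-- Streaming Table: incrementally ingests new files as they arrive
CREATE OR REFRESH STREAMING TABLE raw_orders
AS SELECT * FROM STREAM read_files('/Volumes/sales/landing/orders/');

-- Materialized View: a precomputed aggregate over the ingested data,
-- kept up to date by Databricks on refresh
CREATE OR REFRESH MATERIALIZED VIEW daily_order_totals
AS SELECT order_date, SUM(amount) AS total_amount
FROM raw_orders
GROUP BY order_date;
```

Both objects are backed by Delta tables and can be scheduled and orchestrated with Databricks Workflows.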

You will leave with a better understanding of how your team can use SQL to build complex data transformations for your Lakehouse and still have confidence in the reliability and scalability of your ETL pipeline.

Dustin Vannoy

Specialist - Data Engineering/DevX at Databricks

San Diego, California, United States
