Session
Databricks Platform Engineering - The No Man's Land
"No Man's Land" typically refers to territory that neither side in a conflict controls. In data platforms, it describes the unclear boundaries and responsibilities between DevOps and Data Engineering teams. Confusion about roles leads to poor collaboration and misaligned goals.
During this session, I will attempt to debunk the myth that data platforms must be built in ways that diverge from the standards widely accepted by DevOps and Cloud specialists. We will explore why DevOps Engineers are reluctant to engage in data projects and why Data Engineers commonly disregard best practices from the Cloud Engineering and DevOps domains.
I will present an example data platform architecture based on Azure Databricks, designed for automation and scalability. Key topics will include Networking, Security, Cost Management, and Access Management, with frequent reference to the Cloud Adoption Framework and its Cloud Scale Analytics component. We will cover the core components of an Azure Databricks solution, dividing them into central (Account, Unity Catalog) and local (Databricks Workspace) elements. Our approach will adhere to the "Everything as Code" philosophy, starting with Infrastructure as Code (IaC) tools like Terraform and Bicep, and extending to Databricks Asset Bundles wrapped in mature CI/CD processes.
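As a rough illustration of the central/local split described above, the sketch below models the three named components and derives a deployment order in which central elements are provisioned once and local elements once per workspace. It is only a minimal sketch: the `Scope`, `Component`, and `deployment_plan` names and the workspace labels are hypothetical, and the ordering is an assumption rather than the architecture presented in the session.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    CENTRAL = "central"  # provisioned once and shared by all workspaces
    LOCAL = "local"      # provisioned separately for each workspace


@dataclass(frozen=True)
class Component:
    name: str
    scope: Scope


# Components named in the abstract, grouped by its central/local division.
COMPONENTS = [
    Component("Databricks Account", Scope.CENTRAL),
    Component("Unity Catalog", Scope.CENTRAL),
    Component("Databricks Workspace", Scope.LOCAL),
]


def deployment_plan(workspaces):
    """Return ordered steps: central components once, then local ones per workspace."""
    plan = [c.name for c in COMPONENTS if c.scope is Scope.CENTRAL]
    for ws in workspaces:
        for c in COMPONENTS:
            if c.scope is Scope.LOCAL:
                plan.append(f"{c.name} [{ws}]")
    return plan


if __name__ == "__main__":
    # Hypothetical workspace labels, used only for illustration.
    for step in deployment_plan(["dev", "prod"]):
        print(step)
```

In an actual "Everything as Code" setup, this separation would more likely appear as distinct Terraform or Bicep modules and pipelines; the Python model only makes the dependency order explicit.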
We will also discuss the skills a Cloud/DevOps Engineer needs, beyond the usual ones, to successfully deliver such a platform in accordance with these principles.
In the practical part of the session, I will share lessons learned from the past few years of implementing and optimizing such platforms. I will discuss mistakes made at various stages of building and deploying them, as well as the best practices and solutions, developed over time, that have enabled us to deploy and standardize projects faster while continuously improving their quality.
Tomasz Kostyrka
Data Platform Architect, GetInData | Part of Xebia; Databricks Champion
Kraków, Poland