Session
Building a Serverless Data Lake Platform
This session shares lessons from building a data lake platform using serverless components in an enterprise setup. We will dig deeper into data lakes from first principles, the socio-technical challenges around adoption, and the various tradeoffs that were made during the design phase.
I will talk about my experience of building a data lake platform at Polestar cars.
The talk has 4 major parts:
1. The business context - I will begin with the use case and cover the technological needs of the project.
2. Why we chose a data lake over other data paradigms - Here I will explain why we chose a data lake to implement this platform. I will break down the data lake concept into its "logical" components and highlight the AWS services that map to them.
3. Architectural walkthrough of our solution
I will explain why we chose to make it a platform and how the operational model changes when the platform is centralised rather than run as decentralised data lakes.
4. Lessons learnt from the project.
I will talk about the friction we faced in adoption, the challenges around data security and governance, and training management to change their way of thinking so that they could adopt the platform.
Key questions answered in the session, as takeaways for the participants:
1. When should data lakes be built as a platform, i.e. centralised vs decentralised?
2. How to ensure the data lake remains usable?
3. How to enhance developer experience via self-service mechanisms?
4. When should you use a data lake over other data products such as a data warehouse?
Anurag Kale
AWS Data Hero, Cloud and Data Architect at Aurobay Sweden
Göteborg, Sweden