Nick Baker
Sr. Analytics Engineer @ Spotify
Nick started his data journey attempting to automate finance and accounting functions at LUXTECH, an LED lighting startup in Philadelphia. After that, he worked as an analyst and data scientist at Boxed, where he led the effort to implement dbt, and as an analytics engineer at Venmo. Now, Nick supports podcast analytics at Spotify and builds tools that help other analytics engineers adopt dbt for their teams.
How Content Analytics at Spotify leverages dbt to strategically ingest and export data in GCS
aka Avoiding Data Indigestion:
Spotify has been a power GCS user since its early days, and as a result we built the majority of our data ecosystem on an internal data transformation tool that writes data with sharded partitions. When we adopted dbt as our team’s primary data transformation tool, we faced the challenge of strategically accessing data produced by other teams. To do this, we developed an internal package called Waluigi (the opposite of Luigi) that offers a variety of options for accessing a specific partition, the most recent partition, or a list or range of partitions. The tables we write out of dbt are all natively partitioned, so as more teams shift from our internal transformation tool to dbt, we have had to build similar access strategies for natively partitioned tables. This not only allows our team to access the data we need efficiently and safely, but also empowers other teams to adopt dbt and leverage the data we produce.