Session
The Rise of a Full Stack Data Scientist: Powered by Python
As data scientists, we often rely on upstream data engineering teams to deliver the data needed to train ML models at scale, and deploying those models as data applications for downstream business users is limited by one's web development experience. Using Snowpark, you can build end-to-end data pipelines, train ML models, and build data applications on top of them, all in Python.
In this talk, you will learn to build a Streamlit data application that visualizes the ROI of an example organization's advertising spend across channels.
Setup Environment: Use stages and tables to ingest and organize raw data from S3 into Snowflake.
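The stage-and-table ingestion step might look like the following Snowflake SQL sketch; the bucket, stage, table, and column names are all invented for illustration.

```sql
-- Hypothetical names throughout; adjust to your bucket and schema.
CREATE OR REPLACE STAGE campaign_data_stage
  URL = 's3://example-bucket/ad-spend/'
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

CREATE OR REPLACE TABLE campaign_spend (
  campaign   VARCHAR,
  channel    VARCHAR,
  spend_date DATE,
  total_cost NUMBER(38, 2)
);

-- Load the raw files from the external stage into the table.
COPY INTO campaign_spend FROM @campaign_data_stage;
```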
Data Engineering: Leverage Snowpark for Python DataFrames to perform data transformations such as group by, aggregate, pivot, and join to prep the data for downstream applications.
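Snowpark DataFrame operations compile down to SQL that runs inside Snowflake. As a loose, locally runnable stand-in for that step, here is the same group-by / aggregate / pivot / join shape expressed with Python's stdlib sqlite3; all table names and figures are invented.

```python
import sqlite3

# Stand-in for the Snowpark transformations (group by, aggregate, pivot, join),
# expressed in stdlib sqlite3. Tables and values are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE spend (channel TEXT, year INT, cost REAL);
INSERT INTO spend VALUES
  ('search', 2022, 100), ('search', 2023, 120),
  ('video',  2022, 80),  ('video',  2023, 90);
CREATE TABLE revenue (year INT, total REAL);
INSERT INTO revenue VALUES (2022, 400), (2023, 500);
""")

# Group by year, pivot channels into columns, and join in yearly revenue.
rows = con.execute("""
SELECT s.year,
       SUM(CASE WHEN s.channel = 'search' THEN s.cost END) AS search_cost,
       SUM(CASE WHEN s.channel = 'video'  THEN s.cost END) AS video_cost,
       r.total AS revenue
FROM spend s
JOIN revenue r ON r.year = s.year
GROUP BY s.year
ORDER BY s.year
""").fetchall()
# rows -> [(2022, 100.0, 80.0, 400.0), (2023, 120.0, 90.0, 500.0)]
```

In Snowpark the equivalent chain would use DataFrame methods (`group_by`, `agg`, `pivot`, `join`) against a live session instead of raw SQL.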
Data Pipelines: Use Snowflake Tasks to turn your data pipeline code into operational pipelines with integrated monitoring.
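A Snowflake Task that operationalizes the pipeline might be declared like this; the task, warehouse, and procedure names are hypothetical.

```sql
-- Hypothetical: run a stored procedure every 60 minutes on a named warehouse.
CREATE OR REPLACE TASK refresh_spend_rollup
  WAREHOUSE = transform_wh
  SCHEDULE = '60 MINUTE'
AS
  CALL rollup_campaign_spend();  -- procedure name is illustrative

-- Tasks are created suspended and must be resumed to start running.
ALTER TASK refresh_spend_rollup RESUME;
```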
Machine Learning: Prepare data and run ML training in Snowflake using Snowpark ML, then deploy the model as a Snowpark user-defined function (UDF).
Streamlit Application: Build an interactive application using Python (no web development experience required) to help visualize the ROI of different advertising spend budgets.
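The Streamlit app itself needs a live Snowflake session, but the ROI metric it visualizes can be sketched in plain Python. This is a minimal sketch with invented channel budgets and revenue figures, not the talk's actual code.

```python
def roi(revenue: float, spend: float) -> float:
    """Return on investment as a fraction: (revenue - spend) / spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return (revenue - spend) / spend

# Hypothetical per-channel budgets and attributed revenue, for illustration.
budgets = {"search": 250_000, "video": 200_000, "social": 150_000, "email": 100_000}
revenue = {"search": 500_000, "video": 220_000, "social": 300_000, "email": 180_000}

roi_by_channel = {ch: roi(revenue[ch], budgets[ch]) for ch in budgets}
# e.g. roi_by_channel["search"] == 1.0 (doubled the money)
```

In the app, a computation like this would feed a Streamlit chart (e.g. `st.bar_chart`) driven by slider inputs for each channel's budget.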
Target audience: any data practitioner getting started with data science, experienced data scientists looking to go full stack, or any Python developer.
Vino Duraisamy
Developer Advocate, Snowflake
San Francisco, California, United States