Vino Duraisamy
Developer Advocate, Snowflake
San Francisco, California, United States
Vino is a Developer Advocate for Snowflake. She started as a software engineer at NetApp, working on data management applications for NetApp data centers back when on-prem data centers were still a cool thing. She then hopped into the cloud and big data world and landed on the data teams at Nike and Apple, where she worked mainly on batch processing workloads as a data engineer, built custom NLP models as an ML engineer, and even touched on MLOps for model deployments. When she is not working with data, you can find her doing yoga or strolling through Golden Gate Park and Ocean Beach.
Yell at the laptop, and it will show you the right Ad: Build a voice-to-text screaming pipeline
Streaming pipelines have become increasingly difficult to build and operationalize as their complexity grows. In this talk, you will learn how Snowflake simplifies streaming architecture so users can work with streaming data easily through Snowpipe Streaming and Dynamic Tables.
Snowpipe Streaming is designed for rowsets with variable arrival frequency. Calling the streaming API prompts low-latency loads of streaming data rows using the Snowflake Ingest SDK and your own managed application code.
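To make that call pattern concrete, here is a minimal Python sketch of pushing transcript words row by row through a streaming channel. The official Snowpipe Streaming client ships in the Java-based Snowflake Ingest SDK, so the `StreamingChannel` class below is a hypothetical stand-in for its open-channel/insert-row calls, and the table and column names are assumptions rather than the talk's actual code.

```python
# Hypothetical stand-in for the Snowpipe Streaming channel exposed by the
# (Java-based) Snowflake Ingest SDK: open a channel on a target table, then
# push rows with low latency as they arrive. All names here are illustrative.
import time


class StreamingChannel:
    """Placeholder for an open Snowpipe Streaming channel on one table."""

    def __init__(self, database: str, schema: str, table: str):
        self.target = f"{database}.{schema}.{table}"

    def insert_row(self, row: dict, offset_token: str) -> None:
        # The real SDK buffers the row and flushes it to Snowflake with
        # second-level latency; this sketch just prints it.
        print(f"-> {self.target} @ offset {offset_token}: {row}")


channel = StreamingChannel("DEMO_DB", "PUBLIC", "TRANSCRIPT_WORDS")


def stream_words(words: list[str]) -> None:
    """Send each transcribed word as its own row, keyed by an offset token."""
    for i, word in enumerate(words):
        channel.insert_row({"WORD": word.lower(), "TS": time.time()}, offset_token=str(i))


stream_words("show me running shoes".split())
```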
The session includes a live demo of the following:
- Record audio from the laptop's default microphone.
- Transcribe the speech to text.
- Pipe the words from the transcript into Snowflake using Snowpipe Streaming.
- Create a Dynamic Table that aggregates words and their counts.
- Use an Ads table that references the aforementioned Dynamic Table to determine which ads to show to the user (see the sketch below).
Full session title: Yell at the laptop, and it will show you the right Ad: Building a voice-to-text screaming pipeline using Snowpipe Streaming and Dynamic Tables.
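For orientation, here is a rough sketch of the demo flow under some assumptions: microphone capture and transcription use the `speech_recognition` package, the words land in a `TRANSCRIPT_WORDS` table via the streaming channel sketched above, and an `ADS` table keyed by word already exists. Connection settings, table names, and column names are all illustrative, not the talk's actual code.

```python
# Sketch of the demo flow, assuming TRANSCRIPT_WORDS is being fed by the
# streaming channel shown earlier. Names and settings are assumptions.
import speech_recognition as sr
from snowflake.snowpark import Session

# 1. Record a short clip from the laptop's default microphone.
recognizer = sr.Recognizer()
with sr.Microphone() as source:
    audio = recognizer.listen(source, phrase_time_limit=5)

# 2. Speech-to-text transcription (Google Web Speech API via speech_recognition).
transcript = recognizer.recognize_google(audio)
print("Heard:", transcript)

# 3. Ingestion into TRANSCRIPT_WORDS happens via the Snowpipe Streaming channel.

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

# 4. Dynamic Table that keeps word counts fresh as new rows stream in.
session.sql("""
    CREATE OR REPLACE DYNAMIC TABLE WORD_COUNTS
        TARGET_LAG = '1 minute'
        WAREHOUSE = DEMO_WH
    AS SELECT WORD, COUNT(*) AS CNT
       FROM TRANSCRIPT_WORDS
       GROUP BY WORD
""").collect()

# 5. Pick the ad matching the most-shouted keyword (ADS table is assumed).
top_ad = session.sql("""
    SELECT a.AD_URL
    FROM ADS a
    JOIN WORD_COUNTS w ON a.KEYWORD = w.WORD
    ORDER BY w.CNT DESC
    LIMIT 1
""").collect()
print("Ad to show:", top_ad)
```

Because the Dynamic Table refreshes itself within the configured target lag, the ad lookup always reflects the latest shouting without any hand-rolled orchestration.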
The Rise of a Full Stack Data Scientist: Powered by Python
As data scientists, we often rely on upstream data engineering teams to deliver the right data for training ML models at scale, and deploying those models as a data application for downstream business users is constrained by one's web development experience. With Snowpark, you can build end-to-end data pipelines, train ML models, and build data applications on top of those models, all from scratch using only Python.
In this talk, you will learn to build a Streamlit data application that helps visualize the ROI of an example organization's different advertising spend budgets.
Setup Environment: Use stages and tables to ingest and organize raw data from S3 into Snowflake.
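As a minimal sketch of this step (the bucket, file format, and table names are assumptions, and the storage integration or credentials for the external stage are omitted):

```python
# Stage raw CSV files from S3 and load them into a Snowflake table.
# Bucket, format, and table names are illustrative; auth for the external
# stage (storage integration or credentials) is omitted for brevity.
from snowflake.snowpark import Session

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

session.sql("""
    CREATE OR REPLACE STAGE RAW_SPEND_STAGE
        URL = 's3://<bucket>/advertising/'
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""").collect()

session.sql("""
    CREATE TABLE IF NOT EXISTS CAMPAIGN_SPEND (
        DATE DATE, CHANNEL STRING, TOTAL_COST NUMBER(10, 2)
    )
""").collect()

session.sql("COPY INTO CAMPAIGN_SPEND FROM @RAW_SPEND_STAGE").collect()
```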
Data Engineering: Leverage Snowpark for Python DataFrames to perform data transformations such as group by, aggregate, pivot, and join to prep the data for downstream applications.
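A minimal Snowpark for Python sketch of these transformations, with assumed table and column names:

```python
# Illustrative Snowpark DataFrame transformations; table and column names
# (CAMPAIGN_SPEND, MONTHLY_REVENUE, channel values) are assumptions.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, month, year
from snowflake.snowpark.functions import sum as sum_

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

spend = (
    session.table("CAMPAIGN_SPEND")
        .with_column("YEAR", year(col("DATE")))
        .with_column("MONTH", month(col("DATE")))
)
revenue = session.table("MONTHLY_REVENUE")

# Group by year/month/channel, aggregate cost, then pivot channels into columns.
spend_per_channel = (
    spend.group_by("YEAR", "MONTH", "CHANNEL")
         .agg(sum_("TOTAL_COST").alias("TOTAL_COST"))
)
spend_pivoted = (
    spend_per_channel
        .pivot("CHANNEL", ["search_engine", "social_media", "video", "email"])
        .sum("TOTAL_COST")
        .to_df("YEAR", "MONTH", "SEARCH_ENGINE", "SOCIAL_MEDIA", "VIDEO", "EMAIL")
)

# Join with revenue to produce the model-ready feature table.
features = spend_pivoted.join(revenue, ["YEAR", "MONTH"])
features.write.save_as_table("SPEND_AND_REVENUE", mode="overwrite")
```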
Data Pipelines: Use Snowflake Tasks to turn your data pipeline code into operational pipelines with integrated monitoring.
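A short sketch of wrapping the pipeline in a task; the `BUILD_SPEND_FEATURES` stored procedure name and the one-hour schedule are assumptions:

```python
# Schedule the feature-building logic as a Snowflake Task. The stored
# procedure name and the schedule are illustrative assumptions.
from snowflake.snowpark import Session

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

session.sql("""
    CREATE OR REPLACE TASK REFRESH_SPEND_FEATURES
        WAREHOUSE = DEMO_WH
        SCHEDULE = '60 MINUTE'
    AS
        CALL BUILD_SPEND_FEATURES()
""").collect()

# Tasks are created suspended; resume to start the schedule.
session.sql("ALTER TASK REFRESH_SPEND_FEATURES RESUME").collect()
```

For the integrated monitoring side, task run history can be queried with the `INFORMATION_SCHEMA.TASK_HISTORY` table function.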
Machine Learning: Prepare data and run ML Training in Snowflake using Snowpark ML and deploy the model as a Snowpark User-Defined-Function (UDF).
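A compressed sketch of the idea, using scikit-learn as a local stand-in for the Snowpark ML training step (the talk itself trains with Snowpark ML inside Snowflake); the feature table and column names continue the assumptions above:

```python
# Train a simple regression on the feature table, then expose it as a
# Snowpark UDF so SQL and Streamlit can call it. scikit-learn stands in
# for Snowpark ML here; names and columns are assumptions.
from sklearn.linear_model import LinearRegression
from snowflake.snowpark import Session
from snowflake.snowpark.types import FloatType

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

df = session.table("SPEND_AND_REVENUE").to_pandas()
feature_cols = ["SEARCH_ENGINE", "SOCIAL_MEDIA", "VIDEO", "EMAIL"]  # assumed columns
model = LinearRegression().fit(df[feature_cols], df["REVENUE"])
coefs = [float(c) for c in model.coef_]
intercept = float(model.intercept_)


def predict_roi(search: float, social: float, video: float, email: float) -> float:
    # Re-implements the fitted linear model with plain floats so the
    # pickled UDF carries no extra package dependencies.
    budgets = [search, social, video, email]
    return intercept + sum(c * b for c, b in zip(coefs, budgets))


# For use from other sessions (e.g. the Streamlit app), register it as a
# permanent UDF with is_permanent=True and a stage_location.
session.udf.register(
    predict_roi,
    name="PREDICT_ROI",
    return_type=FloatType(),
    input_types=[FloatType()] * 4,
    replace=True,
)
```

Once registered, the model can be scored straight from SQL, e.g. `SELECT PREDICT_ROI(250, 250, 200, 100)`.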
Streamlit Application: Build an interactive application using Python (no web development experience required) to help visualize the ROI of different advertising spend budgets.
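A bare-bones sketch of such an app, assuming the table names and the `PREDICT_ROI` UDF from the sketches above; run it with `streamlit run streamlit_app.py`:

```python
# streamlit_app.py -- minimal ROI explorer. Table names, slider ranges,
# and the PREDICT_ROI UDF carry over from the assumed sketches above.
import streamlit as st
from snowflake.snowpark import Session

st.title("Advertising Spend ROI Explorer")

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "DEMO_WH", "database": "DEMO_DB", "schema": "PUBLIC",
}).create()

# Let the user experiment with per-channel budgets.
search = st.slider("Search engine budget ($K)", 0, 500, 250)
social = st.slider("Social media budget ($K)", 0, 500, 250)
video = st.slider("Video budget ($K)", 0, 500, 250)
email = st.slider("Email budget ($K)", 0, 500, 250)

# Score the budget mix with the UDF registered earlier.
predicted = session.sql(
    f"SELECT PREDICT_ROI({search}, {social}, {video}, {email}) AS REVENUE"
).collect()[0]["REVENUE"]
st.metric("Predicted revenue ($K)", f"{predicted:,.0f}")

# Show the historical feature table for context.
history = session.table("SPEND_AND_REVENUE").to_pandas()
st.dataframe(history)
```

Everything above is plain Python; there is no HTML, CSS, or JavaScript involved.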
Target audience: any data practitioner who wants to get started with data science, experienced data scientists looking to go full-stack, or any Python developer.
Current 2023: The Next Generation of Kafka Summit