Speaker

Dash D

Director of Platform and Technical Evangelism

San Francisco, California, United States

Dash Desai has 18+ years of hands-on software and data engineering experience. With recent experience in Big Data, Data Science, and Machine Learning, Dash applies his technical skills to help build solutions that solve business problems and surface trends that shape markets in new ways.

Dash has worked for global enterprises and tech startups in agile environments as an engineer and a solutions architect. As a Platform Technical Evangelist, he is passionate about evaluating new ideas to help articulate how technology can address a given business problem. He also enjoys writing technical blog posts and hands-on tutorials, and conducting technical workshops.

Area of Expertise

  • Information & Communications Technology
  • Arts
  • Travel & Tourism

Model Experiments Tracking and Registration using MLflow on Databricks

Machine learning models are only as good as the quality of the data and the size of the datasets used to train them. Surveys have shown that data scientists spend around 80% of their time preparing and managing data for analysis, and that 57% of data scientists regard cleaning and organizing data as the least enjoyable part of their work. This further validates the idea of MLOps and the need for collaboration between data scientists and data engineers.

During the crucial phase of data acquisition and preparation, data scientists identify what types of (trusted) datasets are needed to train models and work closely with data engineers to acquire data from viable data sources.

Another important aspect of the ML lifecycle is experimentation, where data scientists take sufficient subsets of (trusted) datasets and create several models in a rapid, iterative manner. Without proper industry standards, data scientists have to rely on manual tracking of models, inputs, hyperparameters, outputs, and any other such artifacts throughout the model experimentation and development process.

In this talk, you will learn how to automate these crucial tasks using StreamSets and MLflow on Databricks.
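To make the tracking problem concrete, here is a minimal sketch of the bookkeeping that an experiment tracker takes over: recording each run's hyperparameters and metrics, then comparing runs. It uses only the Python standard library and is not the MLflow API; the function names (log_run, load_runs, best_run), the file layout, and the sample values are all hypothetical. In MLflow, calls such as mlflow.log_param and mlflow.log_metric inside mlflow.start_run play a similar role.

```python
import json
import tempfile
from pathlib import Path


def log_run(log_path, params, metrics):
    """Append one experiment run (hyperparameters and metrics) as a JSON line."""
    record = {"params": params, "metrics": metrics}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_runs(log_path):
    """Read back every recorded run, skipping blank lines."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_run(runs, metric):
    """Return the run with the highest value for the given metric."""
    return max(runs, key=lambda r: r["metrics"][metric])


if __name__ == "__main__":
    log_path = Path(tempfile.mkdtemp()) / "runs.jsonl"
    # Hypothetical hyperparameters and scores, for illustration only.
    log_run(log_path, {"learning_rate": 0.1, "epochs": 5}, {"accuracy": 0.91})
    log_run(log_path, {"learning_rate": 0.01, "epochs": 20}, {"accuracy": 0.94})
    print(best_run(load_runs(log_path), "accuracy")["params"])
```

Keeping every run in an append-only log is what makes later comparison possible; a tracking server adds shared storage, a UI, and model registration on top of the same idea.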

Low-Latency Inference Using TensorFlow Models In Dataflow Pipelines

The real value of a modern data platform is realized only when business users and applications are able to access raw and aggregated data from a range of sources, and produce data-driven insights in a timely manner. With machine learning, analysts and data scientists can leverage historical data to help make better, data-driven business decisions, both offline and in real time, using technologies such as TensorFlow.

In this talk, you will learn how to train a simple neural network TensorFlow model in Python and use it in a dataflow pipeline created in the open source StreamSets Data Collector. The dataflow pipeline will ingest breast cancer data and classify cancer conditions as benign or malignant (using the trained and saved TF model) within a contained environment, without having to initiate HTTP or REST API calls to ML models served and exposed as web services.

Spark ETL To Derive Sales Insights on Azure HDInsight And Power BI

In this session, we will review how easy it is to set up an end-to-end data pipeline that runs on StreamSets Transformer to perform extract, transform, and load (ETL) operations. The pipeline will run on an Apache Spark for Azure HDInsight cluster to extract raw data and transform it (cleanse and curate) before storing it in multiple destinations for efficient downstream analysis. The pipeline will also leverage technologies like Azure Data Lake Storage Gen2 and Azure SQL Database, and the curated data will be queried and visualized in Power BI.

DeveloperWeek 2021 Sessionize Event

February 2021 Oakland, California, United States
