Speaker

Shawn Kyzer

Associate Director of Data Engineering @ AstraZeneca

Barcelona, Spain

Shawn is passionate about harnessing the power of data strategy, engineering, and analytics to help businesses uncover new opportunities. As an innovative technologist with over 15 years of experience, Shawn removes technology as a barrier and broadens the art of the possible for business and product leaders. His holistic view of technology and his emphasis on developing and motivating strong engineering talent, with a focus on delivering outcomes whilst minimising outputs, are among the characteristics that set him apart from the crowd.

Shawn’s deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning, and engineering analytics platforms. He has years of experience working as a consultant practitioner for a variety of prestigious clients, ranging from secret-clearance-level government organizations to Fortune 500 companies.

Area of Expertise

  • Information & Communications Technology
  • Finance & Banking
  • Health & Medical
  • Energy & Basic Resources
  • Consumer Goods & Services

Topics

  • Data Science (AI/ML)
  • Data Strategy
  • LGBTQ in Technology
  • Data Analytics
  • Big Data Machine Learning AI and Analytics
  • Data Governance
  • Streaming Data Analytics
  • Machine Learning and Artificial Intelligence
  • Data Strategy & Leadership
  • MLOps
  • All things data
  • Data Mesh

Scaling Machine Learning with Data Mesh

With the rapid rise in popularity of Data Mesh, we now approach new frontiers in the Data Mesh space to solve for more complex scenarios such as model training at scale. This talk will discuss how to architect your Data Mesh platform to create scalable, self-service Machine Learning Data Products, allowing both Data Scientists and Machine Learning Engineers to easily provision and deploy infrastructure, reducing time to market while also gaining all the benefits of Data Mesh.

I will focus on the common use case of anomaly detection in a closed-loop Convolutional Neural Network (CNN) to demonstrate the benefits of adopting the Data Mesh paradigm across a multi-plane data platform in Machine Learning operations. With this example, we will learn how to make the leap from model experimentation to productisation while adhering to the common affordances of a data product, such as observability, life-cycle management, and discoverability.

Turbocharge your Data Analytics Plane with AI

In this two-part workshop series, we will step through how you can leverage AI in your current Data Analytics Plane. This is an interactive session and we expect that you will be following along as we go, but don’t worry: we have git repos and notebooks at the ready. All you need to bring is your laptop and, if you prefer not to use the data sets we provide, your favourite training data sets.

Part I : expert-system-gpt

Writing good documentation and finding answers to internally sourced questions is tough, so let's create our own in-house expert to help us out. We will create our own expert system by leveraging the power of an open source foundation model, GPT-NeoX. We will walk through the complete end-to-end process, from experimentation in notebooks to productionisation and, finally, deployment as an API or Gradio application that anyone can use internally, in a secure fashion, for any number of applications. Along the way you will also learn how GPT models work, and therefore both their capabilities and their limitations.

Session Repository: https://github.com/ShawnKyzer/expert-system-gpt

Part II : synthetic-data-generator

We will create a machine learning pipeline to generate time series and other types of datasets using GANs (Generative Adversarial Networks) and LSTM models. We will go from our initial experimentation notebooks to writing production-ready ML pipelines that you can deploy in your own cloud environment for use by your teams. Once you are done, you will not have to rely on using production data in development pipelines again!

Session Repository: https://github.com/ShawnKyzer/synthetic-data-generator

The Power of Interdisciplinary Teams: A Matter of Public Health

Often the most innovative ideas are born out of multidisciplinary collaborations between technology and other fields, increasingly including public health. In this session we will discuss what this looks like in practice by leveraging an actual use case, “Exploring Social Media indicators to Predict COVID-19 Trends”. We begin with initial data ingestion, then move to data exploration and model experimentation. Finally, we will discuss the path to production and publication of the outcomes of the research.

We focus on four roles: Data Engineer, Machine Learning Engineer, Data Scientist and Public Health Domain Expert. In this interactive session you will be able to walk through the end-to-end process of how these roles interact with each other. We will cover how teams can work together and transition from messy data wrangling, to experimentation, to production and, finally, publication. We will demonstrate how you can use various open source tools to build data pipelines, co-analyze datasets and perform predictive analysis using machine learning in the context of the COVID-19 pandemic.

At the end of this session you will leave with a better understanding of public health pandemic analyses and a greater understanding of data flow. When we are done we hope to inspire and empower you to work on your own project bringing meaningful insights to the world with data!

Session Repository: https://github.com/ShawnKyzer/who-ears-social-listening
