Lisa N. Cao

Product Manager at Datastrato


Lisa is a data engineer turned product manager interested in observability, validation, and reliability in data systems. Through her work at Datastrato, she is developing new and improved use cases for metadata in AI stacks, with a focus on DataOps and Data Fabric integrations. Her background spans start-ups, nonprofits, consulting firms, GovTech, and biotechnology. She is a Google Women TechMakers Ambassador, Linux Foundation LiFT recipient for Women in Open Source, founder and chair of the Vancouver Datajam, and lead maintainer of the BiocSwirl project.


DataOps for the Absolute Beginner

While building pipeline after pipeline, we might wonder: what comes next? Automation and data quality, of course! Here we will try to demystify data engineering's hardest scaling problems: interoperability, standardization, and orchestration. This session discusses strategies and a complete beginner's roadmap for groups trying to implement their own DataOps infrastructure from scratch.

Managing Complexity in Multidimensional Data Architectures with Metadata

The rise of multidimensional data architectures has brought unprecedented power and complexity to data management. In this talk, we discuss practical approaches and organizational strategies for setting up and using metadata, and how metadata and metadata fields can be harnessed to simplify, optimize, and streamline complex multidimensional, heterogeneous, and multicloud data architectures within an organization.

Introduction to Metadata Platforms for Machine Learning

As the landscape of machine learning evolves and becomes increasingly intensive, robust and readily available metadata becomes paramount in managing data systems. In this talk, we discuss practical ways of using metadata to optimize machine learning workflows, enable data discoverability, and streamline data management for organizations of any size and on any stack.

Inclusive Datathons: Vancouver Datajam's Success Story

After organizing diversity-focused community data science events for 8 years and running a datathon for 5, I have learned the many dos and don'ts for creating a welcoming, inclusive, and productive experience for participants.

Join the Vancouver Datajam organizing team as we discuss the case study of the Vancouver Datajam and its unapologetic approach to machine learning and data hacking that is both beginner-friendly and advanced.

Open Source DataOps and MLOps Strategies

Here we will try to demystify data's hardest problems: interoperability, standardization, and vendor lock-in. From pipelines to serving models, this session discusses strategies for promoting open source technologies as groups try to implement their own DataOps and MLOps infrastructure.

Maintaining Diverse Maintainers: How to Keep Your Project Inclusive

After maintaining open source projects with diverse teams for 5+ years, I've learned some key ways to keep an open source project inclusive. Whether it's the platforms you use, communication style, development flexibility, project promotion, or keeping the contribution barrier low, there are many small strategies that can be used to increase representation and community connection.

The Quick and Dirty Guide to Metadata

Metadata: what is it? What are its use cases? In this quick and dirty guide, you'll learn how metadata from various sources can be leveraged to better orchestrate and inform data management practices, observability, and data governance, all essentials for any data-driven organization looking to scale. We will go over key examples of metadata, such as information about your data's form and structure, catalog records, and generally any data about data, and how to use it.

To Mesh, or Not to Mesh? How to Know When a Fabric is Good Enough

As big data has taken the world by storm, serving and maintaining its infrastructure has grown increasingly complex as well. How do we know what architecture is right for us? As incredible as mesh is, it takes a lot of investment and work to implement. In this lightning talk, we go over some intermediary data architectures that will help platformize your data serving without having to go too far into the deep end.

Metadata Lakes for Next-Gen AI/ML

As data catalogs evolve to meet the growing and new demands of high-velocity, unstructured data, we see them taking new shape as an emergent and flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI enablement in RAG pipelines, in response to the new demands of the ecosystem. We will also discuss Apache Gravitino (incubating) and its open source-first approach to data cataloging across multicloud and geo-distributed architectures.


