Speaker

Lisa N. Cao

Product Manager at Datastrato

Lisa is a data engineer and now a product manager interested in observability, validation, and reliability in data systems. Through her work at Datastrato, she is developing new and improved ways to leverage metadata in AI stacks for DataOps and Data Fabric integrations. Her background spans start-ups, nonprofits, consulting firms, GovTech, and biotechnology. She is a Google Women TechMakers Ambassador, Linux Foundation LiFT recipient for Women in Open Source, founder and chair of the Vancouver Datajam, and lead maintainer of the BiocSwirl project.

Area of Expertise

  • Business & Management
  • Information & Communications Technology

Why Open Source is Key to Future Data and AI Governance

As data and AI systems become increasingly central to enterprise and societal decision-making, governance challenges around transparency, compliance, and trust are more critical than ever. Open source plays a fundamental role in shaping the future of data and AI governance by fostering collaboration, auditability, and interoperability. This has resulted in various emerging open-source projects aiming to provide a unified metadata and governance layer for organizations to manage data assets across diverse platforms while ensuring compliance with evolving regulations. This talk explores how open-source solutions like Apache Gravitino empower enterprises to take control of their data and AI governance strategies, mitigate risks, and drive responsible AI adoption.

Finding product-market fit as an open source company

Does being an open source company make it easier, harder or just different to find product-market fit? What is the relationship between product-market fit and project-market fit? In this session, we'll go over some of the basics of product for engineering-driven startups and considerations for striving for PMF in the open source space. This session will also include an open discussion and case studies.

Open Source DataOps and MLOps Strategies

Here we will try to demystify data's hardest problems: interoperability, standardization, and vendor lock-in. From pipelines to serving models, this session discusses strategies for promoting open source technologies as groups implement their own DataOps and MLOps infrastructures.

Fundamentals of DataOps

While building pipeline after pipeline, we might wonder: what comes next? Automation and data quality, of course! Organizations today face complex challenges in the end-to-end deployment of data applications, from initial development to operational maintenance. This process requires seamless integration of CI/CD practices, containerization, data infrastructure, MLOps, and security measures. This session presents strategies and a complete beginner's roadmap for groups implementing their own DataOps infrastructure from scratch, empowering developers, architects, and decision-makers to use open-source tools and frameworks effectively for streamlined, secure, and scalable ML application deployments.

History and Future of Iceberg REST Catalogs

While Iceberg concentrates primarily on its role as an open data format for lakehouse implementations, it relies heavily on its catalog for tracking tables and for allowing external tools to interface with the metadata. In Iceberg 0.14.0, the community introduced the REST OpenAPI Specification, and there is a rich history behind why it was developed and why the Iceberg community decided not to provide its own service. In 2024 especially, we’ve seen many third-party catalog service providers pop up instead, each with its own unique flavour. Realistically, though, what outcome can we expect from this widespread adoption? Together, we’ll review not only the history of the REST Catalog Spec, but also the future of the many offshoot services it has sparked. Please note that this talk is not a comparison of the catalog service providers. Instead, it covers the Iceberg community’s rationale for providing a spec and why everyone is placing their bets on Iceberg as the next standard.
