Ana Margarita Medina

Sr. Staff Developer Advocate

San Francisco, California, United States

Ana Margarita Medina is a Sr. Staff Developer Advocate who speaks on all things SRE, DevOps, and Reliability. She is a self-taught engineer with over 14 years of experience, focusing on cloud infrastructure and reliability. She has been part of the Kubernetes Release Team since v1.25, serves on the Kubernetes Code of Conduct Committee, and is on the GC for CNCF's Keptn project. When time permits, she leads efforts to dispel the stigma surrounding mental health and bring more Black and Latinx folks into tech.

Awards

  • Most Active Speaker 2023

Area of Expertise

  • Information & Communications Technology

Topics

  • SRE
  • DevOps
  • Software Development
  • Cloud & DevOps
  • DevOps & Automation
  • DevOps Transformation
  • DevOps Culture
  • DevOps Skills
  • Development
  • Leadership Development
  • Software Design
  • Software Engineering
  • Open Source Software
  • Software Architecture
  • Software Testing
  • Modern Software Development
  • Distributed Software Systems
  • Cloud Native
  • Cloud Computing
  • Cloud Architecture
  • Cloud & Infrastructure
  • Google Cloud Platform
  • Cloud Containers and Infrastructure
  • Web Development
  • AWS
  • Enterprise Software
  • Site Reliability Engineering
  • Reliability
  • AWS ECS
  • AWS DevOps
  • Cloud Native Infrastructure
  • Getting Started with Site Reliability Engineering
  • Observability
  • Service Reliability Engineering
  • Site Reliability
  • Monitoring & Observability
  • Observability and Performance
  • App Observability
  • Monitoring
  • Chaos Engineering

Yes, Observability Landscape as Code is a Thing!

Abstract:
=======
I started my own Observability journey a little over a year ago, when I managed the Observability Practices team at Tucows/Wavelo. As part of my journey, I learned about Observability and OpenTelemetry, and dove into what it takes to achieve Observability greatness at an organization-wide scale. Part of that involved understanding my Observability Landscape, and how it can be codified to ensure consistency, maintainability, and reproducibility.

Summary:
========
Observability is about good practices. Good practices are useless unless you have a consistent landscape. In order to support these practices, there are a number of setup-type things that are required to enable teams to truly unlock Observability’s powers.

With so many integrations and moving parts, it can be hard to keep track of all the things that you need in order to achieve Observability greatness. This is where Observability-Landscape-as-Code (OLaC) can help. OLaC means supporting Observability by codifying your Observability landscape to ensure:
* Consistency
* Maintainability
* Reproducibility
* Reliability

The Observability Landscape is made up of the following:
* Application instrumentation
* Collecting and storing application telemetry
* An Observability back-end
* A set of meaningful SLOs
* An Incident Response system for alerting on-call Engineers

Observability-Landscape-as-Code makes this possible through the following practices:
1. Instrumenting your code with OpenTelemetry
2. Codifying the deployment of the OTel Collector
3. Using a Terraform Provider to configure your Observability back-end
4. Codifying SLOs using the vendor-neutral OpenSLO specification
5. Using APIs to configure your Incident Management systems

This talk digs into the above practices that support OLaC as part of my personal Observability journey (see journey details below).
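
As an illustration of practice 1, here is a minimal sketch of instrumenting a Python service with the opentelemetry-python SDK and exporting spans to an OTel Collector over OTLP. The service name, Collector endpoint, and function are placeholders for illustration, not code from the talk.

# Minimal OpenTelemetry tracing setup: describe the service, export spans to a Collector.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Identify the service emitting telemetry (placeholder name).
resource = Resource.create({"service.name": "checkout-service"})

# Wire up a tracer provider that batches spans and ships them to the OTel Collector (placeholder endpoint).
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

def place_order(order_id: str) -> None:
    # Wrap the unit of work in a span so it shows up in the Observability back-end.
    with tracer.start_as_current_span("place_order") as span:
        span.set_attribute("order.id", order_id)
        ...  # business logic goes here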

Chaos Engineering Bootcamp

Chaos engineering is the practice of conducting thoughtful, planned experiments designed to reveal weaknesses in our systems. This hands-on workshop will share how you can get started practicing Chaos Engineering. We will cover the tools and practices needed to implement this in your organization and discover how other companies are using this practice to create reliable distributed systems.

During this workshop, attendees will be broken up into teams of four, and each assigned a role that is critical to the Chaos Engineering Experiment process. Folks will work together as a team to plan and execute various Chaos Engineering experiments with the guidance of the speakers.

Ana will provide cloud infrastructure, a demo environment, access to chaos and monitoring tools, and printed material for designing experiments. She will cover the foundations of chaos engineering, give attendees time for hands-on practice, and then discuss how to take your practice further, along with wins from across the industry.

Building Islands and Reliability

We might have just spent the last year mastering the art of having the perfect flower field around our home, keeping the weeds out of our island, or just trying to build relationships with our neighbors. The skills and the lessons we’ve mastered through building our islands can also help us in real life, from staying connected to building stronger engineering teams and applications. Let’s take a moment to look at the similarities between the work we’ve done on our islands, on ourselves, and in our workplaces, and celebrate all we’ve learned.

Don't Forget the Humans

We spend all day thinking about our technical systems, but we often neglect the needs of our human systems. Ana and Julie will walk attendees through the principles of system reliability and how to apply them not only to their systems but also to their personal lives to prevent burnout and enjoy their weekends more.

In this talk, attendees will learn how to apply incident response and blameless practices to their everyday activities. Attendees will also walk away knowing how to build reliable socio-technical systems, along with some tips for applying them in the workplace.

OKRs with BLOs & SLOs via User Journeys

We hear it in commercials, in job interviews, and in the applications we use. “Users matter!” or “Customer experience is built into our culture and values!” But how are we proactively following what our organizations are preaching?

Observability-as-Code in Practice with Terraform

Observability has quickly become a part of the foundation of modern SRE practices. Observability is about good practices, and as any good SRE knows, its codification is crucial to ensure consistency, maintainability, repeatability, and reliability. In order to support Observability as part of your SRE practice, there are a number of setup-type things that are required to enable teams to truly unlock Observability’s powers. This is where Observability-as-Code (OaC) can help.
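
As one hedged sketch of what this codification can look like, assuming your observability resources (dashboards, monitors, SLOs) live in a Terraform directory managed by your vendor's Terraform provider, the config can be applied from automation instead of through a UI. The directory path is a placeholder.

# Apply a codified observability landscape from CI rather than by hand.
import subprocess

OBSERVABILITY_DIR = "infra/observability"  # hypothetical directory of .tf files

def apply_observability_landscape() -> None:
    # Initialize providers and modules, then plan and apply the codified resources.
    subprocess.run(["terraform", "init"], cwd=OBSERVABILITY_DIR, check=True)
    subprocess.run(["terraform", "plan", "-out=tfplan"], cwd=OBSERVABILITY_DIR, check=True)
    subprocess.run(["terraform", "apply", "tfplan"], cwd=OBSERVABILITY_DIR, check=True)

if __name__ == "__main__":
    apply_observability_landscape()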

The Evolution of GameDays

How does your team prepare for failure and learn from incidents? GameDays are a time to come together as a team and organization to explore failure and learn. This practice has been done across most industries, from fire departments to technology companies. Sometimes it has meant unplugging data centers, running table-top exercises, or executing chaos engineering experiments. All of these approaches have one thing in common: learning. In this session, I will look back at my SRE experience and at how GameDays have evolved in other industries to share tips, so you can make your teams and companies more reliable.

Continuous Reliability. How?

As engineers we expect our systems and applications to be reliable. And we often test to ensure that at a small scale or in development. But when you scale up and your infrastructure footprint increases, the assumption that conditions will remain stable is wrong. Reliability at scale does not mean eliminating failure; failure is inevitable. How can we get ahead of these failures and ensure we do it in a continuous way?

One of the ways we can go about this is by implementing solutions like CNCF’s sandbox project Keptn. Keptn allows us to leverage the tooling we already use and build pipelines where we execute chaos engineering experiments and performance testing while enforcing SLOs. Ana will share how you can start simplifying cloud-native application delivery and operations with Keptn to ensure you deploy reliable applications to production.
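
To make the quality-gate idea concrete, here is a minimal, illustrative Python sketch of gating a rollout on an SLO's remaining error budget. This is not Keptn's actual API; the SLO, numbers, and function names are hypothetical.

# Gate a deployment on whether the error budget for an SLO is still intact.
from dataclasses import dataclass

@dataclass
class SLO:
    name: str
    objective: float  # e.g. 0.999 means 99.9% of requests must succeed

def error_budget_remaining(slo: SLO, good_events: int, total_events: int) -> float:
    # Fraction of the error budget still unspent (1.0 = untouched, below 0 = blown).
    allowed_failures = (1 - slo.objective) * total_events
    actual_failures = total_events - good_events
    if allowed_failures == 0:
        return 0.0 if actual_failures else 1.0
    return 1 - (actual_failures / allowed_failures)

def quality_gate(slo: SLO, good_events: int, total_events: int) -> bool:
    # Pass the gate only while error budget remains; otherwise hold the rollout.
    return error_budget_remaining(slo, good_events, total_events) > 0

availability = SLO(name="checkout-availability", objective=0.999)
# 10,000 requests observed during the evaluation window, 9,993 succeeded.
print(quality_gate(availability, good_events=9_993, total_events=10_000))  # True: budget remains, roll out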

5 lessons I’ve learned after falling down and getting back up

Over the last few years of working in the DevOps space, I’ve experienced a lot of failures and successes to get where I’m at. I’ve brought down multiple services I’ve worked on, under-provisioned resources, and even burned out. But situations like these allowed me to re-evaluate my engineering processes, implementations, and even work/life balance. Sometimes things need to break or fall apart before they can get better.

I’ll share my journey from self-taught software engineer to site reliability engineer to developer advocate. These ups and downs have constantly reminded me to rethink the way I get things done, so I can get back up and make my processes and systems more reliable. Join me as I share what I’ve learned on my journey, so it can help you on yours.

A Key to Success: Failure with Chaos Engineering

Chaos Engineering is the practice of thoughtful, planned experiments designed to reveal the weaknesses in our systems. Ana will discuss how performing Chaos Engineering experiments and celebrating failure helps engineers build muscle memory, spend more time building features, and build more resilient complex systems.

Getting Started with Chaos Engineering

Chaos engineering is the practice of conducting thoughtful, planned experiments designed to reveal weaknesses in our systems. It can be thought of as the facilitation of experiments to uncover systemic weaknesses.

This talk will introduce you to the practice of Chaos Engineering and explain how to get started practicing it in your organization. You will also learn how to plan your first Chaos Day. You've heard of Hack Days, where you focus on feature development; Chaos Days are an opportunity to encourage your whole organization to focus on building more reliable systems.

Brief outline:
- An Introduction to Chaos Engineering
- A guide to getting started with Chaos Engineering in your organization
- Planning your first Chaos Day
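
For a feel of what a first experiment looks like, here is a minimal Python sketch of the experiment loop described above: check the steady-state hypothesis, inject a fault, observe, and roll back. The health-check URL and the fault-injection step are placeholders for your own service and chaos tooling.

# A bare-bones chaos experiment: hypothesis, fault injection, observation, rollback.
import urllib.request

TARGET = "http://localhost:8080/health"  # hypothetical service under test

def steady_state_ok() -> bool:
    # Hypothesis: the service answers its health check with HTTP 200.
    try:
        with urllib.request.urlopen(TARGET, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def inject_fault() -> None:
    # Placeholder: e.g. kill a pod, add latency, or exhaust CPU with your chaos tooling.
    print("injecting fault (placeholder)")

def rollback() -> None:
    print("halting experiment and restoring normal conditions")

def run_experiment() -> None:
    assert steady_state_ok(), "System is not healthy; do not start the experiment."
    inject_fault()
    if steady_state_ok():
        print("Hypothesis held: the system tolerated the fault.")
    else:
        print("Weakness found: the steady state was broken. Time to learn and fix.")
    rollback()

if __name__ == "__main__":
    run_experiment()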

2024 All Day DevOps Sessionize Event

October 2024

Open Source Summit North America 2024 Sessionize Event

April 2024 Seattle, Washington, United States

Maintainer Track + ContribFest: KubeCon + CloudNativeCon Europe 2024 Sessionize Event

March 2024 Paris, France

KubeCon + CloudNativeCon North America 2023 Sessionize Event

November 2023 Chicago, Illinois, United States

DevOpsDays Seattle 2023 Sessionize Event

August 2023 Seattle, Washington, United States

Devopsdays New York City 2023 Sessionize Event

June 2023

KubeHuddle Toronto 2023 Sessionize Event

May 2023 Toronto, Canada

SLOconf 2023 Sessionize Event

May 2023

DevOpsDays Austin 2023 Sessionize Event

May 2023 Austin, Texas, United States

2022 All Day DevOps Sessionize Event

November 2022

DevopsDays Detroit 2022 Sessionize Event

August 2022 Detroit, Michigan, United States

DevOpsDays Austin 2022 Sessionize Event

May 2022 Austin, Texas, United States

2021 All Day DevOps Sessionize Event

October 2021

Deserted Island DevOps 2021 Sessionize Event

April 2021

2020 All Day DevOps Sessionize Event

November 2020

RedisConf 2020 Takeaway (Online) Sessionize Event

May 2020 San Francisco, California, United States

DevOpsDays Austin 2020 Sessionize Event

May 2020 Austin, Texas, United States

Domain-Driven Design Europe 2019 Sessionize Event

January 2019 Amsterdam, The Netherlands

DevOpsDays KC 2018 Sessionize Event

October 2018 Kansas City, Missouri, United States

Tech Con 2018 Sessionize Event

September 2018 Detroit, Michigan, United States
