Speaker

Maxim Schepelin

Engineering leader at Booking.com

Amsterdam, The Netherlands

Maxim Schepelin has been an engineering leader in various industries, including game development, e-commerce, and travel, building teams capable of launching products from idea to world-scale. With over a decade of experience leading engineering teams, Maxim has mastered the art and science of management, as well as how to exit both Vim and Emacs.
Maxim continues exploring ways to build effective engineering teams that deliver value at a sustainable pace.

Area of Expertise

  • Business & Management
  • Information & Communications Technology
  • Media & Information
  • Physical & Life Sciences

Topics

  • Software Development
  • Leadership
  • Software Engineering Management
  • Management
  • Site Reliability Engineering (SRE)
  • Agile Methodologies
  • Software Delivery
  • Organizational Design
  • Team Management
  • Engineering Culture & Leadership
  • Software Engineering

How to set SLOs, drive improvements, and make friends with business stakeholders

Make reliability a shared priority, not just tech-speak. This session shows you how to frame SLOs in business terms, engage stakeholders, and use clear metrics to align technical and business priorities.

The outline of the talk:
- Engineers care about reliability, and we have developed a language to talk about it
- We count the nines (9s) of availability, define error budgets, measure MTTR and MTBF
- Why, then, is it so hard to convince our business counterparts to invest in technical improvements?
- Because our language sounds intimidating and disconnected from reality.
- We fail to explain the actual value of reliability
- The key question of reliability:
- How much does it cost when your service is down for one hour?
- Don't ask how many nines of availability a service should have; ask how much cost is acceptable.
- SLO formula
- X must be true Y percentage of the time
- X is your definition of success
- Y is your threshold
- Two levels at which you can measure success:
- Technical level: A service is running, DB is working, API returns a 200 status code.
- Business level: The business process is working. 99.9% of transfers are successful, 99% of reports are generated within 30 seconds, etc.
- Aim to define SLOs at the business level.
- From measuring to prioritization
- Benefits of measuring SLO on the business level:
- You know the costs of outages
- You know the cost of bad architecture
- You know the cost of slow processes
- Your data points are facts from the past
- Business plans and new features are guesswork about the future
- It's easier to talk about priorities when your numbers are solid.
- You're using the same units to compare tech improvements and features.
- Use error budgets to drive improvement
- Review how your systems perform against their SLOs.
- If your SLO is 99.9%, you allow yourself to fail in 0.1% of cases. This is your error budget.
- What do you do when you exceed the budget?
- Code freeze
- Prioritize immediate improvements to recover reliability.
- Conduct postmortems
- "Do better next time" is not a strategy.
- Make reliability a first-class citizen.
- Report SLOs together with business metrics.
- Remind your stakeholders that availability is your most important feature.
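
The SLO formula and the error budget from the outline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not material from the talk: the names `Slo`, `error_budget`, and `budget_report`, the event counts, and the cost per failed event are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """'X must be true Y percent of the time' for one business process."""
    name: str      # X: the definition of success, e.g. "transfer completed"
    target: float  # Y: the threshold as a fraction, e.g. 0.999 for 99.9%

    def __post_init__(self) -> None:
        # A target of 0 or 1 leaves no meaningful error budget to reason about.
        if not 0.0 < self.target < 1.0:
            raise ValueError("target must be strictly between 0 and 1")


def error_budget(slo: Slo, total_events: int) -> float:
    """Number of events that may fail before the SLO is breached."""
    return (1.0 - slo.target) * total_events


def budget_report(slo: Slo, total_events: int, failed_events: int,
                  cost_per_failure: float) -> dict:
    """Compare observed failures with the budget and express them as a cost.

    cost_per_failure is an assumed business figure (money lost per failed
    event); it is what lets reliability work be compared with features.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed = error_budget(slo, total_events)
    return {
        "achieved": 1.0 - failed_events / total_events,
        "allowed_failures": allowed,
        "budget_consumed": failed_events / allowed,  # 1.0 means fully spent
        "budget_exceeded": failed_events > allowed,
        "cost_of_failures": failed_events * cost_per_failure,
    }


if __name__ == "__main__":
    # Hypothetical month: one million transfers, 1,500 of them failed.
    transfers = Slo(name="transfer completed", target=0.999)
    report = budget_report(transfers, total_events=1_000_000,
                           failed_events=1_500, cost_per_failure=12.5)
    for key, value in report.items():
        print(f"{key}: {value}")
```

With a 99.9% target the budget is 0.1% of events, so the hypothetical month above overspends it. In the outline's terms, that is the point at which a code freeze and immediate reliability work come into play. Putting a cost next to the budget is one possible way to report SLOs in the same units as business metrics.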

So, what is your team going to do in the next six months?

Avoid common objective-setting pitfalls. Learn proven techniques to define clear, impactful goals for the team, align stakeholders, and make impact.

Leading with Reliability: Applying SRE Principles to Build Stronger Engineering Organizations

Site Reliability Engineering (SRE) has long been the discipline responsible for keeping complex systems healthy, resilient, and predictable under pressure. But the real power of SRE lies not just in the tools, dashboards, or operational frameworks; it lies in its philosophy: focusing on what matters most, measuring the right things, and making intentional trade-offs.

As engineering leaders, we can apply these principles far beyond production environments. This talk explores how core SRE concepts can become high-leverage leadership tools for shaping team culture, guiding prioritization, and driving meaningful business outcomes.

We begin with service criticality, expanding the traditional technical lens to view the entire end-to-end customer journey. Instead of assessing components in isolation, we’ll explore how to map dependencies across teams and systems to surface the true bottlenecks and organizational weak points that impact users.

From there, we’ll look at Service-Level Indicators (SLIs) and reinterpret them at the business level. What does “reliability” mean when framed through customer expectations rather than CPU metrics? How can engineering leaders define measurable signals that reflect whether the product is delivering on its intended value?

Next, we’ll dig into Service-Level Objectives (SLOs)—not as uptime percentages, but as promises to customers. We'll discuss how leaders can craft SLOs that articulate what “good enough” looks like for the business, and how these objectives guide healthier conversations around trade-offs, investment, and risk.

Finally, we’ll explore error budgets as a strategic leadership mechanism. Error budgets offer a structured way to balance innovation and stability, negotiate between delivery teams and product, and make aligned decisions about when to push forward and when to fix foundational issues.

Attendees will leave with a toolkit for adopting SRE thinking at the organizational level—helping them connect engineering decisions to business impact, create a culture of reliability, and lead teams that deliver value with clarity and confidence.

Site Reliability Engineering offers powerful frameworks for managing system health—but these ideas don’t have to stay confined to production. This talk shows engineering leaders how to translate SRE concepts such as service criticality, SLIs/SLOs, and error budgets into organizational tools that improve decision-making, clarify priorities, and strengthen alignment between engineering and the business.
