Speaker

Shachar Azriel

VP Product @ Baz

Tel Aviv, Israel

For the past decade, I’ve helped startups and mid-sized tech companies scale teams, establish systems, and launch products that stick.
Today, I’m VP of Product at Baz, where we’re on a mission to reinvent code review with AI: making it faster, smarter, and (believe it or not) more fun for developers.

We’re working at the bleeding edge of technology, facing unique challenges that many other product and development teams are only beginning to encounter. That’s why I regularly share real stories from building AI-powered features in the wild, and from the journey of building an AI-driven company itself.

Beyond the product, I love connecting with people. I co-founded the AI-Dev community in Israel, dedicated to accelerating innovation in AI coding and product development.

Area of Expertise

  • Information & Communications Technology

Topics

  • Product Management
  • AI & product management
  • DevTools
  • AI Agents

5 things I wish I hadn’t done building my AI agent

Most talks about AI agents focus on success stories and best-case outcomes. This talk is about what can actually go wrong when you ship and scale an AI agent in a startup.

Over the past 18 months, our team at Baz built and scaled an AI-powered Code Review Agent used daily by thousands of developers across the world.
To move fast in this crazy market, we made several architectural, product, and UX decisions that seemed reasonable at the time but later turned into expensive mistakes. Some cost us users, and some hit our precious revenue.

In this session, I’ll share five concrete pitfalls we encountered while building a real AI coding agent, why they happened, how we detected them, and the pivots that ultimately worked.

This is not a theoretical talk: every example comes from a production system, and will include real system diagrams, usage data, and how the fixes changed behavior in production (alongside a lot of self-humor :).

1. We built a "smarter" agent, and it got "dumber"
Why adding more context, tools, and responsibilities reduced accuracy instead of improving it

2. We let users choose the model, and lost control of the results
How exposing LLM choice destroyed consistency and meaningful feedback

3. We optimized for an AI app, not for developer behavior
Why real adoption only starts when the agent lives where decisions are already being made (GitHub, GitLab, or the IDE)

4. Our guardrails worked, until the providers changed the models
How silent model updates broke engineering assumptions and eroded user trust

5. Our metrics looked great, but users were still churning
Why industry-standard AI metrics (like accepted suggestions and time-to-merge) missed the signal that actually won (or lost) customers

Executable Specs: Building a Verification Layer for Agentic Coding

Agentic coding shifts the engineering bottleneck from implementation to verification. As agents generate more code, teams struggle to ensure that output aligns with product intent, design constraints, and cross-team contracts, especially at scale.

This session explores a spec-driven verification architecture for AI-native development. Instead of treating specifications as static documentation, we turn them into structured, machine-readable context that agents can consume and be verified against.

We’ll go deep into three technical layers:

1. Spec Ingestion from Ticketing Systems
How to retrieve structured requirements from systems like Jira or Linear, normalize them into machine-readable artifacts, and bind them to pull requests and agent workflows. We’ll discuss parsing strategies, schema design, and avoiding ambiguity in loosely written tickets.
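The normalization step described here can be sketched in a few lines. The ticket payload shape and the must/should heuristic below are illustrative assumptions for the abstract, not a production schema:

```python
# A minimal sketch of spec ingestion: turn a loosely written ticket body
# into a machine-readable artifact. The "key"/"description" fields and the
# must/should heuristic are invented for illustration; real Jira/Linear
# payloads and parsing strategies will differ.
from dataclasses import dataclass, field

@dataclass
class SpecArtifact:
    """A normalized requirement set that can be bound to a pull request."""
    ticket_id: str
    requirements: list = field(default_factory=list)
    ambiguous: list = field(default_factory=list)  # lines needing human review

def normalize_ticket(ticket: dict) -> SpecArtifact:
    """Split free-text ticket lines into checkable vs. ambiguous items."""
    artifact = SpecArtifact(ticket_id=ticket["key"])
    for line in ticket.get("description", "").splitlines():
        line = line.strip("- ").strip()  # drop bullet markers and whitespace
        if not line:
            continue
        # Heuristic: imperative "must"/"should" lines become requirements;
        # everything else is flagged as ambiguous rather than silently dropped.
        if line.lower().startswith(("must ", "should ")):
            artifact.requirements.append(line)
        else:
            artifact.ambiguous.append(line)
    return artifact

ticket = {
    "key": "PROJ-123",
    "description": "- Must return 404 for unknown ids\n- Probably cache results",
}
spec = normalize_ticket(ticket)
```

Flagging vague lines instead of discarding them is the point of the schema: ambiguity in loosely written tickets becomes visible and routable, rather than a silent gap in verification.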

2. Design Verification via Figma MCP
How to inspect design constraints programmatically using Figma through MCP-based integrations. We’ll cover extracting spacing, typography, color tokens, and layout rules, and validating UI implementations against design intent, beyond visual regression testing.
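At its core, this validation reduces to diffing extracted properties against the design's tokens. The token names and values below are invented for illustration; a real integration would pull them from Figma via MCP:

```python
# A hedged sketch of design verification: compare properties extracted from
# an implementation against design tokens. Token names/values are examples,
# not actual Figma output.
def verify_against_tokens(implementation: dict, tokens: dict) -> list:
    """Return (property, expected, actual) tuples for every mismatch."""
    mismatches = []
    for prop, expected in tokens.items():
        actual = implementation.get(prop)  # None if the property is missing
        if actual != expected:
            mismatches.append((prop, expected, actual))
    return mismatches

design_tokens = {"spacing/md": "16px", "color/primary": "#1A73E8"}
rendered = {"spacing/md": "16px", "color/primary": "#1a73e8"}  # case drift
issues = verify_against_tokens(rendered, design_tokens)
```

Even a trivial diff like this catches the token-drift class of bugs (here, a casing mismatch) that visual regression testing tends to miss, because it checks intent rather than pixels.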

3. Secure Execution with Isolated Sandboxes (AWS Agent Core as an example)
How to spin up ephemeral, permission-bounded sandboxes for runtime verification of agent output. We’ll examine isolation strategies, environment scoping, observability hooks, and how to safely execute verification logic without expanding blast radius.
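The sandbox idea can be sketched with nothing but the standard library. This is a simplified stand-in for illustration; a production setup (e.g. AWS Agent Core) provides real isolation, not just a clean environment and a timeout:

```python
# A simplified sketch of an "ephemeral, permission-bounded" verification run:
# throwaway working directory, scrubbed environment, bounded runtime.
import os
import subprocess
import sys
import tempfile

def run_verification(code: str, timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Execute verification logic in a throwaway dir with a minimal env."""
    with tempfile.TemporaryDirectory() as workdir:  # ephemeral filesystem scope
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
            cwd=workdir,
            env={"PATH": os.defpath},            # drop inherited secrets/creds
            capture_output=True,
            text=True,
            timeout=timeout,                     # bound the blast radius in time
        )

result = run_verification("print('spec check passed')")
```

The three knobs shown (filesystem scope, environment scope, timeout) correspond to the isolation, environment-scoping, and blast-radius concerns above, each hardened far more aggressively in a real sandbox.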

Along the way, we’ll share failure modes and scaling lessons from real-world implementations (with a lot of self-humor and users’ reactions :), including what breaks when specs are vague, when design tokens drift, and when verification isn’t treated as a first-class concern.

This session brings a fresh, field-tested perspective from building and operating a spec-driven verification architecture in real engineering environments.

WeAreDevelopers World Congress 2026 - Europe (upcoming)

July 2026 Berlin, Germany

AI Native DevCon London 2026 (upcoming)

June 2026 London, United Kingdom
