11 Principles for Evaluating AI Dev Tools
Benchmarks measure narrow capabilities. Demos show best-case scenarios. Neither tells you whether AI-generated code will survive production or whether that shiny new tool deserves a place in your stack.
This talk presents a unified framework of 11 principles for evaluating AI-generated code and the tools that manage it. The two are inseparable: bad tools generate bad code, and weak code-evaluation processes never catch it.
We'll reframe AI adoption around responsibility boundaries, shifting the core question from "is it fast?" to "can this code be understood under pressure, safely changed, and defended to a stakeholder?"
Through real-world patterns from teams adopting AI across their SDLC, we'll apply these principles to distinguish tools that surface risk from tools that hide it.
Attendees will leave with a practical rubric to decide which AI tools to trust, which to constrain, and how to keep human judgment at the center of fast-moving, AI-augmented engineering.
Nnenna Ndukwe
Principal Developer Advocate at Qodo AI
Boston, Massachusetts, United States