Session

Can LLMs finally solve testing?

UI testing for Power Platform has always been painful: limited tooling, heavy upfront effort, constant maintenance, and tests that break with every form change. This session shows a different approach: using LLMs to generate stable, maintainable test code from natural language scenarios.

Why UI-level tests instead of unit tests? When AI writes both the code and its tests, it tends to adjust the two together until everything passes, producing green checkmarks without real validation. Simulating user behavior is the honest feedback loop: tests describe what users expect, not what the code happens to do.

The workflow: an LLM with the Playwright MCP server explores your app, identifies testable scenarios, and generates Gherkin test plans. A GitHub Copilot agent converts those plans into Playwright code that runs in your pipeline without any AI involvement, so execution is deterministic and incurs no inference costs. When tests fail, agent mode analyzes the failures and proposes fixes.
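As a rough sketch of what that generated output might look like (the app URL, entity name, and field labels here are hypothetical; the real ones would come from the LLM's exploration of your app), a Playwright test in TypeScript could be as simple as:

```typescript
import { test, expect } from '@playwright/test';

// Generated from a Gherkin scenario such as:
//   Scenario: Create a new account
//     Given I am on the Accounts list
//     When I create an account named "Contoso"
//     Then "Contoso" appears in the account list
test('create a new account', async ({ page }) => {
  // Hypothetical app URL; in practice this would come from configuration
  await page.goto('https://yourorg.crm.dynamics.com/');

  await page.getByRole('link', { name: 'Accounts' }).click();
  await page.getByRole('button', { name: 'New' }).click();

  // Field label is an assumption; the agent would discover the real one
  await page.getByLabel('Account Name').fill('Contoso');
  await page.getByRole('button', { name: 'Save & Close' }).click();

  // The assertion targets user-visible behavior, not internal state
  await expect(page.getByRole('grid')).toContainText('Contoso');
});
```

Because the generated test is plain Playwright code, it runs on every commit like any other check, with no model in the loop.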

Gherkin scenarios are implementation-agnostic: they describe behavior, not selectors. That keeps them stable when the UI changes and lets them serve a dual purpose: executable specifications that guide AI coding agents, and test definitions that validate the result. For new apps, you can generate scenarios from specifications before any code exists, turning AI-generated tests into quality gates for test-driven development.
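One way generated code can keep that mapping visible is to mirror each Gherkin step with a named Playwright step, so the test report reads like the scenario while the selectors stay contained. The scenario, labels, and helper below are illustrative, not taken from the session:

```typescript
import { test, expect, Page } from '@playwright/test';

// Each Gherkin step becomes a named Playwright step: the report reads
// like the behavior description, and selectors stay easy to swap.
test('Scenario: submit an expense claim', async ({ page }) => {
  await test.step('Given I am signed in as an employee', async () => {
    await signInAsEmployee(page); // hypothetical helper
  });

  await test.step('When I submit a claim for 120 EUR', async () => {
    await page.getByRole('button', { name: 'New claim' }).click();
    await page.getByLabel('Amount').fill('120');
    await page.getByRole('button', { name: 'Submit' }).click();
  });

  await test.step('Then the claim is listed as Pending approval', async () => {
    await expect(page.getByRole('row', { name: /Pending approval/ })).toBeVisible();
  });
});

// Hypothetical sign-in; a real suite would typically reuse Playwright's
// storageState rather than scripting the login flow in every test.
async function signInAsEmployee(page: Page): Promise<void> {
  await page.goto('https://yourorg.crm.dynamics.com/');
}
```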

Tomas Prokop

Microsoft MVP / Power Platform Architect

Prague, Czechia
