Skill Issue: How to Write Skills That Actually Work
You've written a dozen skills. Some work, some don't, and you have no way to tell which. The agent says "you're absolutely right" while invoking the wrong one, and you keep re-explaining the same things to it. Without a way to measure what a skill adds, there's no way to find out which is which. Meanwhile, the team next door is writing the same ones you already wrote, because nobody can find yours.
If you get why skills matter but can't get yours to do what you want, this session is for you: 301-level, no "what is a skill," straight to the craft.
Skills are context artifacts: prose for flexible guidance, scripts for deterministic work, rules for hard constraints. Let's make them better:
1. Grow a context artifact library that patches itself: new skills emerge from agent friction, existing ones get fixed when they drift.
2. Design evals that grade only what the context adds: scenarios that probe the contribution, rubrics that ignore everything else, guards against state bleed and prompt leakage (first sketch after this list).
3. Pair every change with a second-model reviewer that catches regressions before merge (second sketch below). Version the library so rollback costs a checkout, not a postmortem.
4. Treat skills like code: scan, sign, gate at install. But also treat them like prompts: add scanners for the attack surface conventional tooling can't see (third sketch below), like injection in the skill body, indirect poisoning through whatever the skill ingests, and tool-abuse paths that didn't exist before the agent had browsing tools.
5. Build a context artifact supply chain: registry, discovery, telemetry, staged rollout, so the team next door finds your skill instead of writing it for the third time. The same registry that solves discovery solves compliance: one push updates every agent in the org. Measure what proves reuse is real: installation and activation rates, because a skill nobody finds is a skill nobody uses (last sketch below).
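To make item 2 concrete, here is a minimal sketch of a contribution-only eval: run the same scenario with and without the skill installed, and grade only the delta. `run_agent`, the scenario, and the rubric are hypothetical stand-ins, not a real harness or API.

```python
# Minimal contribution-only eval sketch. run_agent, the scenario, and the
# rubric are hypothetical placeholders for your actual agent runtime.

SCENARIOS = [
    {"prompt": "Upgrade this Terraform module from 1.5 to 1.9.",
     "rubric": ["state migration", "provider version pin"]},
]

def run_agent(prompt: str, skills: list[str]) -> str:
    """Placeholder: invoke your agent runtime with the given skills installed."""
    raise NotImplementedError

def grade(output: str, rubric: list[str]) -> float:
    """Naive keyword rubric; in practice, an LLM judge that sees ONLY the rubric."""
    return sum(item.lower() in output.lower() for item in rubric) / len(rubric)

def skill_delta(skill: str) -> float:
    """Score the skill by what it adds: with-skill minus without-skill.
    Each condition gets a fresh run, which guards against state bleed."""
    deltas = []
    for s in SCENARIOS:
        baseline = grade(run_agent(s["prompt"], skills=[]), s["rubric"])
        treated = grade(run_agent(s["prompt"], skills=[skill]), s["rubric"])
        deltas.append(treated - baseline)
    return sum(deltas) / len(deltas)
```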
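For item 3, one way to wire the second-model reviewer into CI, assuming an OpenAI-compatible client; the model choice, prompt, and APPROVE/BLOCK convention are illustrative, not a prescribed setup.

```python
# Second-model reviewer gate sketch. Assumes the openai Python client; the
# model name and APPROVE/BLOCK protocol are illustrative choices.
import subprocess
import sys

from openai import OpenAI

def skill_diff(path: str) -> str:
    """Diff the skill directory against main; this is all the reviewer sees."""
    return subprocess.run(["git", "diff", "origin/main", "--", path],
                          capture_output=True, text=True).stdout

def review(diff: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",  # deliberately a different family than the authoring agent
        messages=[{"role": "user", "content": (
            "Review this skill diff for regressions: weakened constraints, "
            "broadened triggers, removed guardrails. Reply APPROVE or BLOCK, "
            "then your reasons.\n\n" + diff)}],
    )
    return resp.choices[0].message.content or ""

if __name__ == "__main__":
    verdict = review(skill_diff("skills/"))
    print(verdict)
    sys.exit(0 if verdict.startswith("APPROVE") else 1)  # non-zero blocks the merge
```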
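And for item 4, a toy version of a prompt-side scanner: conventional SAST won't flag an instruction override sitting in a markdown skill body, but a few targeted patterns will. The ruleset here is illustrative, nowhere near complete.

```python
# Toy prompt-side scanner. The patterns are illustrative examples of the
# attack surface, not a production ruleset.
import pathlib
import re
import sys

SUSPECT = [
    (r"ignore (all |any )?previous instructions", "instruction override"),
    (r"curl\s+\S+\s*\|\s*(ba)?sh", "pipe-to-shell"),
    (r"do not (tell|inform) the user", "covert behavior"),
]

def scan(path: pathlib.Path) -> list[str]:
    text = path.read_text(encoding="utf-8", errors="replace")
    return [f"{path}: {label}" for pattern, label in SUSPECT
            if re.search(pattern, text, re.IGNORECASE)]

if __name__ == "__main__":
    findings = [hit for p in pathlib.Path("skills").rglob("*.md") for hit in scan(p)]
    print("\n".join(findings) or "clean")
    sys.exit(1 if findings else 0)  # any finding fails the install gate
```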
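Finally, the reuse metrics from item 5, computed from registry telemetry. The event shape (skill and event fields) is an assumption about what your registry emits, not a standard format.

```python
# Reuse-metrics sketch. The event schema is an assumed shape for registry
# telemetry, not a standard.
from collections import Counter

def reuse_metrics(events: list[dict], fleet_size: int) -> dict[str, dict[str, float]]:
    """Installation rate says the skill was found; activations say it was used."""
    installs = Counter(e["skill"] for e in events if e["event"] == "install")
    activations = Counter(e["skill"] for e in events if e["event"] == "activate")
    return {skill: {
        "installation_rate": installs[skill] / fleet_size,
        "activations_per_install": activations[skill] / installs[skill],
    } for skill in installs}
```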
Every practice above is itself a skill. "This is how we write context artifacts around here" belongs in the library: versioned, installed on every agent, graded by its own eval. The meta-skill has to earn its tokens too.
New talk for 2026. Scales across formats: the 45-min session and 90-min hands-on stay pure 301 (attendees should already know what a skill is); 2- and 3-hour workshops expand the scope to 101→301, opening with the fundamentals (what skills are, how triggers work, installing and invoking them) before the 301 material.
Code-heavy and low-philosophy, targeted at engineers already using agents daily who want to stop repeating themselves. Solo talk.
Baruch Sadogursky
Member of DevRel Staff, Tessl AI
Nashville, Tennessee, United States