Apr 21, 2026

To Skill or Not to Skill

Skills have become a default pattern in major agent frameworks. Claude Code comes with a /skills folder. Semantic Kernel composes agents from skills, prompts, and plugins. CrewAI, LangGraph, and others have their own versions. As of early 2026, more than 85,000 public agent skills exist across 27 platforms. Amid all that hype, it is worth pausing to ask whether skills are actually the definitive approach.

The field is still forming

The AI tooling space is moving fast and the right patterns haven’t converged yet. If you use multiple agents — Claude Code, Codex, Copilot, or whatever else ships next quarter — each one has its own way of organizing agent knowledge. In my experience, most of what passes for architecture at this stage is provisional: things that work well enough today but will look different in a year.

And that’s actually exciting — the conventions are still soft enough to shape. We get to write the best practices. First draft, sure, but first drafts are where the interesting decisions happen.

However, the fragmentation is getting expensive. The MCP spec defines three core primitives — resources, prompts, and tools — but not skills. There’s an experimental “Skills Over MCP” working group exploring how to distribute skills through the protocol, but it hasn’t produced a standard yet. Meanwhile, each framework has its own skill format. A skillset written for Claude Code is not directly usable in Codex.

What skills are

A skill is a playbook — a set of instructions written once so your agent doesn’t have to rediscover them every time, and you don’t have to reprompt them every time and get slightly different results.

That’s useful, for sure, but it doesn’t mean playbooks need to be promoted to first-class primitives. Skills are defined differently across frameworks, there’s no universal format, and there’s no universal way to distribute them.

Disclosure

When you start a fresh session, your agent first loads its system prompt. Then it loads the skill descriptions into context, each one costing roughly 100 tokens of metadata. It loads your config file: AGENTS.md, CLAUDE.md, or whichever your framework uses. Only after that does the conversation begin.

When it hits a task that matches a skill description, it goes and reads the full playbook. The description is the index and the playbook is the chapter.

What you’re looking at is progressive disclosure — the ability to load a summary first and retrieve the details when they’re needed. All agents support progressive disclosure. Skills are simply wrappers around it.
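To make the index-and-chapter idea concrete, here is a minimal sketch of progressive disclosure in Python. Everything in it is an assumption for illustration: the `KnowledgeEntry` class, the function names, and the `release.md` file are hypothetical, and no framework is claimed to work exactly this way. It only shows the shape: summaries are loaded up front, full documents are read when a task calls for them.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class KnowledgeEntry:
    name: str          # hypothetical identifier for the document
    description: str   # short summary that is always in context
    path: Path         # full document, read only on demand


def build_context_header(entries):
    """Return the cheap summary lines loaded at session start."""
    return "\n".join(f"- {e.name}: {e.description}" for e in entries)


def load_details(entries, name):
    """Read the full document for one entry, only when a task calls for it."""
    for entry in entries:
        if entry.name == name:
            return entry.path.read_text(encoding="utf-8")
    raise KeyError(f"no knowledge entry named {name!r}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        playbook = Path(tmp) / "release.md"
        playbook.write_text("# Release playbook\n1. Tag\n2. Publish\n", encoding="utf-8")
        entries = [KnowledgeEntry("release", "How to cut a release", playbook)]
        print(build_context_header(entries))     # the index, loaded up front
        print(load_details(entries, "release"))  # the chapter, loaded on demand
```

Nothing here is specific to playbooks. The same two functions would serve an ontology or a golden set just as well, which is the point.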

The cost of special status

Beyond the distribution problem, what happens to your thinking when skills get special status?

When a framework has a prominent /skills folder, it becomes the default answer to everything. Need to handle a new task? Write a skill. Agent producing inconsistent output? Write a better skill. Complex multi-step workflow? Chain three skills together. The /skills folder becomes the golden hammer, and every problem starts looking like a nail.

Meanwhile, approaches that would actually work better for the problem at hand get ignored. Research on few-shot prompting shows that for formatting, classification, and pattern-matching tasks, a handful of well-chosen examples in the prompt can steer model behavior more effectively than lengthy procedural instructions — though for complex reasoning tasks, the picture is different and more structured guidance helps. Sometimes the answer is a short script that does the job deterministically, instead of wasting tokens with a prompt that might get the math wrong. Sometimes a state machine controlling the flow through code is safer than letting an agent reason its way through a sequence.
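As a rough illustration of the state-machine option, the sketch below keeps the sequence in code and lets each step report only success or failure. The step names, the `TRANSITIONS` table, and the `run` function are all hypothetical. A handler could call an agent or a plain function; either way, the order of steps is never left to the model.

```python
from enum import Enum, auto


class Step(Enum):
    FETCH = auto()
    VALIDATE = auto()
    PUBLISH = auto()
    DONE = auto()
    FAILED = auto()


# Only successful steps advance; anything else falls through to FAILED.
TRANSITIONS = {
    (Step.FETCH, True): Step.VALIDATE,
    (Step.VALIDATE, True): Step.PUBLISH,
    (Step.PUBLISH, True): Step.DONE,
}


def run(handlers, start=Step.FETCH):
    """Drive the workflow and return the list of states visited, in order."""
    state = start
    visited = [state]
    while state not in (Step.DONE, Step.FAILED):
        succeeded = bool(handlers[state]())
        state = TRANSITIONS.get((state, succeeded), Step.FAILED)
        visited.append(state)
    return visited


if __name__ == "__main__":
    handlers = {
        Step.FETCH: lambda: True,
        Step.VALIDATE: lambda: False,  # simulate a failed validation
        Step.PUBLISH: lambda: True,
    }
    print([s.name for s in run(handlers)])
```

With the failing validation above, the run stops before publishing. No amount of creative reasoning inside a handler can skip a step.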

The irony is that the more skills you load into an agent, the worse it actually performs. Research on skill selection shows that accuracy degrades non-linearly as library size grows, following a phase transition: performance stays high below a critical threshold, then drops sharply. A recent benchmark studying skill utility under realistic conditions found that performance gains from skills degrade consistently as settings become more challenging, with pass rates approaching no-skill baselines in the hardest scenarios.

Beyond playbooks

Step-by-step instructions for how to execute something are useful for consistency and reliability. But they are one shape among many. An ontology teaches the agent how your domain works (relationships, terminology, and logic), so the agent can extract knowledge from the information it is given. A golden set gives it examples of what good output looks like, so it matches your style and quality bar without needing to be told the rules. An anti-pattern catalog teaches it what to avoid, so it doesn’t repeat historical mistakes. A decision tree gives it diagnostic structure for faster problem-solving. Metacognitive instructions teach the agent how to regulate its own reasoning: when to stop, when to ask, when to doubt itself.

There are more, and the right ones depend on your project. When “skill” becomes the default primitive, all of these get flattened into playbooks. You end up writing step-by-step instructions for things that would be better expressed as examples, or cramming domain knowledge into a suboptimal procedural format. Promoting one pattern to a special status downgrades the rest.

Keep it simple

So what do these knowledge patterns actually need? Progressive disclosure and plain Markdown.

Write your playbooks, ontologies, golden sets, or whatever pattern fits, as documentation. Write an index that describes what each file does. Point your agent to the index at boot time via whatever config file it supports. The index can be nested, even cyclic.
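If you ever want to check such an index with a script, for example to list everything reachable from it, a small walker is enough. The sketch below is one possible approach, not a standard: it assumes the index uses ordinary Markdown links to relative `.md` files, and the `collect_index` function is a hypothetical name. A record of already-visited files is what makes nested and cyclic indexes safe to follow.

```python
import re
from pathlib import Path

# Matches [title](relative/path.md); anchors and external URLs are ignored.
LINK = re.compile(r"\[([^\]]+)\]\(([^)#\s]+\.md)\)")


def collect_index(index_path, found=None):
    """Return {resolved_path: link_title} for every file reachable from the index."""
    found = {} if found is None else found
    index_path = Path(index_path).resolve()
    found.setdefault(index_path, "(root index)")
    text = index_path.read_text(encoding="utf-8")
    for title, target in LINK.findall(text):
        linked = (index_path.parent / target).resolve()
        # Skip files already recorded: this is what breaks cycles.
        if linked in found or not linked.is_file():
            continue
        found[linked] = title
        collect_index(linked, found)  # a linked file may itself be an index
    return found
```

Calling `collect_index("docs/index.md")` (a hypothetical path) would return every knowledge file the agent can reach from that index, each visited once even if two indexes point at each other.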

Your knowledge files work everywhere now. Claude, Codex, Copilot, or any other player. They can all read Markdown. No vendor lock-in.

And distribution gets simpler too. Since your knowledge is plain files, it travels through the same channels your code already uses. It can be a shared Git repo referenced as a submodule across projects; an npm or pip package that bundles knowledge files alongside code, version-locked so they update together; or a shared folder in a monorepo. You can even serve them through MCP resources, which, despite their limitations, have one decisive advantage: they’re a standard, and they don’t require terminal access. Terminal access raises security considerations that matter in enterprise settings, and plenty of agents don’t support it.

What you lose

You do lose one thing: the /do-that-thing shortcut. Direct invocation by name, which is a real convenience.

However, MCP prompts can give you the same thing. And if you still want skill files in a framework that supports them, write thin ones: three lines of config that point to the shared Markdown file. The skill becomes a simple bookmark. The knowledge stays portable, and the substance lives in plain text where any agent can reach it.
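For a sense of how thin "thin" can be, here is a hedged sketch that renders such a bookmark. It assumes a skill file made of a short frontmatter block with `name` and `description` fields followed by a body, which is modeled loosely on the SKILL.md convention. Field names and file layout vary by framework, so treat the format as an assumption and check your framework's documentation. The `render_thin_skill` function and the example path are hypothetical.

```python
def render_thin_skill(name, description, knowledge_path):
    """Render a pointer-only skill file: metadata plus one line naming the real document."""
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n"
        f"Read {knowledge_path} and follow it.\n"
    )


if __name__ == "__main__":
    # Assumed layout: the shared knowledge lives under docs/knowledge/.
    print(render_thin_skill(
        "release",
        "How to cut a release",
        "docs/knowledge/release.md",
    ))
```

The skill file holds nothing worth migrating. If you switch frameworks, you regenerate the bookmarks and leave the Markdown untouched.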