TL;DR

Anthropic says its engineering teams have run hundreds of Claude Code Skills and found that the most useful ones package instructions, scripts, references and guardrails into reusable folders. The company says verification Skills had the largest effect on output quality, though best practices are still developing.

Anthropic says its engineering teams have used hundreds of Claude Code Skills to turn repeated AI-agent instructions into reusable, versioned folders that can include scripts, references, templates and guardrails, a shift the company says can make agent work more consistent and easier to improve over time.

The development was described in “Lessons from building Claude Code: How we use skills”, a June 3, 2026 Claude blog post by Thariq Shihipar, a Claude Code engineer. According to the write-up, a Skill is not just a saved prompt in Markdown; it is a folder an agent can discover, read and run.

Anthropic says a Skill can include a root SKILL.md file, deeper reference material, executable scripts, reusable assets, configuration files, hooks and memory. The company’s model is based on progressive disclosure: the agent reads the root instructions first, then pulls in more detailed material only when the task calls for it.
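The read-the-root-first pattern can be sketched in a few lines of Python. This is an illustration of progressive disclosure as the post describes it, not Claude Code's actual loader; the `SkillFolder` class and its method names are hypothetical, and the sketch assumes reference material lives in a `references/` subfolder.

```python
from pathlib import Path


class SkillFolder:
    """Illustrative view of a Skill folder (hypothetical, not Claude Code's loader)."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def root_instructions(self) -> str:
        # Step 1: only the root file is read up front.
        return (self.root / "SKILL.md").read_text(encoding="utf-8")

    def available_references(self) -> list[str]:
        # Step 2: list deeper material by name without reading its contents.
        ref_dir = self.root / "references"
        if not ref_dir.is_dir():
            return []
        return sorted(p.name for p in ref_dir.iterdir() if p.is_file())

    def load_reference(self, name: str) -> str:
        # Step 3: read one reference only when the task calls for it.
        if name not in self.available_references():
            raise FileNotFoundError(f"no reference named {name!r}")
        return (self.root / "references" / name).read_text(encoding="utf-8")
```

The design point is that listing file names is cheap while reading them is not, so the detailed material costs nothing until a task actually needs it.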

After cataloging its internal Skills, Anthropic said they fell into nine categories: library and API references, product verification, data fetching and analysis, business-process automation, code scaffolding and templates, code quality and review, CI/CD and deployment, runbooks, and infrastructure operations. Anthropic said verification Skills, which check work rather than produce it, had the largest measured effect on output quality.

At a glance

When: Anthropic blog published June 3, 2026;…

The development: Anthropic published lessons from using hundreds of Claude Code Skills across its engineering organization, arguing that Skills work as reusable folders rather than saved prompts.

AI Dispatch · Insights · 1 July 2026

A Skill is a folder, not a prompt

Anthropic published what it learned running hundreds of Skills across its own engineering org. Read as a business memo, the point is bigger than a coding trick: this is how ad-hoc prompting becomes durable institutional capability — the SOPs your agents actually follow, versioned and shared.

✕ The misconception

“A Skill is just a clever markdown prompt you save in a file.”

✓ What it actually is

A folder the agent can discover, read & run — instructions, scripts, references, templates, config & on-demand hooks.

Anatomy of a Skill — the file system is context engineering

my-skill/: the unit you share & version

├─ SKILL.md: root instructions + a description written for the model (its trigger)

├─ references/: deep detail pulled in only when needed (progressive disclosure)

├─ scripts/: real code, so the agent composes instead of rebuilding boilerplate

├─ assets/: templates & files to copy into the output

├─ config.json: setup the agent asks for if it’s missing (e.g. which Slack channel)

└─ hooks + memory: on-demand guardrails + an append-only log so it remembers

Why it matters: the folder itself is the knowledge base. The agent reads the root, then reaches deeper only when the task demands it — the same way you’d hand a new hire a one-pager that points to the detailed docs.
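For teams that want to try the layout, a minimal scaffolding sketch in Python might look like the following. The `scaffold_skill` helper is hypothetical, and the `name`/`description` frontmatter in the template is an assumption about the SKILL.md format that should be checked against the Skills documentation. The sketch deliberately leaves `config.json` uncreated, since the anatomy above describes it as setup the agent asks for when it is missing.

```python
from pathlib import Path

# Assumed SKILL.md shape: frontmatter with a name and a model-facing description.
SKILL_MD_TEMPLATE = """---
name: {name}
description: {description}
---

# {name}

## Gotchas

- (record hard-won failure cases here)
"""


def scaffold_skill(parent: Path, name: str, description: str) -> Path:
    """Create an empty Skill folder with the subfolders named in the anatomy."""
    root = Path(parent) / name
    for sub in ("references", "scripts", "assets"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    skill_md = root / "SKILL.md"
    if not skill_md.exists():  # never overwrite instructions someone already wrote
        skill_md.write_text(
            SKILL_MD_TEMPLATE.format(name=name, description=description),
            encoding="utf-8",
        )
    return root
```

Calling `scaffold_skill(Path("skills"), "release-check", "Use when verifying a release candidate")` would create the folder under `skills/`; both the path and the description here are placeholders.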

The nine types — a gap-analysis map for your own library

1. Library / API reference
2. Product verification ★ top impact
3. Data fetching & analysis
4. Business-process automation
5. Code scaffolding & templates
6. Code quality & review
7. CI/CD & deployment
8. Runbooks
9. Infrastructure operations

By Anthropic’s own measurement, verification Skills — the ones that check the work — moved output quality the most. If you build one category well, build that one.
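The source does not say what Anthropic's verification Skills actually check, so the following Python sketch only illustrates the general shape such a script could take: small named checks run against a description of the work, with every failure reported. The `change` dictionary and its `test_files` and `build_status` fields are hypothetical.

```python
from typing import Callable, Optional

# A check inspects the work and returns an error message, or None if it passes.
Check = Callable[[dict], Optional[str]]


def check_has_tests(change: dict) -> Optional[str]:
    if not change.get("test_files"):
        return "no test files were added or updated"
    return None


def check_build_passed(change: dict) -> Optional[str]:
    status = change.get("build_status")
    if status != "passed":
        return f"build status is {status!r}, expected 'passed'"
    return None


def verify(change: dict, checks: list[Check]) -> list[str]:
    """Run every check and return all failure messages (empty list = verified)."""
    failures = []
    for check in checks:
        message = check(change)
        if message is not None:
            failures.append(f"{check.__name__}: {message}")
    return failures
```

Running every check instead of stopping at the first failure gives the agent the full list of problems to fix in one pass.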

The craft — what separates a good Skill from a useless one

Gotchas = highest-signal section

Describe for the model, not humans (it’s the trigger)

Don’t state the obvious

Ship scripts, not just prose

On-demand guardrail hooks (/careful, /freeze)

Let it remember (log / SQLite)

Don’t railroad: leave room to adapt
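The "let it remember" tip could be implemented in many ways. One minimal sketch uses Python's built-in `sqlite3` module as an append-only log; the `memory` table, its columns and the three helper functions are assumptions for illustration, not a format taken from Anthropic's post.

```python
import sqlite3
from datetime import datetime, timezone


def open_memory(path: str) -> sqlite3.Connection:
    """Open (or create) the Skill's memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memory ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "recorded_at TEXT NOT NULL, "
        "note TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def remember(conn: sqlite3.Connection, note: str) -> None:
    # Append-only by convention: this module only ever INSERTs rows.
    conn.execute(
        "INSERT INTO memory (recorded_at, note) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), note),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, limit: int = 5) -> list[str]:
    """Return the most recent notes, newest first."""
    rows = conn.execute(
        "SELECT note FROM memory ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]
```

A Skill script might call `remember(conn, "staging deploys fail when the cache is cold")` after hitting a problem and `recall(conn)` at the start of the next run; the note text is only an example.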

The take

The knowledge of how your organization actually operates can be captured, versioned, shared & executed — and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.

Source: “Lessons from building Claude Code: How we use skills,” Thariq Shihipar (Anthropic), Claude blog, 3 June 2026. Categories, examples & measured claims are Anthropic’s; framing is the author’s. Docs: code.claude.com/docs/en/skills.

thorstenmeyerai.com

Reusable Instructions Become Team Assets

The report matters because it frames AI-agent guidance as operational infrastructure, not a set of one-off prompts. If the approach works as Anthropic describes, teams can package the way they review code, test products, deploy services or handle incidents into shared folders that agents can apply repeatedly.

That could affect how companies manage institutional knowledge. Instead of relying on private prompt habits, scattered wiki pages or repeated manual instructions, a team could maintain a versioned Skills library that changes as edge cases appear. Anthropic’s claim is that these units can become compounding assets, though the amount of maintenance required will vary by team.


Claude Code Skills Take Shape

Skills are part of the broader push to make coding agents more reliable in real engineering settings. The source material contrasts two approaches: repeatedly telling an agent how to behave each day, or capturing that knowledge once in a reusable package that can be shared and updated.

The Thorsten Meyer AI Dispatch, published July 1, 2026, interprets Anthropic’s post as a business memo as much as a developer guide. Its central reading is that Skills can function like standard operating procedures for agents, with documents, templates and code stored in the file system rather than buried in conversation history.

The most practical guidance in the source material is narrow: start with one Skill, include at least one hard-won gotcha, and give priority to the category that catches mistakes. Based on Anthropic’s reported measurement, that points many teams first toward verification workflows.


Skill Quality Still Varies

Several details remain open. The supplied material does not include a full breakdown of the hundreds of internal Skills, the exact measurement method behind Anthropic’s quality claims, or how results differed across teams and task types.

It is also unclear how easily the approach transfers to organizations with weaker documentation habits, strict security rules or fast-changing internal systems. The source material cautions that best practices are still developing, that checked-in Skills can add context cost, and that curation matters more than accumulation.
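One way to keep an eye on context cost is to audit the library periodically. The Python sketch below assumes a library directory with one subfolder per Skill, each holding a `SKILL.md`, and uses character count as a crude proxy for size. Which parts of a Skill actually count toward context depends on the tool; the source says only that checked-in Skills carry a context cost. The `skill_footprints` helper is hypothetical.

```python
from pathlib import Path


def skill_footprints(library: Path) -> list[tuple[str, int]]:
    """Return (skill name, SKILL.md size in characters), largest first."""
    sizes = []
    for skill_md in Path(library).glob("*/SKILL.md"):
        text = skill_md.read_text(encoding="utf-8")
        sizes.append((skill_md.parent.name, len(text)))
    # Sort by size descending, then by name so ties are ordered predictably.
    return sorted(sizes, key=lambda item: (-item[1], item[0]))
```

The largest entries at the top of the list are the natural candidates for trimming, for moving detail into reference files, or for retiring.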

Teams Test Smaller Libraries

The next step for readers is likely experimentation rather than wholesale adoption. Anthropic’s advice, as summarized in the source material, is to begin with one narrow Skill, add real scripts where possible, write triggers for the model rather than humans, and record known failure cases.

For engineering leaders, the near-term test is whether a Skills library reduces repeated instruction, improves review or verification quality, and stays maintainable as it grows. The clearest milestone will be whether teams can show measurable gains from specific Skills, especially in product verification and code-quality workflows.

Key Questions

What did Anthropic publish?

Anthropic published a June 3, 2026 Claude blog post by Thariq Shihipar describing how its engineering teams use Skills in Claude Code.

What is a Claude Code Skill?

According to Anthropic’s description, a Skill is a folder that can contain instructions, scripts, references, templates, configuration and hooks that an agent can read or run.

Which Skills had the biggest reported effect?

Anthropic’s reported measurement, as summarized in the source material, found that verification Skills had the largest effect on output quality.

Is this only useful for developers?

The immediate example is AI coding agents, but the broader claim is about capturing repeatable work processes as shared, versioned assets.

What remains unproven?

The supplied material does not show the full dataset, measurement method or outside replication. It remains unclear how well large Skill libraries will scale across different organizations.

Source: Thorsten Meyer AI

A Skill Is a Folder, Not a Prompt: What Anthropic Learned Running Hundreds of Them
