Shipyard | Write automated tests with Claude Code using Playwright Agents

Learn how to use Playwright's specialized testing agents with Claude Code to write better automated tests. The agents are defined in Markdown, so you can customize them as you see fit.

Claude Code is tried and true for feature work, but there’s still some (understandable) doubt about using it to write tests. With its new subagents feature, you can get higher-quality automated tests, since subagents can act as subject matter experts in their specific domains.

Playwright now ships with three specialized agents (the planner, generator, and healer) that improve Claude Code’s baseline test engineering capabilities.

These subagents are built specifically for test automation. They can explore your application on their own and even fix their own mistakes. Under the hood, they’re just defined in Markdown, so you can tweak them to fit your app’s needs.

Playwright’s testing agents

You can use Playwright’s agent trio in sequence to design, write, review, and correct automated tests. Each agent has a specific role in the testing pipeline and its own area of expertise. Playwright agents communicate through structured artifacts, which helps them maintain context over time.

The Planner

The planner agent explores your application and produces a Markdown test plan. You can feed it a simple seed test and a request like “Generate a plan for the checkout flow,” and it’ll navigate your app on its own, finding user paths and edge cases you might have missed.

The planner works through scenarios like a QA engineer would (vs. clicking buttons randomly). The output is a Markdown file that’s precise enough for the generator agent to work off of, and clear enough for stakeholder review.

The Generator

The generator agent transforms Markdown plans into Playwright tests. Instead of direct “translation” of natural language to code, the agent actively interacts with your application to make sure every selector works and the assertions make sense.

The generator “knows” a solid amount about Playwright best practices. It uses semantic locators, implements proper waiting strategies, and keeps tests readable and easy to maintain. Tests align with specs wherever possible, making it easy to trace requirements through to implementation.
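To make “semantic locators” concrete, here is a minimal sketch of the kind of steps a generated checkout test might contain. It is not actual generator output: `PageLike`, `LocatorLike`, `submitCheckout`, and `createRecordingPage` are hypothetical names, and the "Email" label and "Place order" button are invented. The two interfaces only mirror the names of Playwright’s `getByLabel` and `getByRole` methods so the steps can run without a browser.

```typescript
// Minimal stand-ins that mirror the names of Playwright's locator methods.
// These interfaces are assumptions for illustration, not Playwright's own types.
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}

interface PageLike {
  getByLabel(text: string): LocatorLike;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
}

// Steps a generated test might contain: elements are located by form label
// and accessible role instead of CSS classes or DOM position.
async function submitCheckout(page: PageLike, email: string): Promise<void> {
  await page.getByLabel("Email").fill(email);
  await page.getByRole("button", { name: "Place order" }).click();
}

// A fake page that records every action, so the steps can run without a browser.
function createRecordingPage(): { page: PageLike; log: string[] } {
  const log: string[] = [];
  const locator = (description: string): LocatorLike => ({
    fill: async (value) => {
      log.push(`${description} fill "${value}"`);
    },
    click: async () => {
      log.push(`${description} click`);
    },
  });
  const page: PageLike = {
    getByLabel: (text) => locator(`label "${text}"`),
    getByRole: (role, options) => locator(`${role} "${options?.name ?? ""}"`),
  };
  return { page, log };
}
```

In a real spec, the `page` fixture from `@playwright/test` would take the place of the fake, and assertions such as `await expect(locator).toBeVisible()` would wait for the UI to settle instead of sleeping for a fixed time.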

The Healer

When a test fails, the healer agent replays the failing steps, inspects the current UI to locate equivalent elements or flows, suggests a patch (e.g. a locator update or wait adjustment), and re-runs the test until it passes or guardrails stop the loop.

A couple of examples:

Changed button text → healer finds the new selector
Added a loading spinner → healer adjusts the wait strategy

Instead of your tests breaking with every feature change, the healer agent helps adapt them to the updated app.
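The repair loop described in this section can be sketched as a bounded retry. This is a minimal illustration, not the healer’s actual implementation: `healUntilPass`, `HealHooks`, and the `maxAttempts` guardrail are hypothetical names, and the hooks are synchronous only to keep the sketch short.

```typescript
// Outcome of one test run; `error` carries the failure message when it fails.
type RunResult = { passed: boolean; error?: string };

// Hypothetical hooks: how the test is run and how a patch gets applied.
interface HealHooks {
  runTest: () => RunResult;
  // Applies a fix (e.g. a locator update) and returns a short description of it.
  applyPatch: (error: string) => string;
}

interface HealOutcome {
  healed: boolean;
  attempts: number; // number of test runs performed
  patches: string[]; // descriptions of the patches applied, in order
}

// Run the test, patch on failure, and re-run until it passes or the
// attempt budget (the guardrail) is used up.
function healUntilPass(hooks: HealHooks, maxAttempts = 3): HealOutcome {
  const patches: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = hooks.runTest();
    if (result.passed) {
      return { healed: true, attempts: attempt, patches };
    }
    // Guardrail: no further patching once the last allowed run has failed.
    if (attempt === maxAttempts) break;
    patches.push(hooks.applyPatch(result.error ?? "unknown failure"));
  }
  return { healed: false, attempts: maxAttempts, patches };
}
```

Capping the number of runs is one simple guardrail; a test that still fails after the budget is spent is returned as unhealed so a person can decide whether the test or the app is wrong.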

The best Claude Code subagents for testing

The best part of Playwright Agents is that they’re just custom Claude Code subagents, so they’re easy to edit and customize. Each agent definition is a collection of instructions and MCP tools provided by Playwright. The definitions are plain Markdown files in your project, which means you can:

Adjust the planner’s exploration strategy with sample user stories for your particular app
Customize the generator’s code style to fit naming conventions, comment patterns, etc.
Fine-tune the healer’s fix strategies based on common failure patterns

To get these agents set up with Claude Code, you can run:

npx playwright init-agents --loop=claude

Regenerate these definitions whenever you update Playwright, so they pick up new tools and instructions.

Add agentic testing to your agentic dev loop

With Playwright’s agents, you can get better automated tests out of Claude Code, thanks to prompts that encode QA engineering patterns. Add them to your subagent workflow for smarter, more customized test generation than base Claude Code provides.

These subagents, along with the Playwright MCP server, are excellent, convenient ways to QA your software within the Claude Code loop.

Need somewhere to host those PRs, so you can visit them live + run your Playwright E2E tests against them? That’s where Shipyard comes in. Try it free for 30 days: build, test, and ship your software in half the time.