Test Generation and Execution via CLI
Your PR is ready. The feature works, the code is clean, and your tech lead asks the question you were hoping to avoid: “Where are the tests?” You know you should have written them first. But TDD felt slow when you were in flow, and now writing tests after the fact feels like backfilling homework. The module has four dependencies that need mocking, the error paths are hard to trigger manually, and the existing test files use patterns you have not fully internalized yet.
Claude Code eliminates the friction. Point it at the code, tell it to match the existing test patterns, and it generates a test suite that covers the happy path, the error paths, and the edge cases you forgot about. Then it runs the tests after every change so you always know what is broken.
What You’ll Walk Away With
- A workflow for generating tests that match your project’s existing patterns
- The red-green-refactor loop powered by Claude running tests after every edit
- Prompts for coverage gap detection and targeted test generation
- Headless mode patterns for automated test generation in CI
Generating Tests That Match Your Patterns
The biggest problem with AI-generated tests is style mismatch: Claude’s default test style might not match yours. The fix is simple: show it your existing tests before asking it to write new ones.
Claude reads your existing tests, identifies the patterns (Jest vs Vitest, factory functions vs inline mocks, flat describe blocks vs nested describes), and generates new tests that look like they belong in the project.
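For instance, the factory-function pattern mentioned above might look like the following in a project’s shared test utilities. This is a hypothetical sketch: `createTestUser` and its fields are illustrative, not taken from any real codebase.

```typescript
// Hypothetical factory-function pattern for test data:
// sensible defaults, with per-test overrides merged on top.
interface TestUser {
  id: string;
  email: string;
  role: "member" | "admin";
  isActive: boolean;
}

function createTestUser(overrides: Partial<TestUser> = {}): TestUser {
  return {
    id: "user-1",
    email: "test@example.com",
    role: "member",
    isActive: true,
    ...overrides, // per-test customization wins over the defaults
  };
}

// Each test states only what it cares about:
const admin = createTestUser({ role: "admin" });
```

Once Claude has read a file like this, “follow the existing patterns” becomes a concrete instruction rather than a guess.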
The Red-Green-Refactor Loop
TDD with Claude Code is faster than traditional TDD because Claude handles the boilerplate while you focus on what the code should do.
1. Describe the behavior you want to implement

```
I need a function calculateShippingCost that takes an order
object and returns the shipping cost. Rules:
- Orders over $100 ship free
- Standard shipping is $5.99
- Express shipping is $14.99
- International orders add a $10 surcharge
- Weight over 50lbs adds $0.50 per additional pound
```

2. Have Claude write the tests first

```
Write tests for calculateShippingCost based on those rules.
Follow the patterns in tests/utils/pricing.test.ts.
Include edge cases: exactly $100 order, exactly 50lbs,
zero-weight digital goods, negative amounts (should throw).
Do NOT write the implementation yet.
```

3. Run the tests to confirm they fail

```
Run the tests. They should all fail since the function
doesn't exist yet. Show me the output.
```

4. Implement to make the tests pass

```
Now implement calculateShippingCost in src/utils/pricing.ts.
Make all the tests pass. Use the simplest implementation
that satisfies the tests.
```

5. Run tests again to confirm green

```
Run the tests. Show me which pass and which fail.
Fix any failures.
```

6. Refactor with the safety net in place

```
The implementation works but has some duplication. Refactor it
for clarity. Run tests after every change to make sure nothing
breaks.
```
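One plausible implementation of the rules from step 1 might look like this. This is a sketch, not Claude’s actual output: the `Order` shape and the assumption that free shipping waives the surcharges are my own reading of the rules.

```typescript
interface Order {
  subtotal: number;      // order value in dollars
  weightLbs: number;
  express: boolean;
  international: boolean;
}

function calculateShippingCost(order: Order): number {
  if (order.subtotal < 0 || order.weightLbs < 0) {
    throw new Error("subtotal and weight must be non-negative");
  }
  // Orders over $100 ship free (assumed to waive all surcharges).
  if (order.subtotal > 100) return 0;

  let cost = order.express ? 14.99 : 5.99;
  if (order.international) cost += 10;       // flat surcharge
  if (order.weightLbs > 50) {
    cost += (order.weightLbs - 50) * 0.5;    // $0.50 per pound over 50
  }
  return cost;
}
```

Note that “exactly $100” is a boundary the tests from step 2 pin down: with `>`, a $100 order still pays shipping, and a generated edge-case test makes that choice explicit instead of accidental.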
Finding and Filling Coverage Gaps
Most projects have unevenly distributed test coverage. Critical business logic might sit at 30% while a trivial utility function is at 100%. Claude can find and fill these gaps.
Targeted coverage improvement
When you need to improve coverage for a specific module:
```
Read src/services/payment.service.ts and the coverage report
for it. Show me which lines and branches are not covered.

Then write tests that cover the uncovered paths. Focus on:
- Error handling branches (catch blocks, validation failures)
- Conditional logic that is only tested for the true case
- Async paths where the promise rejects
```

Automated Test Generation with Hooks
Set up a hook so the tests run automatically after every change:
```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "npm test -- --run --reporter=verbose 2>&1 | tail -20"
          }
        ]
      }
    ]
  }
}
```

With this hook, every file edit triggers a test run. Claude sees the test output immediately and can fix issues without you asking.
Testing Different Layers
Unit tests for services
```
Write unit tests for src/services/order.service.ts.

Mock all dependencies (database, email service, payment gateway).
Use our existing mock factory in tests/utils/mocks.ts.

Test each public method with:
- Valid input (happy path)
- Invalid input (validation errors)
- Dependency failure (database down, payment rejected)
- Edge cases (empty arrays, null values, boundary numbers)
```

Integration tests for API routes
```
Write integration tests for src/routes/order.routes.ts.
Use a test database (our test config creates an in-memory SQLite).

Test each endpoint:
- POST /orders - successful creation, validation errors, auth required
- GET /orders/:id - found, not found, wrong user (403)
- PATCH /orders/:id/status - valid transitions, invalid transitions
- DELETE /orders/:id - owner can delete, non-owner gets 403

Use supertest and follow the patterns in tests/routes/user.routes.test.ts.
```

End-to-end tests
```
Write an E2E test that covers the complete order flow:
1. Create a user and log in
2. Add items to cart
3. Submit an order
4. Verify the order appears in the user's order list
5. Verify the inventory was decremented
6. Verify the confirmation email was queued

Use Playwright and follow our existing E2E patterns in
tests/e2e/auth-flow.test.ts.
```

Headless Test Generation
Section titled “Headless Test Generation”For automated coverage improvement in CI, use headless mode:
```sh
# Generate tests for files that changed in this PR
git diff --name-only main...HEAD -- 'src/**/*.ts' | \
  while read file; do
    test_file="${file/src/tests}"
    test_file="${test_file/.ts/.test.ts}"
    if [ ! -f "$test_file" ]; then
      claude -p "Read $file and generate comprehensive tests.
        Follow the patterns in our existing test files.
        Save to $test_file." --output-format json
    fi
  done
```

For a more targeted approach:
```sh
# Generate tests for uncovered code
claude -p "Run npm test -- --coverage --json. Find all source
files with less than 80% line coverage. For each one, generate
tests that bring coverage above 80%. Match existing test patterns.
Run the tests to verify they pass." --output-format json
```

Test Data and Fixtures
Good tests need good test data. Have Claude generate fixtures that match your schema:
```
Read our Prisma schema and the existing fixtures in tests/fixtures/.
Generate test fixtures for the organization module:

1. A factory function createTestOrganization() that generates
   a valid organization with sensible defaults
2. A factory function createTestOrgMember() that creates a user
   with membership in an organization
3. Edge case fixtures: org with maximum members, org with expired
   trial, org with special characters in the name

Follow the same factory pattern as createTestUser() in
tests/fixtures/user.fixtures.ts.
```

When This Breaks
Generated tests are too shallow. Claude sometimes generates tests that only cover the happy path. Be explicit: “Include at least one test for each error path in the source code. Count the catch blocks and if statements — each one needs a test.”
Tests are brittle and break on minor changes. The tests assert too many implementation details. Tell Claude: “Test behavior, not implementation. Assert on return values and side effects, not on internal method calls or the number of times a function was invoked.”
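The difference is easy to see in a small sketch. This code is hypothetical: `applyDiscount` and the discount code are illustrative, not from any real project.

```typescript
// Hypothetical function under test: applies a discount code to a total.
function applyDiscount(total: number, code: string): number {
  // Internal detail that could be refactored freely
  // (inlined, cached, moved to a lookup table, etc.).
  const rate = code === "SAVE10" ? 0.1 : 0;
  return Math.round(total * (1 - rate) * 100) / 100;
}

// Behavioral assertion: checks the observable result.
// It survives any refactor that preserves the pricing rules.
const discounted = applyDiscount(200, "SAVE10");

// A brittle test would instead spy on the internal rate lookup and
// assert it was called exactly once. That breaks the moment the lookup
// is inlined or cached, even though the function still returns 180.
```

Asking Claude to assert on `discounted` rather than on call counts is exactly the “test behavior, not implementation” instruction in practice.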
Mocks are set up incorrectly. This happens when Claude does not read your existing mock patterns. Always point it at an existing test file with working mocks before generating new tests: “Read the mock setup in tests/services/user.service.test.ts. Use the exact same approach for mocking the database and external services.”
Tests pass individually but fail when run together. Shared mutable state between tests. Tell Claude: “Each test must be independent. Check for shared variables that are modified in tests. Use beforeEach to reset state. If using a test database, isolate with transactions that roll back.”
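A hand-rolled sketch of that failure mode, with the `beforeEach`-style reset written out explicitly. The names are hypothetical; real suites hit the same issue through module-level singletons or a shared test database.

```typescript
// Module-level mutable state shared by every test in the file.
let auditLog: string[] = [];

function recordOrder(id: string): void {
  auditLog.push(id);
}

// The reset that belongs in beforeEach. Without it, the second
// "test" below would see the first one's entries.
function resetAuditLog(): void {
  auditLog = [];
}

resetAuditLog();                    // beforeEach
recordOrder("order-1");
const firstRun = auditLog.length;   // one entry

resetAuditLog();                    // beforeEach again: tests stay independent
recordOrder("order-2");
const secondRun = auditLog.length;  // still one entry, not two
```

Run order no longer matters once every test starts from the same reset state, which is why the prompt above tells Claude to check for shared variables and reset them in `beforeEach`.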
Coverage report is misleading. High line coverage does not mean good tests. After generating tests for coverage, review them: “For each test you generated, explain what behavior it verifies. If a test only exercises a line without asserting meaningful behavior, rewrite it.”
What’s Next
Your test suite is comprehensive and your coverage gaps are filled. When those tests catch a bug, you need a systematic debugging workflow to trace the root cause.