
Test Generation and Execution via CLI

Your PR is ready. The feature works, the code is clean, and your tech lead asks the question you were hoping to avoid: “Where are the tests?” You know you should have written them first. But TDD felt slow when you were in flow, and now writing tests after the fact feels like backfilling homework. The module has four dependencies that need mocking, the error paths are hard to trigger manually, and the existing test files use patterns you have not fully internalized yet.

Claude Code eliminates the friction. Point it at the code, tell it to match the existing test patterns, and it generates a test suite that covers the happy path, the error paths, and the edge cases you forgot about. Then it runs the tests after every change so you always know what is broken.

This section covers:

  • A workflow for generating tests that match your project’s existing patterns
  • The red-green-refactor loop powered by Claude running tests after every edit
  • Prompts for coverage gap detection and targeted test generation
  • Headless mode patterns for automated test generation in CI

The biggest problem with AI-generated tests is style mismatch. Claude’s default test style might not match yours. The fix is simple: show it your existing tests before asking it to write new ones.

Claude reads your existing tests, identifies the patterns (Jest vs Vitest, factory functions vs inline mocks, flat describe blocks vs nested describes), and generates new tests that look like they belong in the project.

TDD with Claude Code is faster than traditional TDD because Claude handles the boilerplate while you focus on what the code should do.

  1. Describe the behavior you want to implement

    I need a function calculateShippingCost that takes an order
    object and returns the shipping cost. Rules:
    - Orders over $100 ship free
    - Standard shipping is $5.99
    - Express shipping is $14.99
    - International orders add a $10 surcharge
    - Weight over 50lbs adds $0.50 per additional pound
  2. Have Claude write the tests first

    Write tests for calculateShippingCost based on those rules.
    Follow the patterns in tests/utils/pricing.test.ts.
    Include edge cases: exactly $100 order, exactly 50lbs,
    zero-weight digital goods, negative amounts (should throw).
    Do NOT write the implementation yet.
  3. Run the tests to confirm they fail

    Run the tests. They should all fail since the function
    doesn't exist yet. Show me the output.
  4. Implement to make the tests pass

    Now implement calculateShippingCost in src/utils/pricing.ts.
    Make all the tests pass. Use the simplest implementation
    that satisfies the tests.
  5. Run tests again to confirm green

    Run the tests. Show me which pass and which fail.
    Fix any failures.
  6. Refactor with the safety net in place

    The implementation works but has some duplication. Refactor it
    for clarity. Run tests after every change to make sure nothing
    breaks.
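After the refactor step, the implementation might look like the sketch below. The `Order` shape is invented for illustration, and the rule interactions the prompt leaves open (whether free shipping over $100 also waives the international surcharge and weight fees) are resolved here by assumption, as labeled in the comments.

```typescript
// Hypothetical Order shape; the real project's type may differ.
interface Order {
  subtotal: number;                  // order total in USD
  method: "standard" | "express";
  international: boolean;
  weightLbs: number;
}

// Minimal sketch of calculateShippingCost. Assumption: free shipping
// over $100 waives the surcharge and weight fees as well.
function calculateShippingCost(order: Order): number {
  if (order.subtotal < 0 || order.weightLbs < 0) {
    throw new Error("Invalid order: negative amount or weight");
  }
  if (order.subtotal > 100) return 0;              // orders over $100 ship free
  let cost = order.method === "express" ? 14.99 : 5.99;
  if (order.international) cost += 10;             // international surcharge
  if (order.weightLbs > 50) {
    cost += (order.weightLbs - 50) * 0.5;          // $0.50 per pound over 50
  }
  return cost;
}
```

Note how the boundary cases from step 2 fall out of the comparisons: an order of exactly $100 still pays shipping (the rule says "over $100"), and exactly 50 lbs incurs no weight fee.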

Test coverage in most projects is unevenly distributed: critical business logic might sit at 30% while a trivial utility function is at 100%. Claude can find and fill these gaps.

When you need to improve coverage for a specific module:

Read src/services/payment.service.ts and the coverage report
for it. Show me which lines and branches are not covered.
Then write tests that cover the uncovered paths. Focus on:
- Error handling branches (catch blocks, validation failures)
- Conditional logic that is only tested for the true case
- Async paths where the promise rejects
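If the project runs on Vitest (the hook command later in this section suggests it does), you can make the 80% target self-enforcing instead of relying on prompts alone. A sketch of a `vitest.config.ts`, assuming Vitest 1.x or later with the `@vitest/coverage-v8` provider installed:

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",                    // requires @vitest/coverage-v8
      reporter: ["text", "json-summary"],
      thresholds: {
        lines: 80,                       // fail the run below 80% line coverage
        branches: 80,                    // and below 80% branch coverage
      },
    },
  },
});
```

With thresholds in place, the coverage prompt above has a hard failure signal to work against rather than a number to eyeball.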

Set up hooks so Claude automatically verifies tests after every change:

.claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "npm test -- --run --reporter=verbose 2>&1 | tail -20"
          }
        ]
      }
    ]
  }
}

With this hook, every file edit triggers a test run. Claude sees the test output immediately and can fix issues without you asking.

For unit tests with every dependency mocked:

Write unit tests for src/services/order.service.ts.
Mock all dependencies (database, email service, payment gateway).
Use our existing mock factory in tests/utils/mocks.ts.
Test each public method with:
- Valid input (happy path)
- Invalid input (validation errors)
- Dependency failure (database down, payment rejected)
- Edge cases (empty arrays, null values, boundary numbers)
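The mock factory in `tests/utils/mocks.ts` is project-specific, but the pattern it likely follows can be sketched without a framework. The `OrderRepo` interface and its method are hypothetical; a real project would typically build the stubs with `vi.fn()` or `jest.fn()` rather than by hand.

```typescript
// Hand-rolled stub with call recording; test frameworks provide vi.fn()/jest.fn().
function stubFn<T>(result: T) {
  const calls: unknown[][] = [];
  const fn = (...args: unknown[]): T => {
    calls.push(args);   // record every invocation for later assertions
    return result;
  };
  return Object.assign(fn, { calls });
}

// Hypothetical dependency shape for order.service.ts.
interface OrderRepo {
  findById: (id: string) => { id: string; status: string } | null;
}

// Factory with sensible defaults; tests override only what they care about.
function createMockOrderRepo(overrides: Partial<OrderRepo> = {}): OrderRepo {
  return {
    findById: stubFn({ id: "order-1", status: "pending" }),
    ...overrides,
  };
}
```

The override parameter is what makes the factory pattern pay off: the happy-path test takes the defaults, and the "database down" test passes a `findById` that throws.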
For integration tests at the route level:

Write integration tests for src/routes/order.routes.ts.
Use a test database (our test config creates an in-memory SQLite).
Test each endpoint:
- POST /orders - successful creation, validation errors, auth required
- GET /orders/:id - found, not found, wrong user (403)
- PATCH /orders/:id/status - valid transitions, invalid transitions
- DELETE /orders/:id - owner can delete, non-owner gets 403
Use supertest and follow the patterns in tests/routes/user.routes.test.ts.
For an end-to-end test of the full order journey:

Write an E2E test that covers the complete order flow:
1. Create a user and log in
2. Add items to cart
3. Submit an order
4. Verify the order appears in the user's order list
5. Verify the inventory was decremented
6. Verify the confirmation email was queued
Use Playwright and follow our existing E2E patterns in
tests/e2e/auth-flow.test.ts.

For automated coverage improvement in CI, use headless mode:

# Generate tests for files that changed in this PR
git diff --name-only main...HEAD -- 'src/**/*.ts' | \
while read -r file; do
  test_file="${file/src/tests}"
  test_file="${test_file/.ts/.test.ts}"
  if [ ! -f "$test_file" ]; then
    claude -p "Read $file and generate comprehensive tests.
Follow the patterns in our existing test files.
Save to $test_file." --output-format json
  fi
done

For a more targeted approach:

# Generate tests for uncovered code
claude -p "Run npm test -- --coverage --json. Find all source
files with less than 80% line coverage. For each one, generate
tests that bring coverage above 80%. Match existing test patterns.
Run the tests to verify they pass." --output-format json

Good tests need good test data. Have Claude generate fixtures that match your schema:

Read our Prisma schema and the existing fixtures in tests/fixtures/.
Generate test fixtures for the organization module:
1. A factory function createTestOrganization() that generates
   a valid organization with sensible defaults
2. A factory function createTestOrgMember() that creates a user
   with membership in an organization
3. Edge case fixtures: org with maximum members, org with expired
   trial, org with special characters in the name
Follow the same factory pattern as createTestUser() in
tests/fixtures/user.fixtures.ts.
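A factory following that pattern might look like the sketch below. The `Organization` fields are invented for illustration; in the real project they would come from the Prisma schema, and the factory would insert rows rather than build plain objects.

```typescript
// Hypothetical Organization shape; the real one comes from the Prisma schema.
interface Organization {
  id: string;
  name: string;
  plan: "trial" | "paid";
  trialEndsAt: Date | null;
}

let nextOrgId = 1;

// Factory with sensible defaults; callers override only what a test cares about.
function createTestOrganization(
  overrides: Partial<Organization> = {}
): Organization {
  return {
    id: `org-${nextOrgId++}`,          // unique per call, so tests don't collide
    name: "Test Organization",
    plan: "trial",
    trialEndsAt: new Date(Date.now() + 14 * 24 * 60 * 60 * 1000), // 14-day trial
    ...overrides,
  };
}

// Edge-case fixture built on top of the base factory.
const expiredTrialOrg = createTestOrganization({
  trialEndsAt: new Date(Date.now() - 1000),
});
```

Keeping edge-case fixtures as thin overrides of one base factory means a schema change touches a single function instead of every fixture file.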

Generated tests are too shallow. Claude sometimes generates tests that only cover the happy path. Be explicit: “Include at least one test for each error path in the source code. Count the catch blocks and if statements — each one needs a test.”

Tests are brittle and break on minor changes. The tests assert too many implementation details. Tell Claude: “Test behavior, not implementation. Assert on return values and side effects, not on internal method calls or the number of times a function was invoked.”

Mocks are set up incorrectly. This happens when Claude does not read your existing mock patterns. Always point it at an existing test file with working mocks before generating new tests: “Read the mock setup in tests/services/user.service.test.ts. Use the exact same approach for mocking the database and external services.”

Tests pass individually but fail when run together. The usual culprit is shared mutable state between tests. Tell Claude: “Each test must be independent. Check for shared variables that tests modify. Use beforeEach to reset state. If using a test database, isolate tests with transactions that roll back.”
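The failure mode in miniature, framework-free: a module-level array plays the shared state, and a reset function plays the role of `beforeEach`. The `cart` example is invented for illustration.

```typescript
// Shared mutable state: without a reset, the order tests run in changes results.
let cart: string[] = [];

// The beforeEach() equivalent: restore a known-clean state before every test.
function resetCart(): void {
  cart = [];
}

function addToCart(item: string): number {
  cart.push(item);
  return cart.length;
}

// Each "test" resets first, so it passes alone or in any order.
function testAddFirstItem(): void {
  resetCart();
  if (addToCart("book") !== 1) throw new Error("expected cart length 1");
}

function testAddTwoItems(): void {
  resetCart();
  addToCart("pen");
  if (addToCart("ink") !== 2) throw new Error("expected cart length 2");
}
```

Delete the `resetCart()` calls and `testAddFirstItem` fails whenever it runs after `testAddTwoItems`: exactly the "passes alone, fails in the suite" symptom.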

Coverage report is misleading. High line coverage does not mean good tests. After generating tests for coverage, review them: “For each test you generated, explain what behavior it verifies. If a test only exercises a line without asserting meaningful behavior, rewrite it.”

Your test suite is comprehensive and your coverage gaps are filled. When those tests catch a bug, you need a systematic debugging workflow to trace the root cause.