
Debugging Workflows

It is 2 AM. Your payment processing endpoint is returning intermittent 500 errors. The error logs show a “Cannot read property ‘amount’ of undefined” but the object is clearly defined three lines above. You have been staring at the code for 40 minutes. The bug is a race condition between two async operations, and Claude Code can find it in 90 seconds — if you give it the right context.

This workflow covers:

  • A systematic debugging workflow that works for any bug
  • Error trace analysis prompts that identify root causes
  • Bisection strategies for finding which commit introduced a bug
  • Log parsing patterns for production debugging
  1. Capture the error context

     Pipe the error directly to Claude Code:

     ```shell
     cat error.log | claude -p "Analyze this error. What is the root cause and which file should I look at first?"
     ```
  2. Continue interactively for deeper investigation

     ```shell
     claude -c
     ```

     Now you are in the same session with the error context loaded. Ask Claude to read the relevant files.

  3. Reproduce the issue

     ```
     Run the failing test: npx jest src/payments/__tests__/process.test.ts
     Show me the exact line where it fails and the state of all variables at that point.
     ```
  4. Identify the root cause

     ```
     ultrathink about why this fails intermittently. The error only happens under load.
     Read the async flow in processPayment() and identify any race conditions.
     ```
  5. Fix and verify

     ```
     Fix the race condition. Then run the full test suite to make sure nothing else broke.
     ```
To analyze a production stack trace, paste it with a prompt that pushes past the symptom:

```
Here is a stack trace from production:
[paste stack trace]
1. Identify the root cause (not just the symptom)
2. Trace the call chain from the error back to the original trigger
3. Read the source files involved and explain what went wrong
4. Suggest a fix that addresses the root cause, not just the symptom
```
For confusing compiler errors, ask Claude to trace the types rather than trust the reported location:

```
This TypeScript error makes no sense to me:
[paste TypeScript error]
Read the file and its imports. Trace the type through every transformation
to find where the type mismatch actually originates. It might not be in the
file the error points to.
```
For intermittent failures, extended thinking plus a concrete checklist works well:

```
ultrathink about this intermittent failure:
The test passes 9 out of 10 times. The failure is:
[paste failure output]
This smells like a race condition. Read the async flow in the failing test
and the code it exercises. Identify:
1. Any shared mutable state
2. Any missing await calls
3. Any operations that assume sequential execution
4. Any cleanup that runs before async operations complete
```
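To quantify "passes 9 out of 10 times" before and after a fix, a small helper can run the test repeatedly and report how often it fails. This is a sketch, not part of any tool mentioned here; the function name is illustrative, and any command that exits nonzero on failure works as the test command.

```shell
# flake_rate RUNS CMD...: run CMD repeatedly, report how many runs failed.
flake_rate() {
  local runs=$1 fails=0 i
  shift
  for i in $(seq 1 "$runs"); do
    "$@" >/dev/null 2>&1 || fails=$((fails + 1))
  done
  echo "$fails/$runs failed"
}
```

For example, `flake_rate 20 npx jest src/payments/__tests__/process.test.ts` should report a nonzero failure count before the fix and `0/20 failed` after it.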

When you know the bug did not exist in a previous version:

```
The /api/search endpoint was working correctly in commit abc123 (2 weeks ago)
but is broken in HEAD. Help me bisect:
1. Run: git log --oneline abc123..HEAD -- src/api/search/
2. Identify the most likely commit to have introduced the regression
3. Check out that commit and run the relevant test
4. If the test passes, the bug is in a later commit. If it fails, it is in this commit or earlier.
5. Narrow down to the exact commit using binary search.
```
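The manual narrowing described above is exactly what `git bisect` automates. A minimal wrapper sketch (the function name is hypothetical; `git bisect run` treats any command that exits nonzero as marking the commit bad):

```shell
# bisect_regression GOOD CMD...: mark HEAD bad and GOOD good, then let
# `git bisect run` binary-search for the first bad commit using CMD.
bisect_regression() {
  local good=$1
  shift
  git bisect start HEAD "$good" &&
  git bisect run "$@"
  git bisect reset
}
```

Usage for the example above would be `bisect_regression abc123 npx jest src/api/search/__tests__/search.test.ts`.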
```shell
# Pipe filtered logs to Claude Code
grep "ERROR\|WARN" /var/log/app.log | tail -100 | \
  claude -p "Categorize these errors. Which are most frequent? Which are likely related to the payment processing bug we are investigating?"
```
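Before asking Claude which errors are most frequent, a plain shell frequency count can answer that part directly and shrink what you pipe in. A sketch, assuming log lines like `2024-01-01 ERROR Payment failed`; adjust the pattern to your log format, and the function name is illustrative:

```shell
# top_errors FILE: count occurrences of each ERROR/WARN message,
# most frequent first.
top_errors() {
  grep -oE '(ERROR|WARN) [A-Za-z ]+' "$1" | sort | uniq -c | sort -rn
}
```

For example, `top_errors /var/log/app.log | head` shows the ten most common error messages.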
For performance regressions, point Claude at the most likely culprits:

```
Our API response times increased from 50ms to 800ms after the last deploy.
Run these diagnostics:
1. Check git diff HEAD~1 for changes to database queries
2. Look for any new N+1 query patterns in the changed files
3. Check if any new middleware was added to the request pipeline
4. Look for blocking I/O operations that could explain the latency
Focus on database query changes first -- that is the most common cause.
```
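For step 1, you can surface query-related changes from the last deploy yourself before handing them to Claude. A rough sketch; the SQL-keyword filter is a heuristic and the function name is illustrative:

```shell
# query_changes [PATHS...]: show lines added since the previous commit
# that look like database queries.
query_changes() {
  git diff HEAD~1 -- "$@" | grep '^+[^+]' | grep -iE 'select|insert|update|delete|join'
}
```

For example, `query_changes src/` restricts the scan to one directory; an empty result means the latency likely comes from middleware or blocking I/O instead.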

Claude cannot reproduce the bug: Some bugs only manifest in production environments. If Claude cannot reproduce it locally, provide production logs, environment details, and the specific data that triggers the issue. Consider using `--append-system-prompt` to give Claude production context.

Debug session runs out of context: Long debugging sessions accumulate a lot of file reads and test output. Use `/compact Keep all error traces, test output, and diagnostic results` to preserve the important context while freeing space.

Claude fixes the symptom, not the cause: This happens when you describe the symptom without the context. Instead of “fix the null pointer error,” say “the user object is null because the async fetch races with the render. Fix the race condition, not the null check.”

Bisection fails on flaky tests: If the test is intermittent, `git bisect run` will give incorrect results. Run the test multiple times at each bisection point: `git bisect run bash -c 'for i in 1 2 3; do npx jest test.ts || exit 1; done'`.
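That retry loop can be kept as a small reusable function (a sketch; the helper name is hypothetical) so a commit is marked bad only when any of N consecutive runs fails:

```shell
# retry_strict N CMD...: succeed only if CMD passes N times in a row;
# fail fast on the first failing run.
retry_strict() {
  local n=$1 i
  shift
  for i in $(seq 1 "$n"); do
    "$@" || return 1
  done
}
```

Saved in a file, it can be sourced inside the bisect command, e.g. `git bisect run bash -c '. ./retry.sh; retry_strict 3 npx jest test.ts'`.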