# Load, Stress, and Benchmark Testing
Your API handles 500 requests per second in staging and everyone celebrates. Then Black Friday hits, traffic spikes to 3,000 rps, and the database connection pool is exhausted within minutes. The response-time graph looks like a hockey stick and your CEO is watching the uptime dashboard. Performance testing is not optional for production systems, and AI makes building comprehensive performance test suites dramatically easier.
## What You’ll Walk Away With

- k6 and Artillery load test generation from AI prompts
- Stress testing patterns that find your system’s breaking point safely
- Continuous benchmarking in CI that catches performance regressions
- AI-assisted analysis of performance bottlenecks from test results
- Realistic traffic pattern simulation for your specific use case
## Load Test Generation

```text
Generate a k6 load test for our checkout API:

Scenario: Simulate a flash sale with ramping traffic
- Ramp from 0 to 100 virtual users over 2 minutes
- Hold at 100 VUs for 5 minutes (steady state)
- Spike to 500 VUs for 1 minute (flash sale moment)
- Return to 100 VUs for 2 minutes (recovery)
- Ramp down to 0 over 1 minute

API calls per virtual user iteration:
1. POST /api/auth/login (use test credentials from env)
2. GET /api/products?category=sale (browse sale items)
3. POST /api/cart/items (add random product)
4. POST /api/checkout (complete purchase with test payment)

Thresholds:
- p95 response time < 500ms during steady state
- p99 response time < 2000ms during spike
- Error rate < 1% at all times
- Checkout success rate > 99%
```
Save the generated script to `/tests/performance/checkout-load.k6.js`. To scaffold a whole suite in one pass, run the request through the CLI:

```bash
claude "Create a comprehensive k6 performance test suite:

1. /tests/performance/checkout-load.k6.js - Checkout flow load test
   - Ramping traffic pattern: 0 -> 100 -> 500 -> 100 -> 0 VUs
   - Realistic user journey (login, browse, cart, checkout)
   - SLA thresholds for response time and error rate

2. /tests/performance/api-stress.k6.js - API endpoint stress test
   - Test each critical endpoint individually
   - Find the breaking point (ramp until errors > 5%)
   - Report max throughput per endpoint

3. /tests/performance/helpers/auth.js - Shared auth helper
   - Login and cache tokens
   - Token refresh handling

4. package.json scripts:
   - test:perf:load - Run load tests
   - test:perf:stress - Run stress tests
   - test:perf:smoke - Quick 30-second smoke test

Include realistic test data generation for each scenario."
```

Or hand the whole task to an agent that can open a pull request:

```text
Create a performance testing suite for this project:

1. Analyze the API routes to identify critical endpoints
2. Generate k6 load tests for the top 5 most important flows
3. Create stress tests that find breaking points
4. Add performance smoke tests for CI integration
5. Create a PR with the test suite and documentation

Include realistic traffic patterns based on typical SaaS usage.
```

## Stress Testing: Finding the Breaking Point
## Continuous Benchmarking in CI
## Catching Performance Regressions Automatically
## Analyzing Performance Results with AI
Section titled “Analyzing Performance Results with AI”After running load tests, AI tools can help interpret the results.
## Database Performance Testing
## When This Breaks
Section titled “When This Breaks”“Load tests pass locally but the production system is slower.” Your local environment does not match production. Run performance tests against a staging environment that mirrors production infrastructure (same database size, same network latency, same connection limits). Never use local databases for load testing.
“Tests give inconsistent results between runs.” Performance tests are inherently noisy. Run each scenario three times and use the median. Establish acceptable variance bands (plus or minus 15%). Fail only when the median exceeds the threshold, not individual runs.
“We cannot run load tests in CI because they take too long.” Use a tiered approach: smoke tests (30 seconds) on every PR, load tests (10 minutes) nightly, full stress tests weekly. The smoke test catches the obvious regressions; the longer tests catch the subtle ones.
“The AI generated load tests that do not match real traffic patterns.” Give the AI your actual traffic data. Export a sample from your analytics: “Our traffic peaks at 2pm EST, 60% of requests are GET /api/products, and the average user session makes 12 API calls over 8 minutes.”