Q: Why are performance regressions so hard to catch?

Three reasons: (1) most CI environments do not have production-representative load, so the regression does not show up pre-merge; (2) performance is multi-dimensional (latency, throughput, memory, cold-start), and a regression on one dimension can be invisible if testing focuses on another; (3) performance regressions often emerge from interaction effects between independently correct changes, so no single PR review would catch the problem.

Performance-Regression Rework Cost: When the Build Slows Down

Q: What does it cost to fix a performance regression?

The labour range is wide: a single-line N+1 query fix can be 1 to 2 hours; an architectural regression that requires re-thinking caching strategy or data model can be 4 to 12 engineering weeks. The median industry labour cost per material performance regression sits at 30 to 80 engineering hours, with the long tail driven by regressions that require architectural change.

Q: How much revenue does a 100ms latency increase cost?

The widely cited Amazon research puts the cost of an extra 100ms of latency at roughly 1% of sales, and Akamai's 2017 report found a 100ms delay hurting conversion rates by 7% in its dataset. Google's 2017 mobile speed research found that bounce rate rises 32% as page load time goes from 1 to 3 seconds. The revenue impact varies by industry: e-commerce and adtech are most sensitive, B2B SaaS less so, and internal-tool latency is essentially unrelated to revenue.

Q: What does continuous performance testing cost?

A dedicated continuous performance testing setup typically costs 4 to 8 engineering weeks for initial build (load test infrastructure, baseline measurement, alerting), 1 to 2% of ongoing engineering capacity in maintenance, and $20K to $80K per year in compute and tooling. The avoided rework typically lands at 3 to 8% of total engineering capacity for organisations with material performance-sensitive workloads. Payback is typically 12 to 24 months.

Q: What is the most common performance regression class?

Across the Digital Signet portfolio and the wider performance-engineering literature, the dominant class is the N+1 query (a database access pattern that issues one query per item in a collection, rather than batching). It accounts for an estimated 35 to 50% of all material performance regressions in web and API workloads. The second-largest class is uncached repeated computation (re-running expensive work that was cached in a previous version).

Updated June 2026. Sources: Akamai 2017 Online Retail Performance Report; Google 2017 mobile speed research; Amazon historical latency studies; portfolio operational data.

The headline numbers

Median labour per material performance regression: 30 to 80 engineering hours
Revenue impact of 100ms latency increase (e-commerce): ~1% of sales (Amazon); 7% conversion drop in Akamai's 2017 dataset
Mobile bounce-rate increase, 1s to 3s page load (Google 2017): 32%
Continuous performance testing payback: 12 to 24 months

The revenue-per-millisecond math

The canonical references for the revenue impact of latency are now close to a decade old or older, but have held up well in subsequent independent measurement. Akamai's 2017 Online Retail Performance Report found that a 100ms delay in page load time hurt conversion rates by 7% in its dataset; Amazon's long-running internal research has consistently reported each additional 100ms of latency costing roughly 1% in sales. Google's 2017 mobile speed research found that bounce rate rises 32% as a page goes from 1 second to 3 seconds of load time, and 90% as it goes from 1 second to 5 seconds.

The revenue impact varies sharply by industry. E-commerce and adtech are the most sensitive, with the cited 1% per 100ms figure being a reasonable working baseline. B2B SaaS is meaningfully less sensitive (most users have already committed to using the tool and will tolerate slower responses up to a point). Internal-tool latency is essentially unrelated to revenue, although it can have meaningful productivity costs that show up elsewhere in the engineering budget. For a typical $10M ARR e-commerce site, a 300ms regression carried for a quarter could cost roughly $75K in lost revenue, often dwarfing the engineering labour cost of fixing it.
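The $75K figure above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes revenue is spread evenly across the year and that the loss scales linearly with added latency at the 1% per 100ms working baseline; the function name and parameters are illustrative, not part of any tool.

```python
def latency_revenue_loss(annual_revenue, regression_ms, months_carried,
                         loss_per_100ms=0.01):
    """Estimate revenue lost to a latency regression.

    Assumptions (illustrative only): revenue is uniform over the year,
    and the loss is linear in added latency.
    """
    # Revenue earned during the period the regression was live.
    revenue_in_period = annual_revenue * months_carried / 12
    # Fraction of that revenue lost, e.g. 300ms at 1% per 100ms is 3%.
    fraction_lost = loss_per_100ms * regression_ms / 100
    return revenue_in_period * fraction_lost


if __name__ == "__main__":
    loss = latency_revenue_loss(
        annual_revenue=10_000_000,  # the $10M site from the text
        regression_ms=300,
        months_carried=3,           # one quarter
    )
    print(f"Estimated lost revenue: ${loss:,.0f}")
```

The linear model is a simplification: real conversion curves tend to flatten or steepen at the extremes, so treat the result as an order-of-magnitude estimate for comparing against the labour cost of the fix.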

The five common regression patterns

N+1 query (35 to 50% of cases). The dominant class. A database access pattern that issues one query per item in a collection instead of batching. Often introduced by ORM convenience methods or by adding a new field to a list view without checking the query shape. Single-line to small-PR fix; the challenge is catching it before production.
Uncached repeated computation. Code that used to cache a result no longer does, often because a refactor moved the call site outside the cached path. Symptoms include CPU saturation under load that did not exist in the previous version, and memory pressure from repeated allocations.
Synchronous external call inserted into a hot path. A new feature adds a synchronous call to an external service (analytics, feature flag service, third-party API) inside a request flow. The external service has its own latency variability, which now bounds the request latency. Fix is usually to move the call out of the hot path (async, batched, or removed).
Index missing on a new query pattern. A new feature introduces a query pattern that the existing indexes do not support, and the database falls back to a full scan. Often invisible until production load exposes it; fix is usually a one-line index addition but can require migration coordination.
Bundle-size regression on the client. A new dependency, or a wholesale import where a tree-shake-friendly named import would do, increases JavaScript bundle size and time-to-interactive. Most common in front-end work; usually discoverable by a bundle-size budget check in CI.
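To make the dominant pattern concrete, here is a minimal sketch of an N+1 access pattern next to its batched equivalent, using Python's built-in sqlite3 module. The schema, the `CountingConn` wrapper, and the function names are all hypothetical, chosen only to make the query count visible.

```python
import sqlite3


def make_db():
    """Build a small in-memory database: 5 authors, 20 posts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    """)
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(i, f"author-{i}") for i in range(1, 6)])
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(i, (i % 5) + 1, f"post-{i}") for i in range(1, 21)])
    return conn


class CountingConn:
    """Hypothetical wrapper that counts how many queries are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def titles_with_authors_n_plus_one(db):
    # One query for the list, then one more query per row: N+1 in total.
    rows = db.execute("SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
    result = []
    for _post_id, author_id, title in rows:
        name = db.execute("SELECT name FROM authors WHERE id = ?",
                          (author_id,)).fetchone()[0]
        result.append((title, name))
    return result


def titles_with_authors_batched(db):
    # A single JOIN returns the same data in one round trip.
    return db.execute(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    slow, fast = CountingConn(conn), CountingConn(conn)
    assert titles_with_authors_n_plus_one(slow) == titles_with_authors_batched(fast)
    print("N+1 queries:", slow.queries, "| batched queries:", fast.queries)
```

Both functions return identical rows; only the number of round trips differs. Asserting an upper bound on that count in a test is one inexpensive way to notice the pattern before it reaches production load.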

The detection ladder

Performance regressions are uniquely hard to catch before production because most CI environments lack production-representative load. The investments that move the catch point upstream, in order of cost-effectiveness:

Per-PR bundle-size and basic micro-benchmark checks. Low cost (typically 1 to 2 weeks setup), catches bundle bloat and obvious local regressions before merge. Tools like webpack-bundle-analyzer or bundlewatch handle the bundle-size half; criterion-style benchmark runners handle the micro-benchmark half. Limited by inability to catch regressions that need production load to manifest.
Pre-merge query-shape analysis. Static analysis or tracing-based checks that flag new N+1 patterns in PRs. Tools like Bullet (for Rails), Prosopite, or custom ORM-level checks. Catches the dominant regression class without needing production load. Setup cost is moderate (2 to 6 weeks) but the catch rate on N+1 regressions is high.
Continuous load testing in a pre-production environment. A scheduled load test against a staging environment with production-representative data and traffic shape. Catches regressions that need real load to manifest. Setup cost is meaningful (4 to 8 weeks plus infrastructure spend); ongoing maintenance is real (1 to 2% of engineering capacity). The catch rate is high but the operational burden requires sustained commitment.
Production canary with performance alerting. Deploy the new version to a small fraction of production traffic, compare latency and error-rate percentiles to the previous version, and automatically roll back on regression. The most expensive to set up but the most effective at catching regressions that only appear under real production conditions.
Real-user monitoring (RUM) with regression alerting. The last line of defence. Detects regressions after they have reached real users. Necessary as a safety net but should not be relied on as the primary detection mechanism, because by the time RUM flags a regression, some revenue impact has already been incurred.
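The canary comparison described above can be sketched as a percentile check between two sets of latency samples. This is an illustration under stated assumptions, not a production gate: the 95th percentile, the 10% tolerance, and the 100-sample minimum are hypothetical defaults, and the function names are invented for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def canary_verdict(baseline_ms, canary_ms, pct=95, max_ratio=1.10, min_samples=100):
    """Compare canary latency to baseline latency at one percentile.

    Returns "promote", "rollback", or "insufficient-data".
    The thresholds are illustrative assumptions, not recommendations.
    """
    if len(baseline_ms) < min_samples or len(canary_ms) < min_samples:
        return "insufficient-data"
    base = percentile(baseline_ms, pct)
    cand = percentile(canary_ms, pct)
    # Roll back when the canary is more than max_ratio times slower.
    return "rollback" if cand > base * max_ratio else "promote"


if __name__ == "__main__":
    baseline = [100 + (i % 50) for i in range(200)]   # synthetic latencies in ms
    slightly_slower = [x + 5 for x in baseline]
    much_slower = [x + 30 for x in baseline]
    print(canary_verdict(baseline, slightly_slower))
    print(canary_verdict(baseline, much_slower))
```

A real gate would typically also compare error rates and more than one percentile, and would account for noise between runs; the single-ratio rule here only shows the shape of the decision.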

The continuous performance testing investment

For organisations with material performance-sensitive workloads (e-commerce, adtech, real-time consumer apps, high-volume APIs), the per-PR bundle check plus continuous load testing in pre-production is usually the highest-leverage combination. The investment shape is typically 4 to 8 engineering weeks for initial build, 1 to 2% of ongoing engineering capacity in maintenance, and $20K to $80K per year in compute and tooling.

The avoided rework typically lands at 3 to 8% of total engineering capacity for organisations starting from a low-maturity baseline. The revenue-loss saving (avoided latency-driven conversion drops) is often larger than the labour saving for revenue-sensitive workloads. Payback is typically 12 to 24 months on labour alone, often faster when revenue impact is included. The measure page covers the metrics framework that makes the ROI calculation defensible at budget time.
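The labour-only payback arithmetic can be sketched as follows. The team size (30 engineers) and loaded cost ($180K per engineer per year) are hypothetical inputs that do not come from the figures above, and the scenario deliberately uses the conservative end of each quoted range. The model ignores ramp-up time and any revenue saving.

```python
def payback_months(engineers, loaded_cost_per_year, build_weeks,
                   maintenance_frac, tooling_per_year, avoided_rework_frac):
    """Months to recover the initial build cost from net annual savings.

    Returns None when the ongoing cost meets or exceeds the saving,
    in which case the investment never pays back on labour alone.
    """
    capacity = engineers * loaded_cost_per_year
    # One-off build cost, priced in engineer-weeks.
    build_cost = build_weeks * loaded_cost_per_year / 52
    # Net yearly saving: avoided rework minus maintenance and tooling.
    net_annual = capacity * (avoided_rework_frac - maintenance_frac) - tooling_per_year
    if net_annual <= 0:
        return None
    return 12 * build_cost / net_annual


if __name__ == "__main__":
    months = payback_months(
        engineers=30,                  # hypothetical team size
        loaded_cost_per_year=180_000,  # hypothetical loaded cost per engineer
        build_weeks=8,                 # top of the 4 to 8 week range
        maintenance_frac=0.02,         # top of the 1 to 2% range
        tooling_per_year=40_000,       # within the $20K to $80K range
        avoided_rework_frac=0.03,      # bottom of the 3 to 8% range
    )
    print("never pays back" if months is None else f"payback: {months:.1f} months")
```

The result is highly sensitive to team size and to the avoided-rework fraction, so it is worth running with an organisation's own numbers rather than relying on a single scenario.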

Sources

Akamai. Online Retail Performance Report. Akamai, 2017.
Google. Find Out How You Stack Up to New Industry Benchmarks for Mobile Page Speed. Google / Think with Google, 2017.
Amazon. Historical internal latency-vs-conversion research, cited by Jeff Bezos in 2006 shareholder communications and subsequent industry references.
Google. Web Vitals documentation. web.dev/vitals.
Jones, C. Applied Software Measurement. 3rd ed. McGraw-Hill, 2008.

Frequently asked questions

What does it cost to fix a performance regression?

Wide range: 1 to 2 hours for a single-line N+1 fix; 4 to 12 engineering weeks for an architectural regression requiring caching or data-model change. Median per material regression: 30 to 80 engineering hours.

How much revenue does a 100ms latency increase cost?

E-commerce: roughly 1% of sales per 100ms (Amazon); Akamai's 2017 dataset showed a 7% conversion drop for a 100ms delay. Bounce rate rises 32% as mobile page load goes from 1s to 3s (Google 2017). Sensitivity varies: e-commerce and adtech highest, B2B SaaS lower, internal tools mostly affect productivity, not revenue.

Why are performance regressions hard to catch?

Three reasons: CI environments lack production-representative load; performance is multi-dimensional (latency, throughput, memory, cold-start); regressions often emerge from interaction effects between independently correct changes.

What does continuous performance testing cost?

Initial setup: 4 to 8 engineering weeks. Ongoing maintenance: 1 to 2% of capacity. Infrastructure and tooling: $20K to $80K per year. Typical avoided rework: 3 to 8% of total engineering capacity. Payback: 12 to 24 months on labour alone.

What is the most common performance regression class?

N+1 query patterns, accounting for an estimated 35 to 50% of all material performance regressions in web and API workloads. Second-largest is uncached repeated computation.

Should every team have continuous performance testing?

No. The investment only pays back for teams with material performance-sensitive workloads. Internal tools, low-volume APIs, and batch-processing workloads usually find the per-PR bundle and benchmark check enough, with on-call response covering the rare production regression.