Request Hedging: Accelerate Your App by Racing Duplicate Calls
perficient.com
Users notice slow requests. Even if 99 % of requests finish quickly, the remaining 1 % of “long‑tail” latency can make your app feel sluggish. Request hedging addresses this by speculatively firing a duplicate request after a short delay and using whichever response arrives first, so outliers are beaten before they ever impact the UI.
- P99 latency is the time within which 99 % of requests finish; only the slowest 1 % take longer. (P99.9 is the same threshold for the slowest 0.1 %, and so on.)
- Users are sensitive to slowness. One long request is all it takes for an app to feel sluggish.
- In an architecture where a page render hits 50 microservices, one slow service can drag the whole page down.
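
To see why fan-out makes the tail matter so much, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that the 50 calls are independent and that each has a 1 % chance of landing in the slow tail; real services are often correlated, so treat the number as an estimate. The function name `chance_any_slow` is made up for this sketch.

```python
def chance_any_slow(fanout: int, p_slow: float = 0.01) -> float:
    """Probability that at least one of `fanout` calls is slow.

    Assumes the calls are independent and each is slow with probability p_slow.
    """
    return 1.0 - (1.0 - p_slow) ** fanout


if __name__ == "__main__":
    # 50 services, each slow 1 % of the time
    print(f"{chance_any_slow(50):.1%}")  # 39.5%
```

Under those assumptions, roughly two page renders in five would hit at least one slow call, even though each individual service looks fast 99 % of the time.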

Google’s Bigtable team discovered that firing a second copy of a read after just 10 milliseconds cut their P99.9 latency by 96 % while adding only 2 % extra traffic. That’s cheaper than a single extra VM instance and far ...
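
The idea of sending a second copy after a short delay can be sketched in a few lines of Python with `asyncio`. This is a minimal illustration, not the Bigtable implementation: the helper name `hedged`, the `make_call` parameter, and the simulated `fake_read` latencies are all assumptions made for the example, and the 10 ms default simply mirrors the figure quoted above. It also assumes the call is safe to repeat (an idempotent read), since both copies may reach the server.

```python
import asyncio


async def hedged(make_call, hedge_delay: float = 0.010):
    """Run make_call(); if it has not finished after hedge_delay seconds,
    start one duplicate and return whichever attempt completes first.

    make_call is a hypothetical zero-argument coroutine function.
    """
    primary = asyncio.create_task(make_call())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        # Fast path: the first attempt beat the hedge delay, no duplicate sent.
        return primary.result()

    # Slow path: race a duplicate against the still-running primary.
    backup = asyncio.create_task(make_call())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the loser is no longer needed
    # Note: if the first attempt to finish failed, its exception is raised here.
    return done.pop().result()


async def demo() -> str:
    # Simulated backend: the first attempt is a 500 ms outlier, the second is fast.
    latencies = iter([0.5, 0.01])
    attempts = 0

    async def fake_read() -> str:
        nonlocal attempts
        attempts += 1
        attempt = attempts
        await asyncio.sleep(next(latencies))
        return f"reply from attempt {attempt}"

    return await hedged(fake_read, hedge_delay=0.02)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this simulated run the duplicate wins the race, so the caller waits tens of milliseconds rather than half a second. In practice the hedge delay is usually chosen near a high percentile of normal latency, so that only a small share of requests ever trigger a duplicate.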