Load Test: connection_paths v1 vs v2
max_depth ∈ {1,2,3,4} × max_paths ∈ {25,100} × impl ∈ {v1,v2} on both local and sandbox5. Real data exposes v1's hash-join collapse.
TL;DR
At depth=2 on real sandbox5 data, v2 is ~7.6× faster than v1 at p95 (5.0s vs 38.1s) and 15× more throughput (4.0 RPS vs 0.27 RPS). The gap is far larger than local synthetic data suggested (~1.5×) — v1's CTE hash-join cannot scale to real contact_connections volume. Legacy must be retired before any production rollout.
- Sandbox5 depth=2: v2 5.0 s p95 vs v1 38.1 s p95 (7.6× faster, 15× throughput).
- v2 depth=1 is essentially free: 600–700 ms p95 on sandbox5 at ~18 RPS sustained.
- v2 depth=3 is acceptable for exploration (20–30 s p95). Depth=4 is research-grade (23–50 s, wide variance).
- Local numbers are misleading: Rails dev mode (1 Puma worker) inverts the usual "local is faster" intuition. Trust sandbox5 numbers.
1. Setup
| Param | Local | Sandbox5 |
|---|---|---|
| Backend | Rails dev (1 Puma worker) — api.local.getro.dev | Prod-like — api.sandbox5.getro.dev |
| DB | Local Postgres, ~thousand contact_connections rows | Real contact_connections (~12.4M rows) |
| Collection | 4386 "Perf collection" — 500k shared UCC | 728 "Getro.org" — real data |
| Auth | user 40 (enaho@getro.com) | user 271281 (Enaho on sandbox5) |
| VUs / Duration / cell | 10 VUs / 20s | 10 VUs / 20s |
| Sweep | max_depth ∈ {1,2,3,4} × max_paths ∈ {25,100} × impl ∈ {v1,v2} | |
| Driver | backend/load/sweep.sh (with ENV=local|sandbox5) | |
v1 (legacy) can only do depth=2 regardless of max_depth, so v1 cells only run at depth=2; depths 1, 3, 4 are v2-only.
Target contacts
Local (collection 4386): 3349:wren_133, 3338:ellis_109, 3300:logan_94, 3690:jordan_106, 3439:emerson_117, 1033791:perfsynth1054_171, 1033638:perfsynth901_107
Sandbox5 (collection 728): 3083:sebastian_15k, 14080:kristin_11k, 23352:abdulrahman_13k, 84573:chris_225, 372722:gheran_41, 158317:enaho_151
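The sweep matrix from the setup table can be enumerated in a few lines. This is a minimal sketch, not the logic of backend/load/sweep.sh itself: the constant and method names are hypothetical, and only the parameter values and the raw-output file pattern ({impl}_d{depth}_p{paths}.json) come from this document.

```ruby
# Hypothetical constants mirroring the sweep parameters in the setup table.
DEPTHS    = [1, 2, 3, 4].freeze
MAX_PATHS = [25, 100].freeze
IMPLS     = %w[v1 v2].freeze

# Build every (impl, depth, paths) cell, dropping v1 cells at depths other
# than 2, because v1 is only benchmarked at depth=2.
def sweep_cells
  IMPLS.product(DEPTHS, MAX_PATHS)
       .reject { |impl, depth, _paths| impl == "v1" && depth != 2 }
       .map do |impl, depth, paths|
         { impl: impl, depth: depth, paths: paths,
           file: "#{impl}_d#{depth}_p#{paths}.json" }
       end
end

cells = sweep_cells
puts "#{cells.size} cells"
cells.each { |cell| puts cell[:file] }
```

Under these assumptions the matrix has 2 v1 cells plus 8 v2 cells, which matches the ten rows in the per-phase sandbox5 tables.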
2. Headline: v1 vs v2 at depth=2
Apples-to-apples comparison. Legacy can only do depth=2; v2 is benchmarked at the same depth to match.
| env | impl | max_paths | p50 (ms) | p90 (ms) | p95 (ms) | avg (ms) | RPS | reqs |
|---|---|---|---|---|---|---|---|---|
| local | v1 | 25 | 3,175 | 3,816 | 3,832 | 3,135 | 2.6 | 93 |
| local | v1 | 100 | 2,858 | 3,384 | 3,443 | 2,557 | 3.1 | 113 |
| local | v2 | 25 | 1,983 | 2,463 | 2,627 | 1,772 | 4.6 | 161 |
| local | v2 | 100 | 1,998 | 2,278 | 2,572 | 1,718 | 4.7 | 166 |
| sandbox5 | v1 | 25 | 32,903 | 36,641 | 38,088 | 28,360 | 0.3 | 13 |
| sandbox5 | v1 | 100 | 11,862 | 30,670 | 33,541 | 16,458 | 0.2 | 11 |
| sandbox5 | v2 | 25 | 1,700 | 2,668 | 5,037 | 1,952 | 4.0 | 142 |
| sandbox5 | v2 | 100 | 2,622 | 4,478 | 4,907 | 2,811 | 2.8 | 100 |
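A note on reading the sandbox5 v1 rows: with only 13 and 11 requests per cell, a p95 carries little information. The sketch below uses the nearest-rank percentile definition to show that at these sample sizes p95 is simply the slowest request. The latencies are made-up illustration values, not the measured samples, and k6's own percentile calculation may interpolate between samples rather than use nearest rank.

```ruby
# Nearest-rank percentile: the smallest sample such that at least pct% of
# all samples are less than or equal to it.
def nearest_rank_percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Illustrative latencies in ms (hypothetical, 13 samples like the v1 mp=25 cell).
latencies = [9_800, 11_200, 12_500, 14_000, 18_300, 21_000, 25_400,
             28_900, 31_700, 33_100, 35_600, 36_900, 38_000]

p95 = nearest_rank_percentile(latencies, 95)
puts "p95 = #{p95} ms, max = #{latencies.max} ms"
```

With 13 samples the rank is ceil(0.95 × 13) = 13, so the p95 equals the maximum; the same holds for 11 samples.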
idx_cc_walk_a/b keep v2 fast.
3. v2 sweep — sandbox5 (real data)
| max_depth | max_paths | reqs | RPS | p50 (ms) | p90 (ms) | p95 (ms) | avg (ms) |
|---|---|---|---|---|---|---|---|
| 1 (direct) | 25 | 615 | 17.5 | 404 | 586 | 702 | 445 |
| 1 (direct) | 100 | 652 | 18.6 | 401 | 549 | 609 | 421 |
| 2 (1-hop intro) | 25 | 142 | 4.0 | 1,700 | 2,668 | 5,037 | 1,952 |
| 2 (1-hop intro) | 100 | 100 | 2.8 | 2,622 | 4,478 | 4,907 | 2,811 |
| 3 (2-hop intro) | 25 | 26 | 0.58 | 13,223 | 18,411 | 19,492 | 13,020 |
| 3 (2-hop intro) | 100 | 20 | 0.39 | 19,419 | 29,179 | 29,595 | 18,903 |
| 4 (3-hop intro) | 25 | 41 | 0.76 | 6,881 | 21,387 | 22,987 | 9,307 |
| 4 (3-hop intro) | 100 | 5 | 0.08 | 14,402 | 42,364 | 50,915 | 23,107 |
Observations
- Depth 1 is essentially free. Pure UCC join with no recursion. 600–700 ms p95 at ~18 RPS sustained.
- Depth 2 is the production sweet spot. 5 s p95 — well above the original 250 ms SLO but under the 1500 ms backend timeout. Should improve as the DB buffer cache warms.
- Depth 3 doubles-to-triples latency. 20–30 s p95. Workable for "show me deeper paths" exploration, not for routine reads.
- Depth 4 has wide variance. 23–50 s p95 with max_paths=100 and small samples. The fanout cap (MAX_EDGES_PER_HOP=25) is doing its job (no >60 s timeouts), but the absolute latency is risky.
4. v2 sweep — local (synthetic Perf collection)
| max_depth | max_paths | reqs | RPS | p50 (ms) | p90 (ms) | p95 (ms) | avg (ms) |
|---|---|---|---|---|---|---|---|
| 1 | 25 | 258 | 7.4 | 1,319 | 1,358 | 1,385 | 1,091 |
| 1 | 100 | 260 | 7.4 | 1,309 | 1,413 | 1,446 | 1,085 |
| 2 | 25 | 161 | 4.6 | 1,983 | 2,463 | 2,627 | 1,772 |
| 2 | 100 | 166 | 4.7 | 1,998 | 2,278 | 2,572 | 1,718 |
| 3 | 25 | 48 | 1.1 | 8,289 | 9,793 | 10,036 | 6,810 |
| 3 | 100 | 39 | 0.8 | 10,223 | 14,406 | 15,492 | 9,190 |
| 4 | 25 | 32 | 0.7 | 12,584 | 13,966 | 14,460 | 10,611 |
| 4 | 100 | 18 | 0.3 | 24,422 | 32,847 | 33,012 | 22,023 |
5. Implications for production rollout
- v2 is non-negotiable. v1's 38 s p95 on sandbox5 would saturate Puma workers and page within minutes of real traffic. Roll out via the relationship_strength feature flag, flipping per collection as v2 perf is verified.
- Depth=2 is the production contract. Matches legacy semantics and lands ~5 s p95 on real data. ANALYZE after backfill + buffer-cache warming should push this closer to 2–3 s.
- Depth=3 is acceptable for exploratory surfaces (playground, admin tools) but not for tab-loading reads. ~20–30 s p95.
- Depth=4 is research-grade. Variable, sometimes 50 s. Keep behind explicit user action ("show deeper paths"), not on default loads.
- max_paths matters far less than max_depth. At depth=2, max_paths=25 vs 100 are nearly identical (5.0 vs 4.9 s p95). The expensive work is the walk, not the result sort.
6. How to reproduce
Prereqs
- Docker dev environment up (make up)
- k6 installed (brew install k6)
- For sandbox5: a Bearer JWT from a confirmed user with shared UCC access in collection 728 (use admin-portal devtools snippet in the playground)
Local sweep
VUS=10 DURATION=20s ./backend/load/sweep.sh
# outputs:
# backend/load/results/local/sweep-summary.csv
# backend/load/results/local/sweep-summary.md
# backend/load/results/local/raw/{impl}_d{depth}_p{paths}.json
Sandbox5 sweep
ENV=sandbox5 TOKEN=eyJ... VUS=10 DURATION=20s ./backend/load/sweep.sh
# outputs go to backend/load/results/sandbox5/
Narrow sweep
DEPTHS="2,3" PATHS_SWEEP="100" IMPLS="v2" ./backend/load/sweep.sh
Single-cell comparison (v1 vs v2)
./backend/load/compare_v1_v2.sh
7. Caveats
- Sandbox5 v1 cells have tiny sample sizes (11–13 reqs / 20s) because each request takes 30+ seconds. p95 is "the slowest of the few we got". Production load would be far worse.
- Buffer cache state varies between runs. First requests after deploy/restart pay disk I/O; warm cache is much faster. Sandbox5 v2 depth=1 numbers (600 ms) likely benefit from warm cache.
- Token expiry: 15-min TTL. Long sweeps need a fresh token midway, or re-mint between cells.
- Rails dev mode = 1 Puma worker. Local sweep results are good for testing the code path but not for absolute latency.
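For the token-expiry caveat, a pre-flight check before each cell can tell whether the token will outlive the run. The following is a sketch under assumptions: it presumes the sandbox5 Bearer token is a standard JWT carrying an `exp` claim (seconds since epoch), it reads the payload without verifying the signature, and both method names are hypothetical rather than part of sweep.sh.

```ruby
require "base64"
require "json"

# Seconds until the JWT's `exp` claim; negative if already expired.
# Decodes the payload only. It does NOT verify the signature.
def jwt_seconds_left(token, now: Time.now)
  payload_segment = token.split(".")[1]
  raise ArgumentError, "token does not look like a JWT" if payload_segment.nil?

  payload = JSON.parse(Base64.urlsafe_decode64(payload_segment))
  exp = payload.fetch("exp") # assumption: the token carries a standard exp claim
  exp - now.to_i
end

# Hypothetical helper: true when a cell lasting cell_seconds would outlive the token.
def needs_fresh_token?(token, cell_seconds:, now: Time.now)
  jwt_seconds_left(token, now: now) < cell_seconds
end

# Usage: only runs when a TOKEN is present in the environment.
if ENV["TOKEN"]
  puts "token valid for #{jwt_seconds_left(ENV['TOKEN'])} more seconds"
end
```

With a 15-minute TTL and 20 s cells, a check like this before each cell is cheap and avoids a sweep that silently turns into a run of auth failures halfway through.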
Cross-references
- Single-request perf benchmark: performance.html
- Backfill scale & prod planning: backfill-scale.html
- Interactive playground: playground.html
- Source: backend/load/sweep.sh, backend/load/connection_paths.k6.js, backend/load/compare_v1_v2.sh
- v2 service: backend/app/services/contacts/connection_paths/finder_v2.rb
- Routing controller: backend/app/controllers/api/v2/collections/contacts/connection_paths_controller.rb
10. Optimization journey (phases)
Iterative perf work. After each phase we re-run the sweep on local + sandbox5 and capture the delta. The numbers at the top of this doc are Phase 0 (baseline) — uniform MAX_EDGES_PER_HOP=25. Each phase below replaces or adds to the baseline.
Phase 1 — Lower per-hop edge cap (MAX_EDGES_PER_HOP: 25 → 15)
What changed: Recursive walk now follows at most 15 most-recent edges per hop instead of 25. Bounds worst-case walk rows at depth 3 from 15,625 → 3,375 (4.6× less work).
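The worst-case bound quoted above is cap^depth, since each hop can fan out to at most `cap` edges. A small sketch of that arithmetic (the method name is hypothetical; the caps and the depth-3 figures are from this document):

```ruby
# Worst-case rows visited by the recursive walk when each hop follows at
# most `cap` edges: cap multiplied by itself once per hop.
def worst_case_walk_rows(cap, depth)
  cap**depth
end

(1..4).each do |depth|
  before = worst_case_walk_rows(25, depth)
  after  = worst_case_walk_rows(15, depth)
  ratio  = (before.to_f / after).round(1)
  puts "depth=#{depth}: #{before} -> #{after} rows (#{ratio}x less)"
end
```

This is an upper bound only; real walks visit fewer rows whenever a contact has fewer edges than the cap.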
The intended cap was tiered, a CASE WHEN b.depth = 0 THEN 25 ELSE 10 END LIMIT, but Postgres rejects CASE expressions in LIMIT that reference outer-row columns from a lateral subquery. Uniform 15 is the simplest workable approximation.
Sandbox5 sweep — Phase 0 vs Phase 1 (real data)
| cell | Phase 0 p95 | Phase 1 p95 | Δ | RPS Δ |
|---|---|---|---|---|
| v1 d=2 mp=25 | 38,088 ms | 44,958 ms | worse (small sample) | 0.27 → 0.20 |
| v1 d=2 mp=100 | 33,541 ms | 35,192 ms | similar | 0.17 → 0.33 |
| v2 d=1 mp=25 | 702 ms | 841 ms | similar | 17.5 → 16.6 |
| v2 d=1 mp=100 | 609 ms | 524 ms | −14% | 18.6 → 22.7 (+22%) |
| v2 d=2 mp=25 | 5,037 ms | 2,001 ms | −60% (2.5× faster) | 4.0 → 6.8 (+70%) |
| v2 d=2 mp=100 | 4,907 ms | 4,607 ms | −6% | 2.8 → 3.9 |
| v2 d=3 mp=25 | 19,492 ms | 17,257 ms | −11% | 0.58 → 0.79 |
| v2 d=3 mp=100 | 29,595 ms | 26,955 ms | −9% | 0.39 → 0.46 |
| v2 d=4 mp=25 | 22,987 ms | 21,867 ms | similar | 0.76 → 0.59 |
| v2 d=4 mp=100 | 50,915 ms | 45,593 ms | −10% | 0.08 → 0.29 |
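For reference, the Δ and speedup columns in the phase tables follow from the p95 and RPS values as below. This is a sketch with hypothetical helper names; the inputs are taken from the table above.

```ruby
# Percentage change from before to after, rounded to a whole percent.
# Negative means the metric went down (good for latency, bad for RPS).
def delta_pct(before, after)
  ((after - before) / before.to_f * 100).round
end

# How many times faster `after` is than `before`, to one decimal place.
def speedup(before, after)
  (before / after.to_f).round(1)
end

# v2 d=2 mp=25, Phase 0 vs Phase 1
puts "p95: #{delta_pct(5_037, 2_001)}% (#{speedup(5_037, 2_001)}x faster)"
puts "RPS: +#{delta_pct(4.0, 6.8)}%"
```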
Local sweep skipped from the analysis — Rails dev mode (1 Puma worker, cold caches, background-process noise) makes cap-tuning numbers too volatile to read. See full local CSV at backend/load/results/local/sweep-summary.csv for the data, but trust sandbox5 for the verdict.
Commit: e1ddc72845 — deep_finder: lower MAX_EDGES_PER_HOP 25 → 15.
Phase 2 — Broaden in-network UCC partial index to all NETWORK_SOURCES (shipped)
A bidirectional walk would, in theory, cut the walk from 15^N to 2 × 15^(N/2). In practice it doesn't work for seed_scope=:any (the production default) because the seed side isn't bounded by 15; it's bounded by the collection's curated-contact count. Sandbox5 collection 728 has 459,300 distinct curated contacts across NETWORK_SOURCES; the forward walk is therefore O(S × 15^f) where S ≈ 459k, not O(15^f). The "30× win at depth=3" complexity model assumed S ≈ 15, which is off by four orders of magnitude.
Replacing EXISTS with IN (curated) also failed. WITH MATERIALIZED forced a 459k-row scan inside every recursive lateral iteration: depth=4 went from 2.4 s to 141 s (~60× slower) on target 3083. The materialized hash isn't reusable across lateral iterations the way a btree-index probe is.
What actually shipped: the existing partial index index_ucc_in_network_membership was scoped to source = 5 (shared) only. Commit c02cafdd72 had broadened DeepFinder's in-network filter to source IN (0, 1, 2, 3, 5, 10) weeks earlier, but the partial predicate was never updated. With the predicates out of sync the planner couldn't use the partial index and fell back to index_user_contact_collections_on_contact_id (wide, single column) plus an in-row filter, costing one heap fetch per probe.
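The predicate mismatch described above can be modeled as a set-inclusion check. This is a deliberately simplified toy, not how the Postgres planner proves implication: it looks only at the source values, ignores the user_id IS NOT NULL part of the predicate, and the method name is hypothetical.

```ruby
require "set"

# Toy model: a partial index whose predicate is `source IN (index_sources)`
# can serve a query filtering `source IN (query_sources)` only when every
# source the query may match is also covered by the index predicate.
def partial_index_usable?(query_sources, index_sources)
  query_sources.to_set.subset?(index_sources.to_set)
end

NETWORK_SOURCES = [0, 1, 2, 3, 5, 10].freeze

# Before: index scoped to source = 5 only, query asks for all NETWORK_SOURCES.
puts partial_index_usable?(NETWORK_SOURCES, [5])
# After: index predicate broadened to the same list as the query.
puts partial_index_usable?(NETWORK_SOURCES, NETWORK_SOURCES)
```

A check of this shape could also serve as a guard in a test, so that a future change to the query's source list fails loudly instead of silently losing the index.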
The fix is a one-line predicate broadening:
# db/migrate/20260515121009_broaden_in_network_membership_index_to_network_sources.rb
remove_index :user_contact_collections, name: 'index_ucc_in_network_membership', algorithm: :concurrently
add_index :user_contact_collections, [:contact_id, :collection_id],
name: 'index_ucc_in_network_membership',
where: 'source IN (0, 1, 2, 3, 5, 10) AND user_id IS NOT NULL',
algorithm: :concurrently
No SQL change. DeepFinder is unchanged; the planner picks the broadened partial automatically once source IN (...) is subsumed by the partial WHERE. Confirmed via EXPLAIN ANALYZE on six sandbox5 targets — every plan now reads index_ucc_in_network_membership.
Sandbox5 single-shot EXPLAIN ANALYZE at depth=4 (cold cache)
| target (edges in graph) | Phase 1 (wide idx) | Phase 2 (partial idx) | speedup |
|---|---|---|---|
| 3083 (15,307) | 2,399 ms | 1,121 ms | 2.1× |
| 14080 (11,460) | 524 ms | 121 ms | 4.3× |
| 23352 (13,319) | 1,632 ms | 894 ms | 1.8× |
| 84573 (225) | 1,137 ms | 863 ms | 1.3× |
| 372722 (41) | 2,280 ms | 893 ms | 2.6× |
| 158317 (151) | 929 ms | 516 ms | 1.8× |
Sandbox5 sweep — Phase 1 vs Phase 2 (10 VUs · 20 s · 6 contacts)
| cell | Phase 1 p95 | Phase 2 p95 | Δ | RPS Δ |
|---|---|---|---|---|
| v2 d=1 mp=100 | 524 ms | 400 ms | −24% | 22.7 → 26.9 |
| v2 d=2 mp=25 | 2,001 ms | 1,976 ms | similar (saturated) | 6.8 → 7.0 |
| v2 d=2 mp=100 | 4,607 ms | 2,015 ms | −56% (2.3× faster) | 3.9 → 7.5 |
| v2 d=3 mp=25 | 17,257 ms | 9,431 ms | −45% (1.8× faster) | 0.79 → 1.17 |
| v2 d=3 mp=100 | 26,955 ms | 18,483 ms | −31% | 0.46 → 0.76 |
| v2 d=4 mp=25 | 21,867 ms | 14,600 ms | −33% | 0.59 → 0.86 |
| v2 d=4 mp=100 | 45,593 ms | 23,528 ms | −48% (1.9× faster) | 0.29 → 0.55 |
| v1 d=2 mp=25 | 44,958 ms | 29,456 ms | −34% | 0.20 → 0.43 |
[…] deep_finder.timeout or hitting per-app-server resource limits. Depth=4 is still not production-ready in absolute terms. The dominant cost is no longer index-seek time; it's the sheer number of EXISTS probes (~100k at depth=4) and final-sort over the result set. Further wins at depth 4 likely require a different attack: precomputed 2-hop adjacency, Redis cache, or product-side decision to cap at depth 3.
Commit: 0a89e1f734 — ucc: broaden in_network_membership partial to NETWORK_SOURCES.
Phase 3 — Reverse the UCC partial-index column order (shipped)
Hypothesis: Phase 2's partial is (contact_id, collection_id) — fine for "is this contact in any collection", but our actual access pattern is the opposite. DeepFinder's recursive lateral does ~100k EXISTS probes per request, all with the same fixed collection_id and ~100k different contact_ids. Putting collection_id first in the composite means every row for collection 728 lives in a contiguous btree slice. Once that slice is in the buffer pool, every probe within the request is a hot-cache hit.
Shape: a sibling partial index with reversed column order. Same WHERE clause. Old index kept in place — planner picks whichever wins on stats.
# db/migrate/20260515160233_add_ucc_collection_contact_in_network_index.rb
add_index :user_contact_collections, [:collection_id, :contact_id],
name: 'index_ucc_collection_contact_in_network',
where: 'source IN (0, 1, 2, 3, 5, 10) AND user_id IS NOT NULL',
algorithm: :concurrently
Original estimate was 1.5–2.5× cold-cache win. Actual win was an order of magnitude bigger — the warm-slice effect compounds across the ~3,500 lateral iterations at depth=4.
Sandbox5 single-shot EXPLAIN ANALYZE at depth=4 (warm cache, 3 runs averaged)
| target (edges) | Phase 1 (wide idx) | Phase 2 (partial idx) | Phase 3 (reversed) | Phase 3 vs Phase 2 |
|---|---|---|---|---|
| 3083 (15,307) | 2,399 ms | 1,121 ms | 48 ms | 23× |
| 14080 (11,460) | 524 ms | 121 ms | 51 ms | 2.4× |
| 23352 (13,319) | 1,632 ms | 894 ms | 53 ms | 17× |
| 84573 (225) | 1,137 ms | 863 ms | 50 ms | 17× |
| 372722 (41) | 2,280 ms | 893 ms | 43 ms | 21× |
| 158317 (151) | 929 ms | 516 ms | 58 ms | 9× |
Cold-cache on the very first run was 1–2 s (slice load). Every subsequent run is 40–60 ms — the collection's btree slice stays in the buffer cache.
Sandbox5 sweep — Phase 2 vs Phase 3 (10 VUs · 20 s · 6 contacts)
| cell | Phase 2 p95 | Phase 3 p95 | speedup | reqs | RPS |
|---|---|---|---|---|---|
| v2 d=1 mp=100 | 400 ms | 318 ms | 1.3× | 1,014 | 28.9 |
| v2 d=2 mp=25 | 1,976 ms | 329 ms | 6× | 1,015 | 28.8 |
| v2 d=2 mp=100 | 2,015 ms | 305 ms | 6.6× | 916 | 20.7 |
| v2 d=3 mp=25 | 9,431 ms | 375 ms | 25× | 961 | 27.3 |
| v2 d=3 mp=100 | 18,483 ms | 495 ms | 37× | 901 | 25.6 |
| v2 d=4 mp=25 | 14,600 ms | 341 ms | 43× | 991 | 28.2 |
| v2 d=4 mp=100 | 23,528 ms | 314 ms | 75× | 1,030 | 29.4 |
| v1 d=2 mp=25 | 29,456 ms | 327 ms | 90× | 1,015 | 28.9 |
| v1 d=2 mp=100 | 36,737 ms | 318 ms | 116× | 1,030 | 29.4 |
[…] backend/load/results/sandbox5/sweep-summary.csv (since overwritten by later sweeps). But we have not been able to reproduce these numbers on subsequent sweeps against the same commit. Identical code, same DB, freshly restarted backend container → 580 ms p95 at d=1 to 6.4 s at d=4 mp=100 (post-Phase-4) instead of 305–495 ms across the board.
Most likely cause: the original sweep landed during a window when sandbox5's backend was autoscaled to multiple containers, so the 10 k6 VUs were distributed across more Puma workers. Today's sandbox5 sits at running/desired 1/1 (one container, one Puma worker, 5 threads), and 10 concurrent VUs saturate the queue. Secondary likely contributor: the PG buffer pool was already hot for collection 728's UCC slice from prior work that day.
Neither factor is reproducible on demand from outside, and CloudWatch history at the granularity we'd need probably wasn't retained. The Phase 3 index gains are still real (DB query is 60 ms warm, confirmed by EXPLAIN ANALYZE multiple times since); what's not reproducible is the end-to-end sweep behavior under that specific cluster state. Production has multi-container autoscaling by default, so the upper-bound performance shown here is more representative of prod-shape behavior than the single-container re-benches are.
Why the win was bigger than predicted: the pre-shipping estimate assumed btree-pages-loaded-into-buffer would be a marginal effect on top of an already-cached index. In reality, the (contact_id, collection_id) partial scattered collection 728's rows across the entire btree's leaf pages (one slice per contact_id, of which there are 459k). The reversed (collection_id, contact_id) partial collapses all of collection 728 into one contiguous slice (~459k rows in a few MB of leaf pages). The buffer-pool footprint per request collapses by orders of magnitude. Locality won far harder than expected.
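The locality argument can be illustrated with a toy model of index leaf pages. This is a sketch under loud assumptions: a btree is approximated as a sorted array chopped into fixed-size pages, the page capacity and data sizes are invented, and the method name is hypothetical. It shows only the qualitative effect, not real Postgres page counts.

```ruby
ENTRIES_PER_PAGE = 100 # hypothetical leaf-page capacity

# Sort index entries by the given column order, split them into pages, and
# count how many distinct pages hold rows for one collection.
def leaf_pages_touched(entries, key_order, collection_id)
  sorted = entries.sort_by { |entry| entry.values_at(*key_order) }
  sorted.each_with_index
        .select { |entry, _pos| entry[:collection_id] == collection_id }
        .map { |_entry, pos| pos / ENTRIES_PER_PAGE }
        .uniq
        .size
end

# 1,000 contacts, each a member of 10 collections (made-up data).
entries = (1..1_000).flat_map do |contact_id|
  (1..10).map { |collection_id| { contact_id: contact_id, collection_id: collection_id } }
end

puts leaf_pages_touched(entries, %i[contact_id collection_id], 7)
puts leaf_pages_touched(entries, %i[collection_id contact_id], 7)
```

In this toy setup the contact-first order spreads collection 7 across all 100 pages, while the collection-first order packs it into 10 adjacent pages, which is the effect the reversed partial index exploits.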
The older index_ucc_in_network_membership partial is still in place; pg_stat_user_indexes after one production cycle will tell us whether it's still earning its disk. Result-set work (Ruby sort, hydrate, serialize) is now a meaningful fraction of total request time, so future optimization can focus there if the SLA budget tightens.
Commit: 0fc6718886 — ucc: reversed partial index (collection_id, contact_id) for in-network EXISTS.
Phase 4 — Two Ruby-side fixes (over-fetch bug + push sort into SQL) (shipped)
Phase 3 looked complete. Then we re-benched on a different day and the same code returned 30-second p95 numbers instead of 300 ms. Two issues compounded:
- Sandbox5 was running with one backend container (desired/running 1/1). The original Phase 3 sweep had landed during a window when sandbox5 was autoscaled to a larger fleet; that's why throughput sustained 29 RPS. With one container and 10 VUs from k6, the per-container queue depth explodes.
- A real code-level over-fetch bug in FinderV2 was multiplying response sizes 5–10× at max_depth > 2. Single-container saturation made the bug visible; multi-container autoscaling had been hiding it.
Both fixes shipped together as Phase 4.
4a. FinderV2 over-fetch bug at max_depth > 2
FinderV2#call multiplies DeepFinder's limit by OVER_FETCH_MULTIPLIER = 10 to give the legacy depth=2 sort + grouping a wider candidate pool. That logic only runs for depth=2; for depth>2 the early-return passes deep_result.paths through unchanged. But the over-fetch was still happening, so the API was returning 5–10× more paths than the user asked for.
Concretely, for max_depth=4 max_paths=100:
| stage | with bug | after fix |
|---|---|---|
| FinderV2 → DeepFinder limit | 1,000 | 100 |
| DeepFinder SQL fetch cap | 500 × 25 = 12,500 | 100 × 25 = 2,500 |
| DeepFinder hydrated paths | 500 | 100 |
| API response paths | 500 | 100 |
| JSONAPI serializer work | 5× too much | correct |
The fix is one conditional in FinderV2:
deep_limit = max_depth > 2 ? limit : limit * OVER_FETCH_MULTIPLIER
For depth>2 callers, request limit directly; for depth=2 keep the legacy over-fetch (still needed for the grouping pool).
Commit: c7dd2b24a3 — finder_v2: skip over-fetch at depth>2 (was 5-10x response bloat).
4b. Push DeepFinder sort into SQL + drop SQL_OVER_FETCH_MULTIPLIER 25→5
Originally DeepFinder over-fetched 25× the public limit so Ruby could sort the richer pool by [depth, recency, user-vantage]. That meant a max_paths=100 request parsed 2,500 PG-array rows in Ruby even though only 100 ever made it to the response.
The recency tiebreaker (max of date_to across the path, NULL treated as today) and user-vantage tiebreaker (ucc.user_id = ?) are both expressible in SQL:
ORDER BY
(b.depth + 1) ASC,
(SELECT max(COALESCE(t, CURRENT_DATE)) FROM unnest(b.to_path) AS t) DESC NULLS LAST,
CASE WHEN ucc.user_id = ? THEN 0 ELSE 1 END
LIMIT ?
With SQL doing the sort, the over-fetch drops to 5× (still needed for the truncated/total reporting), Ruby parses 5× fewer rows, no Ruby re-sort.
Commit: 561243c6a9 — deep_finder: push sort into SQL ORDER BY, drop over-fetch 25x -> 5x.
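For comparison, the ordering that the SQL ORDER BY above takes over can be written as a Ruby sort key. This is an illustrative sketch, not the removed DeepFinder code: the PathRow struct, its field names, and sort_like_sql are hypothetical, and only the three sort criteria (depth, recency with NULL date_to treated as today, user-vantage) come from this document.

```ruby
require "date"

# Hypothetical row shape: hop depth, the date_to values along the path
# (nil means "still current"), and the user who owns the UCC row.
PathRow = Struct.new(:depth, :date_tos, :ucc_user_id, keyword_init: true)

# Sort by depth ascending, then most recent path first (nil date_to counts
# as today), then rows owned by the viewing user ahead of everyone else's.
def sort_like_sql(rows, viewer_id:, today: Date.today)
  rows.sort_by do |row|
    recency = row.date_tos.map { |d| d || today }.max
    [
      row.depth + 1,
      recency ? -recency.jd : Float::INFINITY, # empty path sorts last
      row.ucc_user_id == viewer_id ? 0 : 1
    ]
  end
end

rows = [
  PathRow.new(depth: 1, date_tos: [Date.new(2020, 1, 1)], ucc_user_id: 2),
  PathRow.new(depth: 0, date_tos: [Date.new(2019, 1, 1)], ucc_user_id: 2),
  PathRow.new(depth: 1, date_tos: [nil], ucc_user_id: 1)
]
sort_like_sql(rows, viewer_id: 1).each { |row| puts row.to_a.inspect }
```

Doing the same ordering in SQL means the LIMIT applies after the sort, so Ruby never has to parse rows that cannot make the response.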
Phase 4 measured win (sandbox5, 10 VUs · 20 s · single backend container)
Same sandbox5 single-container state for all three columns; same load shape; same code path. The first column is the broken state that surfaced after the FinderV2 bug had time to compound under saturation:
| cell | Re-bench (both bugs) | After fix 4a only | After both fixes | total speedup |
|---|---|---|---|---|
| v2 d=1 mp=25 | 648 ms | 814 ms | 578 ms | 1.1× |
| v2 d=1 mp=100 | 561 ms | 1,147 ms | 687 ms | 0.8× (slower) |
| v2 d=2 mp=25 | 1,514 ms | 3,301 ms | 2,339 ms | 0.6× (slower) |
| v2 d=2 mp=100 | 4,497 ms | 4,338 ms | 3,635 ms | 1.2× |
| v2 d=3 mp=25 | 9,572 ms | 2,416 ms | 2,086 ms | 4.6× |
| v2 d=3 mp=100 | 24,116 ms | 6,724 ms | 7,772 ms | 3.1× |
| v2 d=4 mp=25 | 20,990 ms | 3,635 ms | 2,263 ms | 9.3× |
| v2 d=4 mp=100 | 24,925 ms | 7,924 ms | 6,403 ms | 3.9× |
RPS climbed proportionally:
- v2 d=4 mp=25: 0.71 → 6.44 RPS (9×)
- v2 d=4 mp=100: 0.22 → 2.04 RPS (9×)