feat(zero-cache): 2x faster initial sync #5704

Merged
tantaman merged 17 commits into main from mlaw/faster-initial on Apr 8, 2026

@tantaman tantaman commented Mar 27, 2026

Switches the initial copy from PG to the binary COPY format instead of the text COPY format.

  ┌──────────────┬──────────────────┬───────────────────────────┬─────────┐                                                 
  │              │ Main (text COPY) │ This branch (binary COPY) │ Speedup │                                                 
  ├──────────────┼──────────────────┼───────────────────────────┼─────────┤                                                 
  │ Total time   │ 19.45s           │ 9.86s                     │ 1.97x   │                                                 
  ├──────────────┼──────────────────┼───────────────────────────┼─────────┤                                                 
  │ Rate         │ 142.0K rows/s    │ 280.0K rows/s             │ 1.97x   │
  ├──────────────┼──────────────────┼───────────────────────────┼─────────┤                                                 
  │ Replica size │ 1256.9 MB        │ 1256.9 MB                 │ same    │
  └──────────────┴──────────────────┴───────────────────────────┴─────────┘   
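
For readers unfamiliar with what a binary COPY stream looks like on the consumer side, here is a minimal sketch of a decoder for the documented PostgreSQL binary COPY layout (11-byte signature, flags, header extension, then length-prefixed fields per tuple). It is not the code from this PR: `parseBinaryCopy` is a hypothetical name, and the sketch assumes the whole stream is already buffered in memory, whereas a real consumer would have to handle chunk boundaries and decode each field according to its column type.

```typescript
// Signature that opens every binary COPY stream: "PGCOPY\n\xff\r\n\0".
const PGCOPY_SIGNATURE = new Uint8Array([
  0x50, 0x47, 0x43, 0x4f, 0x50, 0x59, 0x0a, 0xff, 0x0d, 0x0a, 0x00,
]);

// Hypothetical helper (not from the PR): splits a fully buffered binary COPY
// stream into rows of raw field bytes. A NULL column is returned as null.
function parseBinaryCopy(buf: Uint8Array): (Uint8Array | null)[][] {
  for (let i = 0; i < PGCOPY_SIGNATURE.length; i++) {
    if (buf[i] !== PGCOPY_SIGNATURE[i]) {
      throw new Error('not a binary COPY stream');
    }
  }
  // DataView reads big-endian by default, which matches network byte order.
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  let pos = PGCOPY_SIGNATURE.length;
  pos += 4; // 32-bit flags field, ignored in this sketch
  const extensionLength = view.getInt32(pos);
  pos += 4 + extensionLength; // skip the header extension area

  const rows: (Uint8Array | null)[][] = [];
  for (;;) {
    const fieldCount = view.getInt16(pos);
    pos += 2;
    if (fieldCount === -1) break; // file trailer
    const row: (Uint8Array | null)[] = [];
    for (let f = 0; f < fieldCount; f++) {
      const length = view.getInt32(pos);
      pos += 4;
      if (length === -1) {
        row.push(null); // NULL has no value bytes
      } else {
        // No escaping to undo: the value is exactly `length` bytes.
        row.push(buf.subarray(pos, pos + length));
        pos += length;
      }
    }
    rows.push(row);
  }
  return rows;
}
```

The point of the sketch is that field boundaries come from length prefixes, so the reader never has to inspect the value bytes to find where a field ends.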

github-actions Bot commented Mar 27, 2026

🐰 Bencher Report

Branch: mlaw/faster-initial
Testbed: Linux

| Benchmark | File Size (KB) | Result Δ% | Baseline (KB) | Upper Boundary (KB) | Limit % |
|---|---|---|---|---|---|
| zero-package.tgz | 1,963.09 | +0.74% | 1,948.69 | 1,987.67 | 98.76% |
| zero.js | 269.36 | -0.02% | 269.40 | 274.79 | 98.02% |
| zero.js.br | 71.45 | +0.06% | 71.41 | 72.84 | 98.10% |

github-actions Bot commented Mar 27, 2026

🐰 Bencher Report

Branch: mlaw/faster-initial
Testbed: self-hosted-metal

| Benchmark | Throughput (ops/s) | Result Δ% | Baseline (ops/s) | Lower Boundary (ops/s) | Limit % |
|---|---|---|---|---|---|
| src/db/pg-copy.bench.ts > pg-copy benchmark > copy | 24.76 | +4.93% | 23.60 | 22.80 | 92.11% |

github-actions Bot commented Mar 27, 2026

🐰 Bencher Report

Branch: mlaw/faster-initial
Testbed: self-hosted-metal

| Benchmark | Throughput (ops/s x 1e3) | Result Δ% | Baseline (ops/s x 1e3) | Lower Boundary (ops/s x 1e3) | Limit % |
|---|---|---|---|---|---|
| src/client/custom.bench.ts > big schema | 117.27 | +7.84% | 108.75 | 95.86 | 81.74% |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 1.51 | +0.03% | 1.51 | 1.06 | 69.93% |
| src/client/zero.bench.ts > pk compare > pk = N | 37.82 | -0.06% | 37.85 | 36.26 | 95.87% |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 1.89 | -8.24% | 2.06 | 1.70 | 89.75% |

github-actions Bot commented Mar 27, 2026

🐰 Bencher Report

Branch: mlaw/faster-initial
Testbed: self-hosted-metal

⚠️ WARNING: Truncated view! The full continuous benchmarking report exceeds the maximum length allowed on this platform.

🚨 1 Alert

github-actions Bot commented Mar 27, 2026

🐰 Bencher Report

Branch: mlaw/faster-initial
Testbed: self-hosted-metal

🚨 3 Alerts

| Benchmark | Measure | Result (ops/s x 1e6) | Result Δ% | Baseline (ops/s x 1e6) | Lower Boundary (ops/s x 1e6) | Limit % |
|---|---|---|---|---|---|---|
| src/btree-set.bench.ts > BTreeSet lookups > get() hit | Throughput | 8.76 | -9.67% | 9.70 | 9.12 | 104.08% |
| src/btree-set.bench.ts > BTreeSet lookups > has() hit | Throughput | 8.79 | -9.36% | 9.70 | 8.95 | 101.79% |
| src/btree-set.bench.ts > BTreeSet lookups > has() miss | Throughput | 12.18 | -15.98% | 14.50 | 14.11 | 115.83% |

All benchmark results (rows marked 🚨 triggered an alert):

| Benchmark | Throughput (ops/s) | Result Δ% | Baseline (ops/s) | Lower Boundary (ops/s) | Limit % |
|---|---|---|---|---|---|
| src/btree-set.bench.ts > BTreeSet iterator next() in isolation > forward iterator next() | 143,710.65 | +1.22% | 141,980.12 | 128,000.48 | 89.07% |
| src/btree-set.bench.ts > BTreeSet iterator next() in isolation > forward iterator next() from mid | 280,828.44 | +0.22% | 280,223.88 | 234,064.91 | 83.35% |
| src/btree-set.bench.ts > BTreeSet iterator next() in isolation > reverse iterator next() | 134,292.77 | -8.18% | 146,264.03 | 132,267.69 | 98.49% |
| src/btree-set.bench.ts > BTreeSet iterator next() in isolation > reverse iterator next() from mid | 277,386.94 | -0.92% | 279,968.15 | 226,656.70 | 81.71% |
| src/btree-set.bench.ts > BTreeSet iterators > `[Symbol.iterator]()` full scan | 164,988.98 | +4.04% | 158,581.29 | 123,253.38 | 74.70% |
| src/btree-set.bench.ts > BTreeSet iterators > values() full scan | 182,384.20 | -2.59% | 187,233.28 | 179,762.63 | 98.56% |
| src/btree-set.bench.ts > BTreeSet iterators > valuesFrom() from mid | 374,337.93 | +7.04% | 349,713.35 | 289,241.90 | 77.27% |
| src/btree-set.bench.ts > BTreeSet iterators > valuesFromReversed() from mid | 320,399.75 | +1.73% | 314,960.17 | 275,209.74 | 85.90% |
| src/btree-set.bench.ts > BTreeSet iterators > valuesReversed() full scan | 187,849.77 | +6.13% | 177,001.46 | 146,816.53 | 78.16% |
| 🚨 src/btree-set.bench.ts > BTreeSet lookups > get() hit | 8,762,562.07 | -9.67% | 9,700,379.33 | 9,120,244.59 | 104.08% |
| 🚨 src/btree-set.bench.ts > BTreeSet lookups > has() hit | 8,791,066.91 | -9.36% | 9,699,087.61 | 8,948,218.59 | 101.79% |
| 🚨 src/btree-set.bench.ts > BTreeSet lookups > has() miss | 12,179,480.37 | -15.98% | 14,495,548.66 | 14,107,577.28 | 115.83% |
| src/btree-set.bench.ts > BTreeSet mutations > add() 100 sequential keys | 126,672.15 | +1.25% | 125,111.46 | 114,748.95 | 90.59% |
| src/btree-set.bench.ts > BTreeSet mutations > add() 1000 sequential keys | 11,826.07 | +0.55% | 11,761.14 | 11,151.38 | 94.29% |
| src/btree-set.bench.ts > BTreeSet mutations > add() then delete() single key | 7,099,424.74 | +0.05% | 7,095,536.65 | 6,265,966.51 | 88.26% |
| src/size-of-value.bench.ts > getSizeOfValue performance > arrays > large array (100 items) | 1,529,803.76 | +2.83% | 1,487,714.17 | 1,337,694.86 | 87.44% |
| src/size-of-value.bench.ts > getSizeOfValue performance > arrays > small array (10 items) | 12,808,242.62 | +5.00% | 12,198,516.54 | 10,204,925.07 | 79.67% |
| src/size-of-value.bench.ts > getSizeOfValue performance > datasets > large dataset (100x512B) | 39,544.04 | +5.34% | 37,539.61 | 32,365.14 | 81.85% |
| src/size-of-value.bench.ts > getSizeOfValue performance > datasets > small dataset (10x256B) | 393,486.16 | +4.02% | 378,290.11 | 323,721.04 | 82.27% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > nested object | 3,443,162.57 | +5.26% | 3,271,114.74 | 2,958,954.29 | 85.94% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > structured object (1KB) | 3,964,747.40 | +5.34% | 3,763,807.98 | 3,267,332.31 | 82.41% |
| src/size-of-value.bench.ts > getSizeOfValue performance > objects > structured object (256B) | 3,935,178.07 | +4.76% | 3,756,219.33 | 3,276,421.78 | 83.26% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > boolean | 72,397,961.07 | +5.61% | 68,554,440.80 | 54,664,259.99 | 75.51% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > integer | 73,797,140.03 | +7.80% | 68,455,110.17 | 54,060,688.49 | 73.26% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > null | 73,406,145.88 | +7.71% | 68,149,592.75 | 54,530,370.41 | 74.29% |
| src/size-of-value.bench.ts > getSizeOfValue performance > primitives > string (100 chars) | 804,265.21 | +2.64% | 783,613.98 | 746,687.63 | 92.84% |
| src/tdigest.bench.ts > TDigest Benchmarks > add | 2.44 | +2.49% | 2.38 | 2.28 | 93.32% |
| src/tdigest.bench.ts > TDigest Benchmarks > addCentroid | 1.93 | +2.92% | 1.87 | 1.81 | 93.91% |
| src/tdigest.bench.ts > TDigest Benchmarks > addCentroidList | 1.89 | -0.45% | 1.90 | 1.85 | 97.92% |
| src/tdigest.bench.ts > TDigest Benchmarks > merge > addCentroid | 13,792.74 | -44.18% | 24,707.68 | 1,538.16 | 11.15% |
| src/tdigest.bench.ts > TDigest Benchmarks > merge > merge | 16,907.22 | -34.59% | 25,848.95 | 4,404.46 | 26.05% |
| src/tdigest.bench.ts > TDigest Benchmarks > quantile | 2.36 | +0.29% | 2.35 | 2.28 | 96.89% |

tantaman commented Apr 8, 2026

> I would someday like to understand what makes this 2x faster ... is it the amount of bytes streamed over the wire, which is a win for number-heavy tables, or something else? I feel like strings are the same size and get UTF-8 encoded in either format.

@darkgnotic - My theory is that it is faster for PG to get the data out via the binary format (https://github.com/postgres/postgres/blob/f8eec1ced6979157cbf517b5bbf617b82a01a397/src/backend/commands/copyto.c#L1421-L1450). IIUC, text copy has to scan every byte coming out to do proper escaping. Binary copy allows PG to just dump the raw bytes to the socket. This would mean that it wasn't the network that was the problem but the time to get bytes out of PG, which would look like the network to an external consumer. WDYT?

I think this also makes sense since gigabugs is only 1GB so it should transfer within 1-2 seconds, not the 10-20 seconds we see it take.

Measuring the data over the wire, the binary format is actually slightly larger for gigabugs than text. 1,000 MB vs 967 MB
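
The numbers above can be turned into effective throughput with a quick calculation. This is only a back-of-the-envelope sketch: it assumes the wire sizes quoted here (967 MB text, 1,000 MB binary) belong to the same runs as the timings in the PR description (19.45 s and 9.86 s), which the thread does not state explicitly. `throughputMBps` is a hypothetical helper name.

```typescript
// Effective end-to-end throughput in MB/s for a copy of a given size.
function throughputMBps(megabytes: number, seconds: number): number {
  return megabytes / seconds;
}

// Assumed pairing of wire sizes with the timings from the PR description.
const textCopyMBps = throughputMBps(967, 19.45);
const binaryCopyMBps = throughputMBps(1000, 9.86);

// A link that moves ~1 GB in 1-2 seconds would sustain 500-1000 MB/s.
const linkLowMBps = throughputMBps(1000, 2);
const linkHighMBps = throughputMBps(1000, 1);

console.log(`text COPY:   ${textCopyMBps.toFixed(1)} MB/s`); // 49.7 MB/s
console.log(`binary COPY: ${binaryCopyMBps.toFixed(1)} MB/s`); // 101.4 MB/s
console.log(`link estimate: ${linkLowMBps}-${linkHighMBps} MB/s`); // 500-1000 MB/s
```

Under those assumptions both formats run well below the estimated link capacity, and binary is faster despite sending slightly more bytes, which is consistent with the bottleneck being somewhere other than the network.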

@darkgnotic

> I would someday like to understand what makes this 2x faster ... is it the amount of bytes streamed over the wire, which is a win for number-heavy tables, or something else? I feel like strings are the same size and get UTF-8 encoded in either format.
>
> @darkgnotic - My theory is that it is faster for PG to get the data out via the binary format (https://github.com/postgres/postgres/blob/f8eec1ced6979157cbf517b5bbf617b82a01a397/src/backend/commands/copyto.c#L1421-L1450). IIUC, text copy has to scan every byte coming out to do proper escaping. Binary copy allows PG to just dump the raw bytes to the socket. This would mean that it wasn't the network that was the problem but the time to get bytes out of PG, which would look like the network to an external consumer. WDYT?

Ah, this would make sense. The escaping probably costs us a bit of time on the parsing side too.
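
To illustrate the parsing-side cost mentioned here, below is a minimal sketch of decoding one line of text-format COPY output. It is not the parser used in zero-cache; `unescapeTextField` and `parseTextCopyLine` are hypothetical names. It handles the escapes that COPY TO emits in text format with default options (`\b`, `\f`, `\n`, `\r`, `\t`, `\v`, `\\`, and `\N` for NULL) and assumes the default tab delimiter.

```typescript
// Characters that follow a backslash in text COPY output and their meaning.
const COPY_TEXT_ESCAPES: Record<string, string> = {
  b: '\b',
  f: '\f',
  n: '\n',
  r: '\r',
  t: '\t',
  v: '\v',
};

// Hypothetical helper: decodes a single raw field. "\N" alone means NULL.
function unescapeTextField(raw: string): string | null {
  if (raw === '\\N') return null;
  if (raw.indexOf('\\') === -1) return raw; // fast path, but still a full scan
  let out = '';
  for (let i = 0; i < raw.length; i++) {
    const ch = raw.charAt(i);
    if (ch === '\\' && i + 1 < raw.length) {
      i++;
      const next = raw.charAt(i);
      // Unknown escapes stand for the character itself (covers "\\").
      out += COPY_TEXT_ESCAPES[next] ?? next;
    } else {
      out += ch;
    }
  }
  return out;
}

// Tabs inside values are escaped as "\t", so a raw tab is always a delimiter.
function parseTextCopyLine(line: string): (string | null)[] {
  return line.split('\t').map(unescapeTextField);
}
```

Even on the fast path every character of every field is examined at least once to look for delimiters and backslashes, whereas a length-prefixed binary field can be sliced out without inspecting its contents.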

> I think this also makes sense since gigabugs is only 1GB so it should transfer within 1-2 seconds, not the 10-20 seconds we see it take.
>
> Measuring the data over the wire, the binary format is actually slightly larger for gigabugs than text. 1,000 MB vs 967 MB

Yeah, that is what dissuaded me from pursuing this option way back in the day. But I didn't do the due diligence of actually benchmarking it end-to-end.

Thank you!

@tantaman tantaman added this pull request to the merge queue Apr 8, 2026
Merged via the queue into main with commit 7ac04e3 Apr 8, 2026
24 of 26 checks passed
@tantaman tantaman deleted the mlaw/faster-initial branch April 8, 2026 18:53
arv pushed a commit that referenced this pull request Apr 17, 2026