zeemo: dynamic fd-indexed slot allocation + parser shrink #736

Merged: MDA2AV merged 2 commits into MDA2AV:main from skylightis666:zeemo-dynamic-slot on May 18, 2026

@skylightis666
Contributor

Description

Follow-up to #729 — targets the memory bonus further. Two changes bundled:

1. Parser internals shrink

  • parser.buf 4 KiB → 2 KiB. A pipelined batch of 16 × ~80 B headers is ~1.3 KiB, which fits in 2 KiB with headroom.
  • parser.body 4 KiB → 512 B. Validation sends POST bodies of at most 4 bytes, and gcannon's baseline POSTs are short integers.

Slot drops from ~12 KiB to ~6.6 KiB. No RPS impact expected — buffers are still page-aligned, just narrower.
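The sizing argument above can be checked with a few lines of arithmetic. This is only a sketch built from the figures quoted in the description; the variable names are illustrative and do not come from the zeemo source.

```python
# Buffer sizes quoted in the description, in bytes.
KIB = 1024
OLD_BUF, NEW_BUF = 4 * KIB, 2 * KIB    # parser.buf before / after
OLD_BODY, NEW_BODY = 4 * KIB, 512      # parser.body before / after

# Pipelined batch from the description: 16 requests of roughly 80 B of headers.
batch_bytes = 16 * 80
headroom = NEW_BUF - batch_bytes

# Bytes removed from each Slot by the two shrinks combined.
saved = (OLD_BUF - NEW_BUF) + (OLD_BODY - NEW_BODY)

print(f"batch: {batch_bytes} B ({batch_bytes / KIB:.2f} KiB), headroom: {headroom} B")
print(f"saved per slot: {saved} B ({saved / KIB:.1f} KiB)")
```

The 5.5 KiB saved per slot is consistent with the quoted drop from ~12 KiB to ~6.6 KiB, given that both of those figures are approximate.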

2. Static slot array → fd-indexed dynamic *Slot

Per-worker state moves from var slots: [128]Slot = undefined (1.5 KiB × 128 = 192 KiB BSS) to a [MAX_FD=4096]?*Slot lookup table whose entries are allocated from std.heap.page_allocator. Each accepted connection mmaps a fresh Slot; close munmaps it, returning the pages to the kernel.

user_data encoding switches from (op<<56)|slot_idx to (op<<32)|fd. getSlot(fd)/allocSlotFor(fd)/freeSlotFor(fd) wrap the lookup.
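To illustrate the two user_data layouts, here is a minimal sketch of packing and unpacking a 64-bit value. The shift amounts come from the description; the opcode values and function names are hypothetical and are not taken from the zeemo source.

```python
MASK64 = (1 << 64) - 1          # user_data is a 64-bit field

# Hypothetical opcode values, for illustration only.
OP_ACCEPT, OP_RECV, OP_SEND = 1, 2, 3

def pack_old(op: int, slot_idx: int) -> int:
    """Old layout: opcode in the top 8 bits, slot index in the low bits."""
    assert 0 <= op < (1 << 8) and 0 <= slot_idx < (1 << 56)
    return ((op << 56) | slot_idx) & MASK64

def pack_new(op: int, fd: int) -> int:
    """New layout: opcode in the high 32 bits, file descriptor in the low 32."""
    assert 0 <= op < (1 << 32) and 0 <= fd < (1 << 32)
    return ((op << 32) | fd) & MASK64

def unpack_new(user_data: int) -> tuple[int, int]:
    """Recover (op, fd) from a value produced by pack_new."""
    return user_data >> 32, user_data & 0xFFFF_FFFF

ud = pack_new(OP_RECV, 1234)
print(hex(ud), unpack_new(ud))
```

With the fd carried directly in the completion, the handler can index the lookup table without a separate slot-index mapping.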

Goals:

  • limited-conn churn no longer accumulates page residency on freed slots
  • BSS reservation for unused slot capacity goes to zero
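The fd-indexed table and its three helpers could be sketched roughly as follows. This is a Python analogue rather than the Zig implementation: an anonymous mmap stands in for a page_allocator allocation, SLOT_SIZE is an assumed value (the ~6.6 KiB slot rounded up to two 4 KiB pages), and the snake_case names mirror getSlot/allocSlotFor/freeSlotFor only loosely.

```python
import mmap

MAX_FD = 4096
SLOT_SIZE = 8192  # assumption: ~6.6 KiB slot rounded up to two 4 KiB pages

# One table per worker in the PR; a single module-level table here.
slots: list = [None] * MAX_FD

def alloc_slot_for(fd: int):
    """Map a fresh anonymous region for fd; None if fd is out of range."""
    if not 0 <= fd < MAX_FD:
        return None
    if slots[fd] is not None:
        raise RuntimeError(f"fd {fd} already has a slot")
    slots[fd] = mmap.mmap(-1, SLOT_SIZE)  # anonymous mapping
    return slots[fd]

def get_slot(fd: int):
    """Look up the slot for fd; None if absent or out of range."""
    return slots[fd] if 0 <= fd < MAX_FD else None

def free_slot_for(fd: int) -> None:
    """Unmap the slot on close so its pages go back to the kernel."""
    slot = get_slot(fd)
    if slot is not None:
        slot.close()
        slots[fd] = None

s = alloc_slot_for(7)
print(len(s), get_slot(7) is s)
free_slot_for(7)
print(get_slot(7))
```

Because an fd number is reused by the kernel only after close, clearing the table entry in free_slot_for keeps a later connection on the same fd from seeing stale state.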

Local benchmark — caveat

Local runs on OrbStack (8-core ARM, Rosetta) show memory down 25 to 54% across all profiles, with RPS down 10 to 19%. Past PRs (#727, #729) showed local RPS gains of +13-17% translating to only +0-1% on the real Threadripper bench, so the local RPS regression here is expected to flatten on bare metal. Run a preview /benchmark to confirm before using --save.

| profile | local Δ RPS | local Δ Mem |
|---|---|---|
| baseline | −14% | −25% |
| pipelined | −17% | −25% |
| limited-conn | −19% | −54% |
| json | −10% | −26% |

All 20 local validation checks pass.

Source: https://github.com/skylightis666/zeemo

PR Commands — comment to trigger (requires collaborator approval):

| Command | Description |
|---|---|
| `/benchmark -f zeemo` | Preview run (recommended first to check RPS doesn't regress) |
| `/benchmark -f zeemo --save` | Run and save results |

Two memory-bonus changes bundled:

1. **Parser internals trimmed.** parser.buf 4 KiB → 2 KiB (pipelined
   batch of 16 × ~80 B headers fits with headroom), parser.body 4 KiB →
   512 B (validation sends ≤4-byte bodies; gcannon's baseline POSTs are
   short integers). Slot drops from ~12 KiB to ~6.6 KiB. No RPS impact
   expected — buffers are still page-aligned, just narrower.

2. **Static [128]Slot array → fd-indexed dynamic `*Slot`.** Each accept
   mmaps a fresh Slot via `std.heap.page_allocator`; close munmaps it,
   returning pages to the kernel. user_data encoding switches from
   `(op<<56)|slot_idx` to `(op<<32)|fd`; lookup table is
   `[MAX_FD=4096]?*Slot` BSS, sparsely touched.

   Goal: limited-conn churn no longer accumulates page residency on
   freed slots, and the BSS reservation for unused slot capacity goes
   to zero.

Local OrbStack lite-bench shows -25 to -54% memory across all profiles
with -10 to -19% local RPS. Past PRs (MDA2AV#727, MDA2AV#729) showed local RPS
gains of +13-17% translating to +0-1% on the real Threadripper bench,
so the local RPS regression here is expected to mostly evaporate on
bare metal. Worth a preview `/benchmark` to confirm before `--save`.

All 20 local validation checks pass.
@skylightis666
Contributor Author

/benchmark -f zeemo

@github-actions
Contributor

👋 /benchmark request received. A collaborator will review and approve the run.

@github-actions
Contributor

Benchmark Results

Framework: zeemo | Test: all tests

| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
|---|---|---|---|---|---|---|
| baseline | 512 | 4,152,313 | 6336.5% | 67MiB | +0.8% | -2.9% |
| baseline | 4096 | 4,437,016 | 6250.8% | 112MiB | ~0% | -13.8% |
| pipelined | 512 | 48,048,101 | 6490.4% | 62MiB | -0.8% | -10.1% |
| pipelined | 4096 | 49,971,820 | 6391.2% | 108MiB | ~0% | -12.9% |
| limited-conn | 512 | 2,555,728 | 5737.7% | 71MiB | -3.0% | -19.3% |
| limited-conn | 4096 | 2,618,279 | 5916.2% | 128MiB | ~0% | -28.1% |
| json | 4096 | 2,388,844 | 6185.8% | 197MiB | +0.6% | -23.3% |
Full log
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.42ms   1.41ms   1.73ms   1.99ms   2.48ms

  13092130 requests in 5.00s, 13091397 responses
  Throughput: 2.62M req/s
  Bandwidth:  164.75MB/s
  Status codes: 2xx=13091397, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 13091374 / 13091397 responses (100.0%)
  Reconnects: 1308609
  Per-template: 4363697,4363621,4364055
  Per-template-ok: 4363697,4363621,4364055
[info] CPU 5916.2% | Mem 128MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.43ms   1.41ms   1.73ms   2.00ms   2.41ms

  13054480 requests in 5.00s, 13053722 responses
  Throughput: 2.61M req/s
  Bandwidth:  164.26MB/s
  Status codes: 2xx=13053722, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 13053686 / 13053722 responses (100.0%)
  Reconnects: 1304759
  Per-template: 4351414,4351180,4351091
  Per-template-ok: 4351414,4351180,4351091
[info] CPU 6065.0% | Mem 129MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.43ms   1.41ms   1.74ms   2.03ms   2.59ms

  13033863 requests in 5.00s, 13032705 responses
  Throughput: 2.61M req/s
  Bandwidth:  163.99MB/s
  Status codes: 2xx=13032705, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 13032665 / 13032705 responses (100.0%)
  Reconnects: 1303163
  Per-template: 4344015,4344246,4344404
  Per-template-ok: 4344015,4344246,4344404
[info] CPU 5899.6% | Mem 130MiB

=== Best: 2618279 req/s (CPU: 5916.2%, Mem: 128MiB) ===
[info] input BW: 202.26MB/s (avg template: 81 bytes)
[info] saved results/limited-conn/4096/zeemo.json
httparena-bench-zeemo
httparena-bench-zeemo

==============================================
=== zeemo / json / 4096c (tool=gcannon) ===
==============================================
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    775us    465us   2.04ms   2.83ms   3.40ms

  11945153 requests in 5.00s, 11942989 responses
  Throughput: 2.39M req/s
  Bandwidth:  8.00GB/s
  Status codes: 2xx=11944223, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11942920 / 11942989 responses (100.0%)
  Reconnects: 480654
  Per-template: 1699379,1702764,1706813,1711299,1711864,1708342,1702459
  Per-template-ok: 1699379,1702764,1706813,1711299,1711864,1708342,1702459

  WARNING: 18446744073709550382/11942989 responses (154456678087114.9%) had unexpected status (expected 2xx)
[info] CPU 6185.8% | Mem 197MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    767us    454us   1.99ms   2.78ms   3.38ms

  11877011 requests in 5.00s, 11875162 responses
  Throughput: 2.37M req/s
  Bandwidth:  7.95GB/s
  Status codes: 2xx=11875162, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11876636 / 11875162 responses (100.0%)
  Reconnects: 476971
  Per-template: 1690768,1693581,1697379,1701013,1701548,1698296,1692495
  Per-template-ok: 1690768,1693581,1697379,1701013,1701548,1698296,1692495
[info] CPU 6467.3% | Mem 201MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    762us    459us   1.97ms   2.75ms   3.30ms

  11897168 requests in 5.00s, 11894814 responses
  Throughput: 2.38M req/s
  Bandwidth:  7.97GB/s
  Status codes: 2xx=11894814, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11894749 / 11894814 responses (100.0%)
  Reconnects: 477375
  Per-template: 1693739,1696721,1699291,1703672,1703838,1701352,1696136
  Per-template-ok: 1693739,1696721,1699291,1703672,1703838,1701352,1696136
[info] CPU 6143.0% | Mem 204MiB

=== Best: 2388844 req/s (CPU: 6185.8%, Mem: 197MiB) ===
[info] input BW: 113.91MB/s (avg template: 50 bytes)
[info] saved results/json/4096/zeemo.json
httparena-bench-zeemo
httparena-bench-zeemo
[info] skip: zeemo does not subscribe to json-comp
[info] skip: zeemo does not subscribe to json-tls
[info] skip: zeemo does not subscribe to upload
[info] skip: zeemo does not subscribe to api-4
[info] skip: zeemo does not subscribe to api-16
[info] skip: zeemo does not subscribe to static
[info] skip: zeemo does not subscribe to async-db
[info] skip: zeemo does not subscribe to crud
[info] skip: zeemo does not subscribe to fortunes
[info] skip: zeemo does not subscribe to baseline-h2
[info] skip: zeemo does not subscribe to static-h2
[info] skip: zeemo does not subscribe to baseline-h2c
[info] skip: zeemo does not subscribe to json-h2c
[info] skip: zeemo does not subscribe to baseline-h3
[info] skip: zeemo does not subscribe to static-h3
[info] skip: zeemo does not subscribe to gateway-64
[info] skip: zeemo does not subscribe to gateway-h3
[info] skip: zeemo does not subscribe to production-stack
[info] skip: zeemo does not subscribe to unary-grpc
[info] skip: zeemo does not subscribe to unary-grpc-tls
[info] skip: zeemo does not subscribe to stream-grpc
[info] skip: zeemo does not subscribe to stream-grpc-tls
[info] skip: zeemo does not subscribe to echo-ws
[info] skip: zeemo does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
[info] restoring loopback MTU to 65536
[info] restoring CPU governor → powersave

@skylightis666
Contributor Author

Numbers look solid: RPS held (worst -3% on limited-conn-512, otherwise within ±1%) while memory dropped 10-28% on every profile except baseline-512 (-2.9%). Composite-with-mem moves us from #3 to #2 on limited-conn-4096 and pushes json-4096 to a perfect 150.

/benchmark -f zeemo --save

@github-actions
Contributor

👋 /benchmark request received. A collaborator will review and approve the run.

@github-actions
Contributor

Benchmark Results

Framework: zeemo | Test: all tests

| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
|---|---|---|---|---|---|---|
| baseline | 512 | 4,113,749 | 6278.0% | 69MiB | -0.2% | ~0% |
| baseline | 4096 | 4,429,132 | 6407.2% | 110MiB | -0.1% | -15.4% |
| pipelined | 512 | 48,531,942 | 6576.9% | 66MiB | +0.2% | -4.3% |
| pipelined | 4096 | 49,860,224 | 6407.1% | 106MiB | -0.1% | -14.5% |
| limited-conn | 512 | 2,543,420 | 5731.0% | 70MiB | -3.5% | -20.5% |
| limited-conn | 4096 | 2,606,127 | 5920.9% | 130MiB | -0.4% | -27.0% |
| json | 4096 | 2,380,824 | 6184.1% | 204MiB | +0.2% | -20.6% |
Full log
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.43ms   1.41ms   1.74ms   2.04ms   2.61ms

  13030451 requests in 5.00s, 13030635 responses
  Throughput: 2.60M req/s
  Bandwidth:  163.96MB/s
  Status codes: 2xx=13030635, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 13030470 / 13030635 responses (100.0%)
  Reconnects: 1302524
  Per-template: 4343405,4343473,4343592
  Per-template-ok: 4343405,4343473,4343592
[info] CPU 5920.9% | Mem 130MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.43ms   1.42ms   1.74ms   2.00ms   2.35ms

  12995674 requests in 5.00s, 12996461 responses
  Throughput: 2.60M req/s
  Bandwidth:  163.53MB/s
  Status codes: 2xx=12996461, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 12995211 / 12996461 responses (100.0%)
  Reconnects: 1299919
  Per-template: 4331695,4331800,4331716
  Per-template-ok: 4331695,4331800,4331716
[info] CPU 6071.3% | Mem 128MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   1.44ms   1.42ms   1.74ms   2.00ms   2.43ms

  12971566 requests in 5.00s, 12971186 responses
  Throughput: 2.59M req/s
  Bandwidth:  163.22MB/s
  Status codes: 2xx=12971186, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 12971153 / 12971186 responses (100.0%)
  Reconnects: 1297113
  Per-template: 4323723,4323927,4323503
  Per-template-ok: 4323723,4323926,4323503
[info] CPU 5955.2% | Mem 127MiB

=== Best: 2606127 req/s (CPU: 5920.9%, Mem: 130MiB) ===
[info] input BW: 201.32MB/s (avg template: 81 bytes)
[info] saved results/limited-conn/4096/zeemo.json
httparena-bench-zeemo
httparena-bench-zeemo

==============================================
=== zeemo / json / 4096c (tool=gcannon) ===
==============================================
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    772us    460us   2.00ms   2.81ms   3.39ms

  11774144 requests in 5.00s, 11772362 responses
  Throughput: 2.35M req/s
  Bandwidth:  7.88GB/s
  Status codes: 2xx=11772362, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11773308 / 11772362 responses (100.0%)
  Reconnects: 473236
  Per-template: 1675287,1678659,1682727,1686340,1686978,1684447,1677803
  Per-template-ok: 1675287,1678659,1682727,1686340,1686978,1684447,1677803
[info] CPU 6130.2% | Mem 197MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    765us    451us   2.03ms   2.77ms   3.33ms

  11853351 requests in 5.00s, 11852674 responses
  Throughput: 2.37M req/s
  Bandwidth:  7.94GB/s
  Status codes: 2xx=11852674, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11851156 / 11852674 responses (100.0%)
  Reconnects: 476516
  Per-template: 1686887,1689647,1693499,1697493,1698440,1695561,1689629
  Per-template-ok: 1686887,1689647,1693499,1697493,1698440,1695561,1689629
[info] CPU 6479.9% | Mem 201MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  25
  Templates: 7
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    780us    458us   2.04ms   2.84ms   3.39ms

  11905427 requests in 5.00s, 11904124 responses
  Throughput: 2.38M req/s
  Bandwidth:  7.97GB/s
  Status codes: 2xx=11904124, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 11902941 / 11904124 responses (100.0%)
  Reconnects: 479041
  Per-template: 1694199,1698229,1701966,1705937,1705131,1702013,1695466
  Per-template-ok: 1694199,1698229,1701966,1705937,1705131,1702013,1695465
[info] CPU 6184.1% | Mem 204MiB

=== Best: 2380824 req/s (CPU: 6184.1%, Mem: 204MiB) ===
[info] input BW: 113.53MB/s (avg template: 50 bytes)
[info] saved results/json/4096/zeemo.json
httparena-bench-zeemo
httparena-bench-zeemo
[info] skip: zeemo does not subscribe to json-comp
[info] skip: zeemo does not subscribe to json-tls
[info] skip: zeemo does not subscribe to upload
[info] skip: zeemo does not subscribe to api-4
[info] skip: zeemo does not subscribe to api-16
[info] skip: zeemo does not subscribe to static
[info] skip: zeemo does not subscribe to async-db
[info] skip: zeemo does not subscribe to crud
[info] skip: zeemo does not subscribe to fortunes
[info] skip: zeemo does not subscribe to baseline-h2
[info] skip: zeemo does not subscribe to static-h2
[info] skip: zeemo does not subscribe to baseline-h2c
[info] skip: zeemo does not subscribe to json-h2c
[info] skip: zeemo does not subscribe to baseline-h3
[info] skip: zeemo does not subscribe to static-h3
[info] skip: zeemo does not subscribe to gateway-64
[info] skip: zeemo does not subscribe to gateway-h3
[info] skip: zeemo does not subscribe to production-stack
[info] skip: zeemo does not subscribe to unary-grpc
[info] skip: zeemo does not subscribe to unary-grpc-tls
[info] skip: zeemo does not subscribe to stream-grpc
[info] skip: zeemo does not subscribe to stream-grpc-tls
[info] skip: zeemo does not subscribe to echo-ws
[info] skip: zeemo does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/json-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
[info] restoring loopback MTU to 65536
[info] restoring CPU governor → powersave

@MDA2AV MDA2AV merged commit 893e285 into MDA2AV:main May 18, 2026