Apply a bunch of optimizations #315

dduan · 2026-02-07T05:27:12Z

Each commit contains a micro optimization.

Avoid triple-quote probing on non-quote tokens in Parser.nextToken.scanString.\n\nThe tokenizer now branches on the head code unit before running single-quote or double-quote string logic, keeping existing parse behavior while removing repeated quote checks for numeric/date/bare-token paths.\n\nAlso cache range.upperBound in a local for scan loops and trim date/value range checks to use that cached bound.

Reduce parseSelect path churn by storing only selector prefixes in tablePath. fillTablePath now returns the terminal key/hash/token directly instead of appending then removeLast. This removes one append/remove pair per selector while preserving walkTablePath semantics and selector error handling.

Add CodeUnits.isBasicStringBodyChar and use it in nextToken's double-quoted string scanner. This replaces repeated per-byte backslash/quote/newline comparisons with one table probe in both the unrolled and scalar loops.

Hoist out of the inner loops in matchKeyValue, matchKeyArray, and matchKeyTable.\n\nThis keeps lookup behavior unchanged while removing repeated count reads in hot matcher loops.

Replace long-key byte-by-byte FNV hashing with constant-time prefix/suffix sampling in fastKeyHash(bytes:range:) and fastKeyHash(_:). Keep <=8-byte packed hashing unchanged. Preserve exact-key equality checks so hash remains a prefilter only.

Change createKeyValue to accept the parsed value token and append KeyValuePair with its final value in one step. This removes temporary Token.empty staging and follow-up indexed mutation in parseKeyValue while preserving lookup and insertion behavior.

Parse the first selector segment directly in parseSelect and branch immediately on closing bracket vs dot. Single-segment selectors now skip tablePath bookkeeping and walkTablePath entirely, while dotted selectors still use the existing path fill/walk logic after seeding the first segment. This removes speculative token rollback from prior attempts and keeps key ownership behavior aligned with existing dotted-path code.

Keep the single-segment selector fast path and add a byte-token lookup path for selectors when no key transform is active. ParseSelect now hashes bare tokens directly, probes array keys without building a String, and only materializes the key when creating the array entry. Dotted selectors and transformed/quoted keys keep existing logic.

Load character-class tables only in branches that use them and split dot-special bare-key scanning into a common bare-key fast path plus plus-sign fallback. This removes per-token setup and branch work in tokenizer scan loops.

Replace the double-quoted string 8-byte classifier loop with unaligned\nUInt64 chunk probes that detect quote, backslash, and LF bytes with\nbit masks.\n\nKeep the existing bytewise fallback loop and parsing behavior unchanged,\nso only the hot scanning path changes.

Record two hash bits per inserted key on each internal table and gate table/array/value lookups with fast negative checks before linear scans.\n\nThis keeps behavior unchanged while cutting repeated miss scans during parse-time key insertion and table path traversal.

Drop optional base-address branches from hot tokenizer/hash paths where parse-time buffers are guaranteed to have storage for non-empty ranges. - remove the fallback byte-by-byte 8-byte pre-scan branch in double-quoted token scanning - short-circuit zero-length hash range and use direct base-address hashing path for non-empty ranges - fast-return empty-range string creation and use direct base-address decoding for non-empty ranges

Advance the first selector segment directly before tablePath fill/walk. - Reuse the same path-segment advancement logic for first-segment setup and subsequent walk. - Start walkTablePath from the already-resolved table/keyed state. - Keep existing two-phase selector shape while dropping one tablePath append for the first segment. Goal: reduce parseSelect work on table/array selector paths without one-pass ARC regressions.

Teach parseSelect to special-case dotted selectors with exactly two segments. After walking the first segment, parse the second segment directly; if the selector closes, skip fillTablePath and walkTablePath. Fallback for longer dotted selectors keeps the existing tablePath+walk behavior unchanged.

github-actions · 2026-02-07T05:30:06Z