
some perf improvements #3

Open
wilaak wants to merge 3 commits into tempestphp:main from wilaak:performance-improv

Conversation

wilaak commented May 14, 2026

Fixes #1, #2.

Replaces the Lexer/LexerRules/Tokens object graph with a single-pass parser built from free functions in src/Parser.php. Output is byte-identical; the 104 existing tests pass unchanged. The public API (Parser, Markdown) is preserved.

The previous parser allocated approximately 1800 token objects per parse and cloned the Parser+Lexer tree once per block for inline re-parsing. The new implementation scans the source once with strcspn/strpos and writes HTML directly into a by-ref string buffer.
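To illustrate the scanning style described above, here is a minimal sketch. It is not the code from this PR: the function name `renderInline` and the two constructs it handles (`*emphasis*` and backtick code spans) are assumptions chosen for the example. It only shows the general shape of skipping plain text with strcspn, finding a closer with strpos, and appending HTML to a by-reference buffer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch (not the PR's implementation): render *emphasis* and
 * `code` spans in one pass, appending HTML to a by-reference string buffer.
 */
function renderInline(string $src, string &$out): void
{
    $len = strlen($src);
    $pos = 0;

    while ($pos < $len) {
        // Jump over the run of plain text up to the next possible marker.
        $plain = strcspn($src, '*`', $pos);
        if ($plain > 0) {
            $out .= htmlspecialchars(substr($src, $pos, $plain), ENT_QUOTES);
            $pos += $plain;
        }
        if ($pos >= $len) {
            break;
        }

        $marker = $src[$pos];
        // Find the matching closing marker, if any.
        $close = strpos($src, $marker, $pos + 1);
        if ($close === false) {
            // No closer: emit the marker literally and keep scanning.
            $out .= $marker;
            $pos++;
            continue;
        }

        $inner = substr($src, $pos + 1, $close - $pos - 1);
        $tag = $marker === '*' ? 'em' : 'code';
        $out .= "<$tag>" . htmlspecialchars($inner, ENT_QUOTES) . "</$tag>";
        $pos = $close + 1;
    }
}

$html = '';
renderInline('a *b* and `c < d`', $html);
echo $html, "\n"; // prints: a <em>b</em> and <code>c &lt; d</code>
```

No token objects are created in this sketch; the only allocations are the substrings passed to htmlspecialchars and the growing output buffer.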

Benchmark (composer bench:tempest, opcache+JIT, same host):

| input | before | after | ratio |
|---|---|---|---|
| 01-small | 0.338 ms | 0.029 ms | 11.7x |
| 02-large | 5.374 ms | 0.723 ms | 7.4x |
| 01-small + hl | 1.088 ms | 0.890 ms | 1.22x |
| 02-large + hl | 26.530 ms | 18.966 ms | 1.40x |

Member

brendt commented May 14, 2026

"some" 🤣

I actually started with a variant of this approach. There are two big problems with it: maintainability and extensibility.

I wonder if extensibility is worth it though. Need to think about this.

@NickSdot

> I wonder if extensibility is worth it though.

I've been working a lot with external Markdown lately, such as the files that websites now provide for agents. Many have their own MDX components. If you want to allow MDX parsing eventually, then the extensibility is probably worth it. I'm not sure how others consume Markdown, but I'd assume that parsing is usually a (near) one-time cost, thanks to result caching.

Author

wilaak commented May 14, 2026

Maintainability is improved: the whole src/ could be replaced with this single file, and thousands of lines of code are removed already. As for extensibility, maybe there is a middle ground here somewhere that doesn't use heap-allocated objects with instanceof chaining for maximum performance penalty.
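One possible shape for such a middle ground, sketched under assumptions: handlers are plain closures registered per trigger character, so the hot loop still skips plain text with strcspn and extensions cost one array lookup and one call per trigger. Nothing here exists in the PR; `renderWithHandlers`, the handler signature, and the `@mention` extension are all hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: inline handlers keyed by trigger character.
 * A handler receives the source and the trigger position and returns
 * [html, nextPos] on a match, or null to decline. A handler must return
 * a nextPos greater than the trigger position, or the loop will not advance.
 *
 * @param array<string, callable(string, int): ?array> $handlers
 */
function renderWithHandlers(string $src, array $handlers, string &$out): void
{
    $triggers = implode('', array_keys($handlers));
    $len = strlen($src);
    $pos = 0;

    while ($pos < $len) {
        // Plain text up to the next registered trigger character.
        $plain = strcspn($src, $triggers, $pos);
        $out .= htmlspecialchars(substr($src, $pos, $plain), ENT_QUOTES);
        $pos += $plain;
        if ($pos >= $len) {
            break;
        }

        $result = $handlers[$src[$pos]]($src, $pos);
        if ($result === null) {
            // Handler declined: emit the trigger character literally.
            $out .= htmlspecialchars($src[$pos], ENT_QUOTES);
            $pos++;
            continue;
        }
        [$html, $pos] = $result;
        $out .= $html;
    }
}

$handlers = [
    '*' => function (string $src, int $pos): ?array {
        $close = strpos($src, '*', $pos + 1);
        if ($close === false) {
            return null;
        }
        $inner = substr($src, $pos + 1, $close - $pos - 1);
        return ['<em>' . htmlspecialchars($inner, ENT_QUOTES) . '</em>', $close + 1];
    },
    // Hypothetical extension: "@name" becomes a mention span.
    '@' => function (string $src, int $pos): ?array {
        $n = strspn($src, 'abcdefghijklmnopqrstuvwxyz0123456789', $pos + 1);
        if ($n === 0) {
            return null;
        }
        $name = substr($src, $pos + 1, $n);
        return ["<span class=\"mention\">@$name</span>", $pos + 1 + $n];
    },
];

$html = '';
renderWithHandlers('hi @bob, *nice* work @', $handlers, $html);
echo $html, "\n";
```

Whether this keeps enough of the speedup is untested here; it would need to go through the same benchmark to find out.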


Development

Successfully merging this pull request may close these issues.

It's slow