gh-149481: skip FOR_ITER inline specialization for Python __next__ #149491
savannahostrowski merged 5 commits into python:main
savannahostrowski left a comment
Yep, this was pretty much the exact fix I had locally. Thanks for the quick turnaround on this @NekoAsakura ❤️
I'll wait to see if anyone else wants to have a look.
markshannon left a comment
Looks good. Thanks for doing this.
Updating to fix some errors we introduced on the main branch.
Thanks @NekoAsakura for the PR, and @savannahostrowski for merging it 🌮🎉. I'm working now to backport this PR to: 3.15.
GH-149523 is a backport of this pull request to the 3.15 branch.
…_next__` (GH-149491) (#149523)
gh-149481: skip `FOR_ITER` inline specialization for Python `__next__` (GH-149491)
(cherry picked from commit 49918f5)
Co-authored-by: Neko Asakura <neko.asakura@outlook.com>
Co-authored-by: Savannah Ostrowski <savannah@python.org>
Co-authored-by: Stan Ulbrych <stan@python.org>
Thanks for the careful bisect and the empirical fix from @savannahostrowski. ❤️
Benchmark
Root cause analysis (from opus 4.7)
When `__next__` is Python-defined, `tp_iternext` is the generic `slot_tp_iternext` slot wrapper that re-enters the eval loop via `vectorcall_method`. Calling it through the new `_ITER_NEXT_INLINE` (which captures `tp_iternext` as a baked-in operand and lives in a tight `_JUMP_TO_TOP` loop) is correct functionally, but each new outer-iteration class causes `_GUARD_TYPE_ITER` to fail and warm a side-exit. After ~`SIDE_EXIT_INITIAL_VALUE` (4000) hits at each side-exit, a side trace forms there; `_EXIT_TRACE` jumps trace-to-trace via `TIER2_TO_TIER2`, which doesn't re-check the eval-breaker. With `MAX_CHAIN_DEPTH=4` traces possible, the chain accumulates ~4×4000+4002 ≈ 20k inner hits before the system spirals into a tier-2-only loop with no signal checkpoints, matching the observed exact threshold.

The pre-PR `_FOR_ITER_TIER_TWO` path goes through `_PyForIter_VirtualIteratorNext`, which uses an indirect dispatch on `Py_TYPE(iter_o)->tp_iternext` and does not produce a per-type guard, so no side-trace storm forms. Falling back to that path for slot iterators sidesteps the issue while keeping `_ITER_NEXT_INLINE` for C-level iterators (dict, set, enumerate, zip, reversed, etc.), which the existing `test_for_iter_direct_*` tests exercise.

xml.etree.iterparse #149481
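To make the distinction in the analysis above concrete, here is a rough Python-level sketch of the two kinds of iterators it talks about, plus the threshold arithmetic. This is only an illustration: the real check lives in C inside the specializer and is not shown in this thread. `Countdown` and `has_python_next` are hypothetical names invented for this example, and the two constants are simply the values quoted in the analysis, not read from CPython.

```python
import types


class Countdown:
    """Hypothetical iterator whose __next__ is written in Python."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value


def has_python_next(iterator):
    """Hypothetical helper: True if the type's __next__ is a plain Python
    function, i.e. the kind of iterator the analysis calls a slot iterator.
    C-level iterators expose __next__ as a slot wrapper instead."""
    method = getattr(type(iterator), "__next__", None)
    return isinstance(method, types.FunctionType)


# Values as quoted in the analysis above (assumptions, not read from CPython).
SIDE_EXIT_INITIAL_VALUE = 4000
MAX_CHAIN_DEPTH = 4
estimated_threshold = MAX_CHAIN_DEPTH * SIDE_EXIT_INITIAL_VALUE + 4002

python_level = has_python_next(Countdown(3))
c_level = [
    has_python_next(it)
    for it in (iter({}), iter(set()), enumerate([]), zip(), reversed([]))
]

print(python_level, c_level, estimated_threshold)
```

On CPython, the Python-defined iterator is the only one flagged, which matches the split described above: it would take the fallback path, while the C-level iterators would keep the inline one.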