Summary
Python's built-in sum() over generator expressions of entity variable calls (e.g. sum(person(v, period) for v in sources)) is not properly vectorized. The correct pattern is add(person, period, sources) from policyengine-core.
This was caught during review of #7280, where the formula used sum(person(source, period) for source in non_negative_sources) instead of add(person, period, non_negative_sources).
Proposal
Add a pytest test (or pre-commit hook) that scans all .py files under policyengine_us/variables/ and flags any use of the built-in sum() where the arguments include calls to entity variables (person(, tax_unit(, spm_unit(, household(, etc.).
This should not flag:
entity.sum(...) — the vectorized entity group method
sum([array.astype(int) for ...]) — summing pre-computed NumPy arrays (though add() is still preferred)
This check could also be implemented upstream in policyengine-core to benefit all country models.
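As a rough illustration of the proposed check, here is a minimal sketch that uses Python's standard ast module. The function names (find_builtin_sum_over_entities, scan_directory) and the ENTITY_NAMES set are hypothetical, not existing policyengine-core or policyengine-us APIs. The sketch assumes entity variables are called by bare name (person(...), tax_unit(...)), so it would miss aliased entities or attribute-style calls.

```python
import ast
from pathlib import Path

# Assumed entity names, taken from the examples in this issue; extend as needed.
ENTITY_NAMES = {"person", "tax_unit", "spm_unit", "household"}


def _calls_entity(node: ast.AST) -> bool:
    """Return True if a bare-name entity call such as person(...) appears under node."""
    return any(
        isinstance(child, ast.Call)
        and isinstance(child.func, ast.Name)
        and child.func.id in ENTITY_NAMES
        for child in ast.walk(node)
    )


def find_builtin_sum_over_entities(source: str) -> list[int]:
    """Return sorted line numbers of built-in sum(...) calls whose arguments call an entity."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            # A bare name means the built-in sum(); entity.sum(...) is an
            # ast.Attribute call and is therefore skipped.
            and isinstance(node.func, ast.Name)
            and node.func.id == "sum"
            and any(_calls_entity(arg) for arg in node.args)
        ):
            flagged.append(node.lineno)
    return sorted(flagged)


def scan_directory(root: str) -> dict[str, list[int]]:
    """Map each .py file under root to the flagged line numbers it contains."""
    results = {}
    for path in Path(root).rglob("*.py"):
        lines = find_builtin_sum_over_entities(path.read_text(encoding="utf-8"))
        if lines:
            results[str(path)] = lines
    return results
```

A pytest test could then assert that scan_directory("policyengine_us/variables") returns an empty dict. Because the check matches only bare-name calls to sum, entity.sum(...) and sum([array.astype(int) for ...]) are not flagged.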