Add lint test to catch non-vectorized sum() over entity calls #7321

@MaxGhenis

Summary

Python's built-in sum() over generator expressions of entity variable calls (e.g. sum(person(v, period) for v in sources)) is not properly vectorized. The correct pattern is add(person, period, sources) from policyengine-core.

This was caught during review of #7280, where the formula used sum(person(source, period) for source in non_negative_sources) instead of add(person, period, non_negative_sources).

Proposal

Add a pytest test (or pre-commit hook) that scans all .py files under policyengine_us/variables/ and flags any use of the built-in sum() where the arguments include calls to entity variables (person(, tax_unit(, spm_unit(, household(, etc.).

This should not flag:

  • entity.sum(...) — the vectorized entity group method
  • sum([array.astype(int) for ...]) — summing pre-computed NumPy arrays (though add() is still preferred)
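
One way the check could look is sketched below, assuming an AST-based scan rather than a regex. The helper names (find_builtin_sum_over_entities, _calls_entity), the ENTITY_NAMES set, and the test function are hypothetical and not part of policyengine-us or policyengine-core. The entity list only covers the four names given above, so the "etc." would still need filling in.

```python
import ast
from pathlib import Path

# Assumed entity names: only the four listed in this issue.
ENTITY_NAMES = {"person", "tax_unit", "spm_unit", "household"}


def _calls_entity(node: ast.AST) -> bool:
    """Return True if any call nested in `node` has the form entity(...)."""
    return any(
        isinstance(child, ast.Call)
        and isinstance(child.func, ast.Name)
        and child.func.id in ENTITY_NAMES
        for child in ast.walk(node)
    )


def find_builtin_sum_over_entities(source: str) -> list[int]:
    """Return line numbers of bare sum(...) calls whose arguments call an entity."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            # A bare name means built-in sum(...); entity.sum(...) is an
            # ast.Attribute and is therefore skipped.
            and isinstance(node.func, ast.Name)
            and node.func.id == "sum"
            and any(_calls_entity(arg) for arg in node.args)
        ):
            flagged.append(node.lineno)
    return sorted(flagged)


def test_no_builtin_sum_over_entity_calls():
    """Hypothetical pytest entry point; the path is the one named in this issue."""
    offenders = []
    for path in Path("policyengine_us/variables").rglob("*.py"):
        for lineno in find_builtin_sum_over_entities(path.read_text()):
            offenders.append(f"{path}:{lineno}")
    assert not offenders, "Use add() instead of sum():\n" + "\n".join(offenders)
```

Because the scan only matches calls whose function is a bare entity name, sum([array.astype(int) for ...]) passes as long as the arrays were computed beforehand. A variant that also looks for entity method calls inside sum() would need an extra rule.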

Could also be implemented upstream in policyengine-core to benefit all country models.
