Add capital income predictors to asset imputation QRF #546

Open
MaxGhenis wants to merge 4 commits into main from improve-asset-imputation-predictors

@MaxGhenis
Contributor

Summary

  • Adds interest_income, dividend_income, and rental_income as predictors to the SIPP→CPS liquid asset imputation model
  • Capital income is strongly correlated with asset holdings and was the main signal missing from the previous predictor set
  • Updates all three imputation pathways consistently: sipp.py (model training), cps.py (CPS extraction), source_impute.py (calibration-time imputation)
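One way to keep the three pathways aligned is to define the predictor list once and check each dataset against it. The sketch below is illustrative only: `ASSET_PREDICTORS` and `missing_predictors` are hypothetical names, not identifiers taken from sipp.py, cps.py, or source_impute.py.

```python
# Hypothetical shared constants; the variable names come from the PR description.
BASE_PREDICTORS = [
    "employment_income",
    "age",
    "is_female",
    "is_married",
    "count_under_18",
]
CAPITAL_INCOME_PREDICTORS = [
    "interest_income",
    "dividend_income",
    "rental_income",
]
ASSET_PREDICTORS = BASE_PREDICTORS + CAPITAL_INCOME_PREDICTORS


def missing_predictors(available_columns, required=ASSET_PREDICTORS):
    """Return the required predictors absent from a dataset, in list order."""
    available = set(available_columns)
    return [name for name in required if name not in available]


# Example: a dataset that was extracted before the new predictors were added.
old_columns = BASE_PREDICTORS + ["bank_account_assets"]
print(missing_predictors(old_columns))
```

Running the same check on the SIPP training frame and on both CPS extraction paths would surface a pathway that was not updated.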

Details

The QRF model imputing bank_account_assets, stock_assets, and bond_assets from SIPP 2023 to CPS previously used only:

  • employment_income, age, is_female, is_married, count_under_18

Now also uses:

  • interest_income (SIPP: TINC_BANK + TINC_BOND, annualized)
  • dividend_income (SIPP: TINC_STMF, annualized)
  • rental_income (SIPP: TINC_RENT, annualized)

These map directly to CPS variables already available in the microdata.
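The SIPP-side mapping above can be sketched as follows. This is a minimal illustration, not the code in sipp.py: it assumes each input row is one person-month, that "annualized" means summing the months present for each person, and that rows carry a hypothetical `person_id` key. Only the `TINC_*` column names come from the PR description.

```python
from collections import defaultdict

# Predictor -> SIPP monthly income columns, as listed in the PR description.
SIPP_CAPITAL_INCOME_SOURCES = {
    "interest_income": ("TINC_BANK", "TINC_BOND"),
    "dividend_income": ("TINC_STMF",),
    "rental_income": ("TINC_RENT",),
}


def annualize_capital_income(monthly_rows):
    """Sum person-month SIPP amounts into annual predictors per person.

    Assumption: annualizing means summing over the months present;
    "person_id" is a hypothetical key used only for this sketch.
    """
    totals = defaultdict(
        lambda: {name: 0.0 for name in SIPP_CAPITAL_INCOME_SOURCES}
    )
    for row in monthly_rows:
        person = totals[row["person_id"]]
        for predictor, columns in SIPP_CAPITAL_INCOME_SOURCES.items():
            # A column missing from a row counts as zero income that month.
            person[predictor] += sum(row.get(col, 0.0) for col in columns)
    return dict(totals)


# Two illustrative person-months for one person (made-up amounts).
rows = [
    {"person_id": 1, "TINC_BANK": 10.0, "TINC_BOND": 5.0, "TINC_STMF": 20.0},
    {"person_id": 1, "TINC_BANK": 10.0, "TINC_BOND": 5.0, "TINC_RENT": 100.0},
]
annual = annualize_capital_income(rows)
```

If the real pipeline instead scales a single month by 12, only the aggregation step changes; the column mapping stays the same.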

Motivation

With the companion PR (PolicyEngine/policyengine-us#connect-snap-assets-to-microdata) connecting snap_assets to imputed liquid assets, the SNAP asset test now binds in microsimulation. Better asset imputation is needed to avoid over- or under-counting households that fail the asset test when modeling the elimination of broad-based categorical eligibility (BBCE).

Test plan

  • Verify liquid_assets.pkl retrains with new predictors
  • Compare asset distribution before/after (expect closer match to SCF/SIPP actuals)
  • Re-run BBCE elimination analysis with improved assets
  • Existing tests pass (predictor list assertions still hold)
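For the before/after distribution comparison, a weighted quantile helper is one possible starting point. The sketch below uses only the standard library; the function names, the chosen quantiles, and the sample values are all illustrative assumptions, and the weights stand in for survey household weights.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value whose cumulative weight share reaches q."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


def compare_quantiles(before, after, weights, qs=(0.5, 0.9)):
    """Map each quantile to a (before, after) pair of weighted values."""
    return {
        q: (
            weighted_quantile(before, weights, q),
            weighted_quantile(after, weights, q),
        )
        for q in qs
    }


# Made-up imputed asset values for four households with equal weights.
before = [0, 100, 1000, 5000]
after = [0, 500, 2000, 8000]
comparison = compare_quantiles(before, after, [1, 1, 1, 1])
```

The same helper could be applied to a benchmark distribution (for example SIPP actuals) to judge which imputation sits closer at each quantile.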

🤖 Generated with Claude Code

MaxGhenis and others added 4 commits February 19, 2026 23:25
The QRF model for imputing liquid assets (bank accounts, stocks, bonds)
previously used only employment_income, age, demographics. This adds
interest_income, dividend_income, and rental_income as predictors,
which are strongly correlated with asset holdings and available in
both SIPP (TINC_BANK, TINC_STMF, TINC_BOND, TINC_RENT) and CPS.

Updated in three places to keep them consistent:
- sipp.py (standalone model training)
- cps.py (CPS variable extraction)
- source_impute.py (calibration-time imputation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>