test: add multi-language coverage for eval/hygiene.py (59% -> 84%) by lukeinglis · Pull Request #166 · akashgit/remote-factory

lukeinglis · 2026-05-01T18:25:31Z

Summary

Adds 18 tests covering previously untested code paths in factory/eval/hygiene.py
Coverage: 59% -> 84% (47 lines remaining, down from 116)
All tests mock _run_cmd() to avoid requiring Node.js/Rust/Go toolchains

What's covered

Area	Tests Added
`_run_cmd()` error handling	3 (timeout, command not found, generic exception)
`eval_tests()` Node.js/Rust/Go	5 (pass/fail for each language + mixed aggregation)
`eval_tests()` scoring	2 (all-fail, partial-pass)
`eval_lint()` Node.js/Rust	4 (clean + errors for each)
`eval_lint()` scoring	2 (partial credit, floor at zero)
`eval_type_check()` Node.js	2 (clean + errors)

Remaining uncovered lines (47)

The remaining gaps are Python-specific branches that require real subprocess execution (pytest coverage parsing, mypy type-check parsing) rather than mocked outputs. These run against the actual project in CI and are covered by the existing python_project fixture tests.

Test plan

All 36 hygiene tests pass (uv run pytest tests/eval/test_hygiene.py -v)
Full suite passes: 1338 passed (uv run pytest tests/ -x -q)
No new dependencies or fixtures required

Add 18 tests covering previously untested code paths in the hygiene eval module: - _run_cmd() error handling: timeout, command not found, generic errors - eval_tests() for Node.js, Rust, and Go projects with mocked outputs - eval_tests() scoring: all-fail and partial-pass edge cases - eval_lint() for Node.js (eslint) and Rust (clippy) with clean/error - eval_lint() scoring: partial credit calculation and floor at zero - eval_type_check() for Node.js (tsc) with clean/error outputs - Mixed project aggregation across Python + Node.js All tests mock _run_cmd() to avoid requiring Node/Rust/Go toolchains. Signed-off-by: Luke Inglis <lukeinglis21@yahoo.com>

codecov · 2026-05-01T18:27:18Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.33%. Comparing base (d556bbb) to head (e97cd81).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #166      +/-   ##
==========================================
+ Coverage   86.23%   87.33%   +1.09%     
==========================================
  Files          45       45              
  Lines        6307     6307              
==========================================
+ Hits         5439     5508      +69     
+ Misses        868      799      -69

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

akashgit

Code Review

Coverage claim verified accurate (116 missed lines → 47). All 1338 tests pass, lint is clean. Only tests/eval/test_hygiene.py is modified — the 14-file diff shown by gh pr diff is a branch divergence artifact (branch is behind PRs #136 and #163). Good contribution overall, but assertion quality needs tightening before merge.

Finding 1: Weak assertion in `test_go_project_passing` — ISSUE

assert result["score"] > 0.0

With 2 "ok" lines, score should be 2/2 = 1.0. The Node and Rust passing tests both assert == 1.0 — this should too. > 0.0 would pass even if the scoring was completely wrong.

Fix: assert result["score"] == 1.0

Finding 2: Missing score assertion in `test_go_project_failing` — ISSUE

Checks passed is False and "go" in details but never checks the score. Should be 0.0 (0 passed, 1 failed).

Fix: Add assert result["score"] == 0.0

Finding 3: Missing score assertion in `test_node_typecheck_errors` — ISSUE

With 2 TS errors at 0.05 penalty each, score should be 0.9. Never checked.

Fix: Add assert result["score"] == 0.9

Finding 4: Missing score assertion in `test_node_lint_errors` — ISSUE

With 3 errors at 0.1 penalty each, score should be 0.7. Never checked.

Fix: Add assert result["score"] == 0.7

Finding 5: Missing score assertion in `test_rust_lint_errors` — ISSUE

With 2 errors at 0.1 penalty each, score should be 0.8. Never checked.

Fix: Add assert result["score"] == 0.8

Finding 6: Fragile detail string assertions — ISSUE

assert "2" in result["details"]  # matches "12 errors", "rs2", paths containing "2"

Used in multiple tests. These are too loose — "2" matches any string containing the digit 2.

Fix: Use "2 errors" in result["details"] or more specific patterns.

Finding 7: `test_mixed_project_aggregates` doesn't verify `passed` field — ISSUE

The mock has Node returning 1 failure, so passed should be False. Never checked — a bug that left passed: True with sub-project failures would be invisible.

Fix: Add assert result["passed"] is False

Branch hygiene

Branch needs rebase on current main (behind PRs #136 and #163). No conflicts on test merge, but rebase will clean up the diff.

Verdict: Do not merge yet. The tests are structurally sound but under-assert — they verify shape but not values. All 7 issues are straightforward fixes (add score assertions, tighten string matches). After fixes + rebase, this is a solid contribution.

akashgit requested changes May 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: add multi-language coverage for eval/hygiene.py (59% -> 84%)#166

test: add multi-language coverage for eval/hygiene.py (59% -> 84%)#166
lukeinglis wants to merge 1 commit intoakashgit:mainfrom
lukeinglis:tests/eval-hygiene-coverage

lukeinglis commented May 1, 2026

Uh oh!

codecov Bot commented May 1, 2026

Uh oh!

akashgit left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lukeinglis commented May 1, 2026

Summary

What's covered

Remaining uncovered lines (47)

Test plan

Uh oh!

codecov Bot commented May 1, 2026

Codecov Report

Uh oh!

akashgit left a comment

Choose a reason for hiding this comment

Code Review

Finding 1: Weak assertion in test_go_project_passing — ISSUE

Finding 2: Missing score assertion in test_go_project_failing — ISSUE

Finding 3: Missing score assertion in test_node_typecheck_errors — ISSUE

Finding 4: Missing score assertion in test_node_lint_errors — ISSUE

Finding 5: Missing score assertion in test_rust_lint_errors — ISSUE

Finding 6: Fragile detail string assertions — ISSUE

Finding 7: test_mixed_project_aggregates doesn't verify passed field — ISSUE

Branch hygiene

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Finding 1: Weak assertion in `test_go_project_passing` — ISSUE

Finding 2: Missing score assertion in `test_go_project_failing` — ISSUE

Finding 3: Missing score assertion in `test_node_typecheck_errors` — ISSUE

Finding 4: Missing score assertion in `test_node_lint_errors` — ISSUE

Finding 5: Missing score assertion in `test_rust_lint_errors` — ISSUE

Finding 7: `test_mixed_project_aggregates` doesn't verify `passed` field — ISSUE