Add AI contributor skills, per-script structure, and CI pipeline #2
Conversation
Defines per-script directory structure, Claude Code skills (contribute-script, validate-script), AGENTS.md contributor guide, and GitHub Actions CI for validating pyopenms script contributions in isolated venvs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mplate - Resolve PYTHONPATH vs sys.path contradiction: PYTHONPATH is primary, conftest.py provides fallback for direct pytest invocation - Add concrete conftest.py template to reduce implementation ambiguity - Clarify migration: copy files from feature branch, decompose monolithic test files and shared requirements.txt into per-script structure - Use consistent language for branch migration (copy + restructure) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…line 14 tasks covering: ruff.toml, migration of 7 scripts to per-script directories, Claude Code skills, AGENTS.md, GitHub Actions CI, and documentation updates. Reviewed and approved with minor fixes applied. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t data Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thetic test data Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thetic test data Fix FeatureFindingMetabo.run() API change in pyopenms >= 3.x: third argument is now output_chromatograms list, not a chrom_fwhm float. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- validate-script: isolated venv validation (ruff + pytest) for any script - contribute-script: guided workflow for creating new pyopenms tools Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detects changed script directories on PR, validates each in parallel with its own venv: pip install from script's requirements.txt, ruff lint, pytest. Uses Python 3.11 for pyopenms wheel availability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add 'run all tests' one-liner to CLAUDE.md - Fix feature_detection_proteomics README to reference FeatureFinderAlgorithmPicked (pyopenms 3.5.0+ API) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough

This pull request establishes a comprehensive AI-assisted contribution and validation framework for the agentomics repository. It introduces standardized directory structure, validation procedures, and CI/CD automation for stand-alone pyopenms scripts.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant PR as Pull Request
    participant GHA as GitHub Actions
    participant Detect as detect-changes Job
    participant Validate as validate Job
    participant Env as Isolated venv
    participant Ruff as Ruff Linter
    participant Pytest as Pytest Runner
    PR->>GHA: Trigger on scripts/** changes
    GHA->>Detect: Run detect-changes
    Detect->>Detect: Diff against base branch
    Detect->>Detect: Extract script directories
    Detect->>GHA: Output matrix with script_dir list
    GHA->>Validate: Run validate (if has_changes)
    loop For each script_dir in matrix
        Validate->>Env: Create isolated venv
        Validate->>Env: Install script's requirements.txt
        Validate->>Env: Install pytest and ruff
        Validate->>Ruff: Run ruff check on script_dir
        Ruff->>Validate: Return lint results
        Validate->>Pytest: Run pytest on script_dir/tests/
        Pytest->>Validate: Return test results
        Validate->>Env: Cleanup venv
    end
    Validate->>PR: Report pass/fail status
```
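The detect-changes step in the CI flow above reduces to a short shell pipeline. The sketch below reproduces just the directory-extraction logic on a hard-coded file list (hypothetical variable names; the real job diffs against the PR base branch and then converts the list to a JSON job matrix, which the review notes describe as `sort -u | jq -R -s -c`).

```bash
# Stand-in for `git diff --name-only <base>...HEAD` output:
changed_files='scripts/proteomics/protein_digest/protein_digest.py
scripts/proteomics/protein_digest/tests/test_protein_digest.py
scripts/metabolomics/mass_accuracy_calculator/requirements.txt
README.md'

# Keep only scripts/<domain>/<tool>/ prefixes and de-duplicate; in CI an
# appended `|| true` keeps a no-match run from failing the step.
dirs=$(printf '%s\n' "$changed_files" | grep -oE 'scripts/[^/]+/[^/]+/' | sort -u || true)
echo "$dirs"
```

Note that non-script changes (like `README.md` here) simply produce no matrix entry, so the validate job is skipped for them.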
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Claude as Claude (contribute-script)
    participant Repo as Repository
    participant Venv as Isolated venv
    participant Validate as validate-script Skill
    Dev->>Claude: Run contribute-script skill
    Claude->>Dev: Prompt for tool domain & name
    Dev->>Claude: Provide inputs
    Claude->>Repo: Create script directory structure
    Claude->>Repo: Write scaffolded script.py (with pyopenms guard)
    Claude->>Repo: Write requirements.txt, README.md
    Claude->>Repo: Create tests/conftest.py (with skip marker)
    Claude->>Repo: Write test_*.py files
    Claude->>Validate: Invoke validate-script
    Validate->>Venv: Create temporary venv
    Validate->>Venv: Install script dependencies + pytest + ruff
    Validate->>Venv: Run ruff check
    Validate->>Venv: Run pytest
    Venv->>Validate: Return lint and test results
    Validate->>Venv: Cleanup
    Validate->>Claude: Report validation outcome
    Claude->>Dev: Confirm ready for commit
```
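The "tests/conftest.py (with skip marker)" step above follows the per-script pattern described later in this review: put the script directory on `sys.path` and expose a skip marker when pyopenms is missing. A minimal sketch, assuming the `HAS_PYOPENMS` name from the design doc (details may differ from the actual template):

```python
# Sketch of a per-script tests/conftest.py (assumed shape, per the design doc).
import os
import sys

import pytest

# Make <script_dir>/<script>.py importable when pytest is invoked directly.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

try:
    import pyopenms  # noqa: F401
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False

# Tests decorate pyopenms-dependent cases with this marker.
requires_pyopenms = pytest.mark.skipif(
    not HAS_PYOPENMS, reason="pyopenms is not installed"
)
```

In CI the dependency is installed from the script's requirements.txt, so the marker only fires for contributors running tests without pyopenms locally.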
🚥 Pre-merge checks: ✅ 2 passed · ❌ 1 failed (1 warning)
Actionable comments posted: 15
🧹 Nitpick comments (7)
scripts/proteomics/feature_detection_proteomics/requirements.txt (1)

`1-1`: **Constrain `pyopenms` to a validated range.** Line 1 is unpinned; future upstream changes can break this script unexpectedly in CI and local setups.

Suggested change

```diff
-pyopenms
+pyopenms>=3.5,<4
```

🤖 Prompt for AI Agents: In `scripts/proteomics/feature_detection_proteomics/requirements.txt` at line 1, update the unpinned "pyopenms" entry to a validated version or bounded range (for example `>=X.Y.Z,<A.B.C`) so CI and local environments remain reproducible; after changing the line, run the feature_detection_proteomics tests or the project's dependency verification job to confirm the chosen version works.

scripts/proteomics/peptide_mass_calculator/requirements.txt (1)

`1-1`: **Pin `pyopenms` to a tested range for reproducibility.** Line 1 is currently unbounded, so CI behavior can drift as new releases land. Please constrain this to a validated range.

Suggested change

```diff
-pyopenms
+pyopenms>=3.5,<4
```

🤖 Prompt for AI Agents: Replace the unbounded pyopenms entry with a pinned, tested version range (or exact version) to ensure reproducible CI, and make sure it matches the version used in the CI/test matrix.

scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py (1)

`32-34`: **Assert persisted output content, not just path existence.** This test would be stronger if it verifies the written featureXML can be loaded and matches the returned map size, instead of only checking that the file exists.

🔧 Suggested test hardening

```diff
 fm = detect_features(input_path, output_path)
 assert isinstance(fm, oms.FeatureMap)
 assert os.path.exists(output_path)
+reloaded = oms.FeatureMap()
+oms.FeatureXMLFile().load(output_path, reloaded)
+assert reloaded.size() == fm.size()
```

🤖 Prompt for AI Agents: Load the written featureXML into a new oms.FeatureMap (via oms.FeatureXMLFile().load) and assert that its feature count matches the returned fm, so the persisted content is validated rather than only the file's existence.

scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py (1)

`9-40`: **Please add invalid-charge tests.** Given the public API takes `charge: int`, add coverage for `charge <= 0` to protect against division-by-zero/regression paths.

🤖 Prompt for AI Agents: Add tests that pass invalid charges (0 and a negative int) to theoretical_mz_from_sequence and theoretical_mz_from_formula and assert with pytest.raises that a clear exception (e.g. ValueError) is raised, covering both sequence- and formula-based paths such as theoretical_mz_from_sequence("PEPTIDEK", 0) and theoretical_mz_from_formula("C6H12O6", -1).

scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py (1)

`25-57`: **Add a regression test for spectra that have no peaks.** Current tests don't cover the edge case where spectra exist but `get_peaks()` is empty. That case is important for validating `tic_per_spectrum` length and `mz_range` fallback behavior.

Suggested test addition

```diff
+def test_summarise_spectra_with_no_peaks(self):
+    import pyopenms as oms
+    from spectrum_file_info import summarise_experiment
+
+    exp = oms.MSExperiment()
+    for i in range(3):
+        spec = oms.MSSpectrum()
+        spec.setMSLevel(1)
+        spec.setRT(float(i))
+        exp.addSpectrum(spec)
+
+    summary = summarise_experiment(exp)
+    assert summary["n_spectra"] == 3
+    assert len(summary["tic_per_spectrum"]) == 3
+    assert all(v == 0.0 for v in summary["tic_per_spectrum"])
```

🤖 Prompt for AI Agents: Build an MSExperiment whose spectra have empty get_peaks(), call summarise_experiment, and assert n_spectra and len(tic_per_spectrum) are correct and that mz_range falls back to a falsy value (None or empty) when no peaks exist.

docs/superpowers/plans/2026-03-24-ai-contributor-skills.md (1)

`64-68`: **Consider adding language specifiers to fenced code blocks.** Several code blocks (e.g., lines 66-68 for requirements.txt content) lack language specifiers. While this doesn't affect execution, adding `txt` or `text` improves markdown rendering and satisfies linters.

🤖 Prompt for AI Agents: Update the fenced code block under "Step 3: Create requirements.txt" that currently contains "pyopenms" so its opening backticks include a language specifier (```txt or ```text).

scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py (1)

`82-89`: **Consider using closest match instead of first match within tolerance.** The current implementation takes the first observed peak within the m/z tolerance window (the `break` on line 86). If multiple observed peaks fall within the tolerance, using the closest match would be more accurate. However, for typical isotope patterns where peaks are well-separated (>1 Da), this is unlikely to cause issues.

♻️ Optional enhancement for closest-match selection

```diff
 for tmz, tint in theoretical:
     matched = 0.0
+    best_delta = float('inf')
     for omz, oint in observed:
-        if abs(omz - tmz) <= mz_tolerance:
+        delta = abs(omz - tmz)
+        if delta <= mz_tolerance and delta < best_delta:
+            best_delta = delta
             matched = oint
-            break
     theo_vec.append(tint)
     obs_vec.append(matched)
```

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
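Several comments in this review ask for the same charge guard before the m/z formula `(mass + charge * PROTON) / charge`. A minimal self-contained sketch of that guard (the function name is hypothetical; the scripts' own helpers take sequences or formulas rather than a raw mass):

```python
# Hypothetical helper illustrating the requested validation: the m/z formula
# divides by charge, so charge <= 0 must be rejected up front.
PROTON = 1.007276466  # proton mass in Da

def mz_from_neutral_mass(mass: float, charge: int) -> float:
    if not isinstance(charge, int) or charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON) / charge
```

Raising `ValueError` (rather than letting `ZeroDivisionError` surface) gives callers and the CLI layer one clear, testable failure mode.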
Inline comments:
In @.claude/skills/contribute-script.md:
- Around line 75-80: The snippet uses sys.exit in the ImportError handler but
never imports sys, which will raise NameError on failure; add an import sys at
the top of the block so the try/except around importing pyopenms (import
pyopenms as oms) can call sys.exit(...) safely when ImportError is caught.
In @.claude/skills/validate-script.md:
- Around line 21-30: Add a trap-based cleanup to guarantee removal of the
temporary virtualenv: immediately after VENV_DIR=$(mktemp -d) register trap 'rm
-rf "$VENV_DIR"' EXIT (so it runs on any exit/failure) and delete the final
explicit rm -rf "$VENV_DIR" line at the end; ensure the trap references the same
VENV_DIR variable and uses the EXIT signal so the venv is removed on both
success and failure.
In @.github/workflows/validate.yml:
- Around line 24-27: The CHANGED assignment pipeline can fail when grep finds no
matches; modify the command that pipes to grep (the part using grep -oP
'scripts/[^/]+/[^/]+/') to add a fallback (append || true) so grep's non-zero
exit doesn't break the step, ensuring CHANGED becomes an empty list instead of
causing the job to fail; update the pipeline around the CHANGED variable
assignment and keep the rest of the processing (sort -u | jq -R -s -c ...)
unchanged.In
@docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md:
- Around line 28-37: The conftest.py template currently combines imports on one
line; update the template so imports follow PEP8/ruff isort style by splitting
"import sys, os, pytest" into separate, ordered imports (import os; import sys;
import pytest) with appropriate blank lines, keeping the existing
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..")) and preserving
the try/except that sets HAS_PYOPENMS; modify the template's import block to
match the actual conftest.py pattern used in
scripts/metabolomics/metabolite_feature_detection/tests/conftest.py and ensure
symbols like HAS_PYOPENMS and sys.path.insert remain unchanged.
In
@scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py:
- Around line 53-56: The code computes max_ab = max(ab for _, ab in peaks) and
divides by it, which can raise ZeroDivisionError if all intensities are zero;
fix by checking if max_ab == 0 and in that case return the peaks with normalized
intensity 0.0 (e.g. return [(mz, 0.0) for mz, _ in peaks]) instead of performing
the division; update the branch that currently builds the normalized list (the
list comprehension using mz, ab in peaks) to handle this zero-max_ab case.
In
@scripts/metabolomics/mass_accuracy_calculator/mass_accuracy_calculator.py:
- Around line 37-55: Validate and reject non-positive charge values to avoid
divide-by-zero: in the importable helper theoretical_mz_from_sequence(sequence:
str, charge: int) add an explicit guard that raises a clear exception (e.g.,
ValueError) when charge <= 0, and apply the same validation to any other helper
functions that compute (mass + charge * PROTON) / charge; additionally enforce
and validate the CLI argument for --charge so it cannot accept 0 or negative
values (use argparse type/check or a custom validator) and return a helpful
message for users when an invalid charge is provided.
In
@scripts/metabolomics/metabolite_feature_detection/metabolite_feature_detection.py:
- Around line 118-122: The current output default logic (declared in
parser.add_argument("--output", ...)) incorrectly documents ".featureXML"
and uses a brittle string replace that fails for inputs without a .mzML
extension or with different casing; change the code that computes the default
output (where args.output is set/constructed around line 139) to use
pathlib.Path: derive output by taking Path(args.input), if path.suffix.lower()
== ".mzml" replace the suffix with "_metabolites.featureXML", otherwise append
"_metabolites.featureXML" to the stem; also update the parser help text to
reflect the actual default naming (e.g., "(default:
_metabolites.featureXML)").
In
@scripts/proteomics/feature_detection_proteomics/feature_detection_proteomics.py:
- Around line 17-22: The module currently calls sys.exit() in the module-level
try/except that imports pyopenms (the try: import pyopenms as oms except
ImportError: sys.exit(...)), which will terminate the interpreter on import;
instead either move the pyopenms import and dependency check into the main()
function so missing dependency only impacts CLI invocation, or replace the
sys.exit() with raising ImportError (e.g., re-raise or raise ImportError with
the same message) so importers/tests can catch it; update the import usage to
reference oms after the deferred import or ensure tests that import this module
can handle the ImportError from the module-level raise.
- Line 93: The default output path calculation for output_path (using
args.output or args.input.replace(".mzML", ".featureXML")) is case-sensitive and
can produce the same path as args.input (e.g., "sample.mzml" or no matching
suffix), risking overwrite; update the logic that sets output_path in
feature_detection_proteomics.py to perform a case-insensitive extension swap
(use os.path.splitext on args.input and append ".featureXML") and ensure the
resulting output_path is different from args.input (if identical, add a suffix
like ".featureXML" or ".featureXML.1" to avoid clobbering); reference the
output_path assignment where args.output and args.input are used.
In
@scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py:
- Around line 32-59: In peptide_masses, validate the charge parameter before
performing m/z calculations: in the function peptide_masses(sequence: str,
charge: int = 1) check that charge is a positive integer (e.g. if not
isinstance(charge, int) or charge <= 0) and raise a clear ValueError such as
"charge must be a positive integer" to prevent ZeroDivisionError when computing
mz_monoisotopic and mz_average using PROTON.
- Around line 22-27: Current module-level sys.exit on ImportError terminates the
whole test run; replace the immediate exit with deferred error handling by
capturing the optional import (set oms = None) and adding a helper function
_require_pyopenms() that attempts to import/validate pyopenms or raises
ImportError with the same install message; then call _require_pyopenms() at the
start of the public functions peptide_masses() and fragment_ions() so the
ImportError is raised at runtime and the `@requires_pyopenms` test decorator can
skip tests instead of killing the process.
In
@scripts/proteomics/protein_digest/protein_digest.py:
- Around line 38-44: The digest_protein function currently allows invalid
digestion params; add explicit validation at the top of digest_protein to ensure
missed_cleavages is a non-negative integer and min_length and max_length are
positive integers with min_length <= max_length (raise ValueError with clear
message on invalid input), and clamp or reject values outside sensible bounds;
apply the same validation logic to the other digestion functions that accept the
same parameters (any other functions that take missed_cleavages, min_length,
max_length in this module) so invalid inputs are caught before digestion logic
runs.
- Around line 62-64: The docstring describing the returned list-of-dicts is out
of sync: it mentions keys `start`, `end`, and `missed_cleavages` while the
actual payload contains `sequence`, `length`, and `monoisotopic_mass`. Update
the function's return docstring (the docstring block that currently lists
`sequence`, `start`, `end`, `monoisotopic_mass`, `missed_cleavages`) to
accurately describe the actual returned keys (`sequence`, `length`,
`monoisotopic_mass`) or alternatively modify the function that builds the dicts
so they include `start`, `end`, and `missed_cleavages`; ensure the docstring and
the returned dict structure (produced by the code that constructs each peptide
dict) match exactly.
In
@scripts/proteomics/spectrum_file_info/spectrum_file_info.py:
- Around line 4-5: The docstring/help claims support for "mzML (or mzXML)" but
the loader call MzMLFile().load(...) only handles mzML; update the code to make
the behavior match the docs by either changing the top-level help text to "mzML
only" or adding explicit mzXML handling: detect the input file extension inside
spectrum_file_info.py and call the appropriate loader (e.g., use
MzXMLFile().load(...) for .mzXML files) so that the loader selection (MzMLFile
vs MzXMLFile) aligns with the advertised formats.
- Around line 56-70: summarise_experiment currently skips spectra with zero
peaks which misaligns tic_per_spectrum and can leave mz_min/mz_max as inf/-inf;
change the loop that calls spec.get_peaks() so you always append a TIC value
(append 0.0 when intensities is empty) and only update mz_min/mz_max when peaks
exist, then after the loop replace any leftover inf/-inf mz bounds with a safe
default (e.g., (0.0, 0.0) or None) before returning; apply the same fix in the
duplicate block around the second occurrence (lines noted as 124-126) so
functions/methods summarise_experiment, spec.get_peaks(), mz_min, mz_max, and
tic_per_spectrum are updated accordingly.
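The summarise_experiment fix described just above can be sketched in pure Python, using plain (mz, intensity) lists as a stand-in for pyopenms spectra (a hypothetical model; the real function works on MSExperiment objects and returns more fields):

```python
# Pure-Python sketch of the requested behavior: every spectrum contributes a
# TIC entry (0.0 when empty), m/z bounds update only when peaks exist, and
# leftover inf/-inf bounds become None instead of leaking out.
def summarise(spectra):
    tic_per_spectrum = []
    mz_min, mz_max = float("inf"), float("-inf")
    for peaks in spectra:
        tic_per_spectrum.append(sum(i for _, i in peaks))  # 0 for empty spectra
        if peaks:  # only update bounds when peaks exist
            mz_min = min(mz_min, min(mz for mz, _ in peaks))
            mz_max = max(mz_max, max(mz for mz, _ in peaks))
    mz_range = (mz_min, mz_max) if mz_min <= mz_max else None
    return {
        "n_spectra": len(spectra),
        "tic_per_spectrum": tic_per_spectrum,
        "mz_range": mz_range,
    }
```

This keeps `tic_per_spectrum` aligned one-to-one with the input spectra, which is exactly what the suggested no-peaks regression test checks.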
Nitpick comments:
In @docs/superpowers/plans/2026-03-24-ai-contributor-skills.md:
- Around line 64-68: Update the fenced code block under "Step 3: Create
requirements.txt" that currently contains "pyopenms" so it includes a language
specifier (e.g., change ``` to ```txt or ```text) to improve markdown rendering
and satisfy linters; locate the fenced block associated with the "Step 3: Create
requirements.txt" heading and add the specifier to the opening backticks.
In
@scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py:
- Around line 82-89: The loop that matches theoretical peaks to observed picks
the first observed peak within mz_tolerance (variables tmz/tint and omz/oint
with the early break), which can select a non-closest match when multiple
observed peaks fall inside the window; change the logic in the matching block so
you search all observed peaks for the minimum abs(omz - tmz) and only accept it
if that minimum <= mz_tolerance (leave matched = 0.0 if none found), then append
tint and the chosen matched intensity to theo_vec and obs_vec — update the
matching code that currently sets matched and uses break to instead compute and
use the closest match.In
@scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py:
- Around line 9-40: Add tests that pass invalid charges (0 and a negative int)
to the public APIs theoretical_mz_from_sequence and theoretical_mz_from_formula
to guard against division-by-zero/regression; for each function call with charge
0 and charge -1 assert that the call raises a clear exception (e.g. ValueError)
using pytest.raises, and include both sequence- and formula-based paths
(theoretical_mz_from_sequence("PEPTIDEK", 0),
theoretical_mz_from_sequence("PEPTIDEK", -1),
theoretical_mz_from_formula("C6H12O6", 0),
theoretical_mz_from_formula("C6H12O6", -1)) so the test suite ensures
invalid-charge handling is enforced.
In
@scripts/proteomics/feature_detection_proteomics/requirements.txt:
- Line 1: The requirements file currently lists an unpinned dependency
"pyopenms"; update that entry in requirements.txt to a validated version or
range (for example pin to the exact tested version or a bounded range like=X.Y.Z,<A.B.C) so CI and local environments remain reproducible; after changing
the "pyopenms" line, run the proteomics feature_detection_proteomics tests or
the project's dependency verification job to confirm the chosen version works.
In
@scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py:
- Around line 32-34: The test currently only checks that the output_path file
exists; instead, load the written featureXML into a new oms.FeatureMap (use
oms.FeatureXMLFile().load or the appropriate loader) and assert the loaded map
is an oms.FeatureMap and that its feature count matches the returned fm (compare
fm.size() or len(fm) to loaded_map.size() or len(loaded_map)); replace the
assert os.path.exists(output_path) with these checks so the persisted content is
validated against the returned fm from detect_features.
In
@scripts/proteomics/peptide_mass_calculator/requirements.txt:
- Line 1: Replace the unbounded pyopenms entry in requirements.txt with a
pinned, tested version range (or exact version) to ensure reproducible CI;
update the single line "pyopenms" to a constrained specification such as a
specific version or a range that your tests validate (for example,
pyopenms>=2.7.0,<2.8.0) and ensure this matches the version used in your CI/test
matrix.
In
@scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py:
- Around line 25-57: Add a regression test that builds an MSExperiment
containing spectra whose get_peaks() returns empty (i.e., spectra with no
peaks), call summarise_experiment on it, then assert the experiment-level
counters still report the correct number of spectra (summary["n_spectra"] == n)
and that len(summary["tic_per_spectrum"]) == n to ensure TIC is recorded per
spectrum, and verify the mz_range fallback is handled by asserting
summary["mz_range"] is falsy (e.g., None or an empty/None tuple) when no peaks
exist; reference summarise_experiment, tic_per_spectrum, get_peaks(), and
mz_range to locate the behavior to test.

---

ℹ️ Review info

⚙️ Run configuration
**Configuration used**: Organization UI
**Review profile**: CHILL
**Plan**: Pro
**Run ID**: `99ad1eec-e6f7-4632-9c4b-a6b7038cfb99`

📥 Commits
Reviewing files that changed from the base of the PR and between a7ade618be33016aa93f8fed05ea900091fbf767 and 6ca1237b84940522ea34f992d9d1f1c108cb0b5e.

📒 Files selected for processing (44)
* `.claude/skills/contribute-script.md`
* `.claude/skills/validate-script.md`
* `.github/workflows/validate.yml`
* `AGENTS.md`
* `CLAUDE.md`
* `README.md`
* `docs/superpowers/plans/2026-03-24-ai-contributor-skills.md`
* `docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md`
* `ruff.toml`
* `scripts/metabolomics/isotope_pattern_matcher/README.md`
* `scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py`
* `scripts/metabolomics/isotope_pattern_matcher/requirements.txt`
* `scripts/metabolomics/isotope_pattern_matcher/tests/conftest.py`
* `scripts/metabolomics/isotope_pattern_matcher/tests/test_isotope_pattern_matcher.py`
* `scripts/metabolomics/mass_accuracy_calculator/README.md`
* `scripts/metabolomics/mass_accuracy_calculator/mass_accuracy_calculator.py`
* `scripts/metabolomics/mass_accuracy_calculator/requirements.txt`
* `scripts/metabolomics/mass_accuracy_calculator/tests/conftest.py`
* `scripts/metabolomics/mass_accuracy_calculator/tests/test_mass_accuracy_calculator.py`
* `scripts/metabolomics/metabolite_feature_detection/README.md`
* `scripts/metabolomics/metabolite_feature_detection/metabolite_feature_detection.py`
* `scripts/metabolomics/metabolite_feature_detection/requirements.txt`
* `scripts/metabolomics/metabolite_feature_detection/tests/conftest.py`
* `scripts/metabolomics/metabolite_feature_detection/tests/test_metabolite_feature_detection.py`
* `scripts/proteomics/feature_detection_proteomics/README.md`
* `scripts/proteomics/feature_detection_proteomics/feature_detection_proteomics.py`
* `scripts/proteomics/feature_detection_proteomics/requirements.txt`
* `scripts/proteomics/feature_detection_proteomics/tests/conftest.py`
* `scripts/proteomics/feature_detection_proteomics/tests/test_feature_detection_proteomics.py`
* `scripts/proteomics/peptide_mass_calculator/README.md`
* `scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py`
* `scripts/proteomics/peptide_mass_calculator/requirements.txt`
* `scripts/proteomics/peptide_mass_calculator/tests/conftest.py`
* `scripts/proteomics/peptide_mass_calculator/tests/test_peptide_mass_calculator.py`
* `scripts/proteomics/protein_digest/README.md`
* `scripts/proteomics/protein_digest/protein_digest.py`
* `scripts/proteomics/protein_digest/requirements.txt`
* `scripts/proteomics/protein_digest/tests/conftest.py`
* `scripts/proteomics/protein_digest/tests/test_protein_digest.py`
* `scripts/proteomics/spectrum_file_info/README.md`
* `scripts/proteomics/spectrum_file_info/requirements.txt`
* `scripts/proteomics/spectrum_file_info/spectrum_file_info.py`
* `scripts/proteomics/spectrum_file_info/tests/conftest.py`
* `scripts/proteomics/spectrum_file_info/tests/test_spectrum_file_info.py`
```python
try:
    import pyopenms as oms
except ImportError:
    sys.exit("pyopenms is required. Install it with: pip install pyopenms")
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Verify whether script modules using sys.exit also import sys.
# Expected: no output (or review any printed files).
fd -e py . scripts | while read -r f; do
  if rg -n '\bsys\.exit\s*\(' "$f" >/dev/null; then
    if ! rg -n '^\s*import\s+sys\b|^\s*from\s+sys\s+import\b' "$f" >/dev/null; then
      echo "Missing import sys: $f"
    fi
  fi
done
```

Repository: OpenMS/agentomics
Length of output: 43

🏁 Script executed:

```bash
cat -n .claude/skills/contribute-script.md | sed -n '70,85p'
```

Repository: OpenMS/agentomics
Length of output: 773

🏁 Script executed:

```bash
cat -n .claude/skills/contribute-script.md | sed -n '60,90p'
```

Repository: OpenMS/agentomics
Length of output: 1231
Add import sys to the pyopenms import-guard pattern.
The documented pattern at lines 75–80 uses sys.exit() without showing import sys. Contributors copying this exact snippet will encounter a NameError on the import failure path.
Suggested doc fix

````diff
 - pyopenms import guard:
   ```python
+  import sys
   try:
       import pyopenms as oms
   except ImportError:
       sys.exit("pyopenms is required. Install it with: pip install pyopenms")
   ```
````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/contribute-script.md around lines 75 - 80, The snippet uses
sys.exit in the ImportError handler but never imports sys, which will raise
NameError on failure; add an import sys at the top of the block so the
try/except around importing pyopenms (import pyopenms as oms) can call
sys.exit(...) safely when ImportError is caught.
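To sanity-check the corrected pattern, here is a self-contained sketch of the same guard; the `require` helper and install hint are illustrative, not part of the repo:

```python
import sys


def require(module_name: str, install_hint: str):
    """Import-guard pattern from the skill doc: exit with an actionable hint
    when the dependency is missing, instead of raising a bare ImportError.
    Calling sys.exit here only works because sys is imported at module top."""
    try:
        return __import__(module_name)
    except ImportError:
        sys.exit(f"{module_name} is required. Install it with: {install_hint}")


math_mod = require("math", "pip install <package>")  # stdlib: always succeeds
print(math_mod.sqrt(9.0))  # → 3.0
```

Passing a missing module name raises `SystemExit` carrying the install hint, which is exactly the behavior the snippet loses when `import sys` is omitted.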
```bash
SCRIPT_DIR=<path-to-script-directory>
VENV_DIR=$(mktemp -d)
python -m venv "$VENV_DIR"
"$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
"$VENV_DIR/bin/python" -m pip install pytest ruff
"$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
rm -rf "$VENV_DIR"
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Demonstrate cleanup behavior with and without trap.
# Expected: without trap, temp dir remains after failure; with trap, it is removed.
set -euo pipefail
echo "Without trap:"
if bash -c 'set -e; V=$(mktemp -d); false; rm -rf "$V"' 2>/dev/null; then :; fi
# Can't read V from subshell directly; show equivalent reproducible pattern:
bash -c 'set -e; V=$(mktemp -d); echo "$V" > /tmp/venv_path_no_trap; false; rm -rf "$V"' || true
V1=$(cat /tmp/venv_path_no_trap)
if [ -d "$V1" ]; then echo "Leaked: $V1"; rm -rf "$V1"; fi
echo "With trap:"
bash -c 'set -e; V=$(mktemp -d); echo "$V" > /tmp/venv_path_trap; trap "rm -rf \"$V\"" EXIT; false' || true
V2=$(cat /tmp/venv_path_trap)
if [ -d "$V2" ]; then echo "Unexpected leak: $V2"; else echo "Cleaned as expected"; fi
```

Repository: OpenMS/agentomics
Length of output: 133

🏁 Script executed:

```bash
# Check if the file exists and read the relevant lines
if [ -f ".claude/skills/validate-script.md" ]; then
  echo "=== File exists, reading lines 15-40 for context ==="
  sed -n '15,40p' ".claude/skills/validate-script.md" | cat -n
else
  echo "File not found at .claude/skills/validate-script.md"
  # Search for similar files
  find . -name "*validate*script*" -o -name "*.md" -path "*claude*" 2>/dev/null | head -20
fi
```

Repository: OpenMS/agentomics
Length of output: 1064

🏁 Script executed:

```bash
# Get actual line numbers and check for line 34 reference
wc -l ".claude/skills/validate-script.md"
echo "=== Lines 20-35 (to see full context and line 34) ==="
sed -n '20,35p' ".claude/skills/validate-script.md" | cat -n
```

Repository: OpenMS/agentomics
Length of output: 886
Add trap-based cleanup to ensure venv removal on failure.
The current code does not guarantee cleanup if any command fails before the final `rm -rf`. Step 5 (line 34) explicitly requires cleanup even when validation fails. Add `trap 'rm -rf "$VENV_DIR"' EXIT` after `VENV_DIR=$(mktemp -d)` and remove the final `rm -rf "$VENV_DIR"` line.

Suggested command block update:

```diff
 SCRIPT_DIR=<path-to-script-directory>
 VENV_DIR=$(mktemp -d)
+trap 'rm -rf "$VENV_DIR"' EXIT
 python -m venv "$VENV_DIR"
 "$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
 "$VENV_DIR/bin/python" -m pip install pytest ruff
 "$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
 PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
-rm -rf "$VENV_DIR"
```

</details>
Also applies to: 34-34
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.
In @.claude/skills/validate-script.md around lines 21 - 30, Add a trap-based
cleanup to guarantee removal of the temporary virtualenv: immediately after
VENV_DIR=$(mktemp -d) register trap 'rm -rf "$VENV_DIR"' EXIT (so it runs on any
exit/failure) and delete the final explicit rm -rf "$VENV_DIR" line at the end;
ensure the trap references the same VENV_DIR variable and uses the EXIT signal
so the venv is removed on both success and failure.
</details>
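The difference the trap makes can be reproduced in a few lines; in this sketch the inner `bash -ec` stands in for a validation step failing under `set -e`, and the probe file is demo bookkeeping only:

```shell
#!/usr/bin/env bash
# The trap is registered immediately after mktemp, so the temp dir is removed
# on every exit path -- including the simulated failure (`false`).
probe=$(mktemp)
bash -ec 'V=$(mktemp -d); echo "$V" > "$1"; trap "rm -rf \"$V\"" EXIT; false' _ "$probe" || true
V=$(cat "$probe"); rm -f "$probe"
if [ -d "$V" ]; then echo "leaked"; else echo "cleaned"; fi
```

Without the trap, `false` aborts the inner shell before its final cleanup line would run and the directory is left behind.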
<!-- This is an auto-generated comment by CodeRabbit -->
```bash
CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
  | grep -oP 'scripts/[^/]+/[^/]+/' \
  | sort -u \
  | jq -R -s -c 'split("\n") | map(select(length > 0))')
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Reproduce grep no-match behavior under bash -e in a command-substitution pipeline.
# Expected: first block fails; second block with `|| true` succeeds.
set -euo pipefail
echo "a/b/c.txt" > /tmp/changed_paths.txt
echo "Case 1 (without || true) - should fail:"
if bash -e -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" | sort -u); echo "$X"'; then
  echo "Unexpected success"
else
  echo "Failed as expected"
fi
echo "Case 2 (with || true) - should succeed:"
bash -e -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" || true); echo "X=${X:-<empty>}"'
```

Repository: OpenMS/agentomics
Length of output: 170

🏁 Script executed:

```bash
cat -n .github/workflows/validate.yml | head -50
```

Repository: OpenMS/agentomics
Length of output: 1897

🏁 Script executed:

```bash
git ls-files '.github/workflows/'
```

Repository: OpenMS/agentomics
Length of output: 91

🌐 Web query:
GitHub Actions bash run shell set -e default behavior

💡 Result:
In GitHub Actions, for `run` steps on Linux/macOS runners (non-Windows), the default shell behavior when no shell is explicitly specified is to use `bash -e {0}` if bash is available, or `sh -e {0}` otherwise. The `-e` flag enables `set -e` semantics, causing the script to exit immediately if any command returns a non-zero exit code (fail-fast behavior). When `shell: bash` is explicitly specified, it uses `bash --noprofile --norc -eo pipefail {0}`, which includes `set -e`, `set -o pipefail` (fail on pipeline failure), and suppresses profile/rc loading. This is the default to enforce fail-fast behavior in CI scripts, ensuring steps fail early on errors. To disable, use `shell: bash {0}` or `set +e` in the script. On Windows, the default is PowerShell, not bash.

Citations:
- 1: https://stackoverflow.com/questions/75419587/does-a-github-action-step-use-set-e-semantics-by-default
- 2: https://stackoverflow.com/questions/73066461/github-actions-why-an-intermediate-command-failure-in-shell-script-would-cause/73069669
- 3: https://github.com/actions/runner/blob/main/docs/adrs/0277-run-action-shell-options.md
- 4: https://docs.github.com/actions/using-workflows/workflow-syntax-for-github-actions
- 5: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions

🏁 Script executed:

```bash
fd -t f '.github/workflows/' -name '*.yml' -o -name '*.yaml' | xargs grep -l 'git diff' | head -5
```

Repository: OpenMS/agentomics
Length of output: 291

🏁 Script executed:

```bash
#!/bin/bash
# Test grep behavior with pipefail (which GitHub Actions uses by default)
echo "a/b/c.txt" > /tmp/changed_paths.txt
echo "Case 1 (bash -eo pipefail, without || true) - should fail:"
if bash -eo pipefail -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" | sort -u); echo "$X"'; then
  echo "Unexpected success"
else
  echo "Failed as expected (exit code: $?)"
fi
echo ""
echo "Case 2 (bash -eo pipefail, with || true) - should succeed:"
if bash -eo pipefail -c 'X=$(cat /tmp/changed_paths.txt | grep -oP "scripts/[^/]+/[^/]+/" || true | sort -u); echo "X=${X:-<empty>}"'; then
  echo "Succeeded as expected"
else
  echo "Unexpected failure"
fi
```

Repository: OpenMS/agentomics
Length of output: 245

🏁 Script executed:

```bash
fd '.github/workflows' -type f
```

Repository: OpenMS/agentomics
Length of output: 231

🏁 Script executed:

```bash
find .github/workflows -type f -name '*.yml' -o -name '*.yaml'
```

Repository: OpenMS/agentomics
Length of output: 91

🏁 Script executed:

```bash
grep -r "grep -oP" .github/workflows/
```

Repository: OpenMS/agentomics
Length of output: 139
Add a fallback to grep to prevent pipeline failure on zero matches.
In GitHub Actions, the default shell options (`bash -eo pipefail`) fail the entire step if grep finds no matches and returns exit code 1. When no script directories change, this pipeline fails before reaching the error-handling check at line 29. Add a `|| true` fallback to the grep stage, grouped with braces so the pipeline itself stays intact (a bare `|| true` inserted between stages would split the pipeline instead).

Suggested fix:

```diff
 CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
-  | grep -oP 'scripts/[^/]+/[^/]+/' \
+  | { grep -oP 'scripts/[^/]+/[^/]+/' || true; } \
   | sort -u \
   | jq -R -s -c 'split("\n") | map(select(length > 0))')
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/validate.yml around lines 24 - 27, The CHANGED assignment
pipeline can fail when grep finds no matches; modify the command that pipes to
grep (the part using grep -oP 'scripts/[^/]+/[^/]+/') to add a fallback (append
|| true) so grep's non-zero exit doesn't break the step, ensuring CHANGED
becomes an empty list instead of causing the job to fail; update the pipeline
around the CHANGED variable assignment and keep the rest of the processing (sort
-u | jq -R -s -c ...) unchanged.
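A minimal reproduction of the grouped fallback under pipefail; the sample path is made up, and `-oE` is used instead of `-oP` so the demo does not require GNU grep's PCRE support:

```shell
#!/usr/bin/env bash
set -o pipefail
# No input line matches scripts/<category>/<name>/, so grep exits 1; the brace
# group converts that to 0 while still feeding (empty) output to sort.
CHANGED=$(printf 'docs/readme.md\n' \
  | { grep -oE 'scripts/[^/]+/[^/]+/' || true; } \
  | sort -u)
echo "exit=$? changed=[${CHANGED}]"  # prints: exit=0 changed=[]
```

With a matching path such as `scripts/proteomics/protein_digest/protein_digest.py`, the same pipeline emits the directory prefix as before; the group only changes the no-match case.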
```python
import sys, os, pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
try:
    import pyopenms
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False
requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
```
Inconsistency: conftest.py template differs from actual implementation.
The template shows `import sys, os, pytest` on a single line (line 29), which violates PEP 8 and the ruff isort rules configured in ruff.toml. The actual implementation in `scripts/metabolomics/metabolite_feature_detection/tests/conftest.py` correctly separates imports.
Update the template to match the actual implementation pattern:

📝 Proposed fix

```diff
-import sys, os, pytest
+import os
+import sys
+
+import pytest
+
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
import os
import sys

import pytest

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
try:
    import pyopenms
    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False
requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/superpowers/specs/2026-03-24-ai-contributor-skills-design.md` around
lines 28 - 37, The conftest.py template currently combines imports on one line;
update the template so imports follow PEP8/ruff isort style by splitting "import
sys, os, pytest" into separate, ordered imports (import os; import sys; import
pytest) with appropriate blank lines, keeping the existing sys.path.insert(0,
os.path.join(os.path.dirname(__file__), "..")) and preserving the try/except
that sets HAS_PYOPENMS; modify the template's import block to match the actual
conftest.py pattern used in
scripts/metabolomics/metabolite_feature_detection/tests/conftest.py and ensure
symbols like HAS_PYOPENMS and sys.path.insert remain unchanged.
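The availability check the template performs with try/except can also be expressed with `importlib.util.find_spec`; a dependency-free sketch of the same signal (the helper name is illustrative):

```python
import importlib.util


def optional_dep_available(name: str) -> bool:
    """True when `name` can be imported -- the same information the conftest
    template stores in HAS_PYOPENMS via try/except ImportError."""
    return importlib.util.find_spec(name) is not None


# Stand-ins for the real check: json is stdlib, the other name should not exist.
print(optional_dep_available("json"))                # → True
print(optional_dep_available("no_such_module_xyz"))  # → False
```

The try/except form has the advantage of also catching modules that exist but fail at import time, which is why the template keeps it.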
```python
if not peaks:
    return []
max_ab = max(ab for _, ab in peaks)
return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```
Potential division by zero when all peak intensities are zero.
If `max_ab` is 0 (all peaks have zero intensity), line 56 will raise a `ZeroDivisionError`. While rare, this could occur with unusual formula inputs.

🛡️ Proposed fix

```diff
 if not peaks:
     return []
 max_ab = max(ab for _, ab in peaks)
+if max_ab == 0:
+    return [(mz, 0.0) for mz, _ in peaks]
 return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
if not peaks:
    return []
max_ab = max(ab for _, ab in peaks)
if max_ab == 0:
    return [(mz, 0.0) for mz, _ in peaks]
return [(mz, ab / max_ab * 100) for mz, ab in peaks]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py`
around lines 53 - 56, The code computes max_ab = max(ab for _, ab in peaks) and
divides by it, which can raise ZeroDivisionError if all intensities are zero;
fix by checking if max_ab == 0 and in that case return the peaks with normalized
intensity 0.0 (e.g. return [(mz, 0.0) for mz, _ in peaks]) instead of performing
the division; update the branch that currently builds the normalized list (the
list comprehension using mz, ab in peaks) to handle this zero-max_ab case.
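The guarded normalization is easy to exercise standalone; the wrapper name `normalize_peaks` is illustrative (the repo code is inline in `isotope_pattern_matcher.py`):

```python
def normalize_peaks(peaks):
    """Scale intensities so the base peak is 100, returning 0.0 for every
    peak when all intensities are zero (the suggested guard)."""
    if not peaks:
        return []
    max_ab = max(ab for _, ab in peaks)
    if max_ab == 0:
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, ab / max_ab * 100) for mz, ab in peaks]


print(normalize_peaks([(100.0, 50.0), (101.0, 200.0)]))  # base peak scales to 100.0
print(normalize_peaks([(100.0, 0.0), (101.0, 0.0)]))     # all-zero input: no crash
```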
```python
def peptide_masses(sequence: str, charge: int = 1) -> dict:
    """Return monoisotopic and average masses for the given peptide sequence.

    Parameters
    ----------
    sequence:
        Amino acid sequence, optionally with bracket-enclosed modifications,
        e.g. ``"PEPTM[147]IDEK"``.
    charge:
        Desired charge state for m/z calculation (default 1).

    Returns
    -------
    dict
        Dictionary with keys ``monoisotopic_mass``, ``average_mass``,
        ``mz_monoisotopic``, and ``mz_average``.
    """
    aa_seq = oms.AASequence.fromString(sequence)
    mono = aa_seq.getMonoWeight()
    avg = aa_seq.getAverageWeight()
    return {
        "sequence": sequence,
        "charge": charge,
        "monoisotopic_mass": mono,
        "average_mass": avg,
        "mz_monoisotopic": (mono + charge * PROTON) / charge,
        "mz_average": (avg + charge * PROTON) / charge,
    }
```
Validate charge before m/z calculation.
Lines 57–58 divide by `charge` without bounds checks; `--charge 0` will crash with `ZeroDivisionError`.

Suggested fix:

```diff
 def peptide_masses(sequence: str, charge: int = 1) -> dict:
 @@
-    aa_seq = oms.AASequence.fromString(sequence)
+    if charge <= 0:
+        raise ValueError("charge must be >= 1")
+    aa_seq = oms.AASequence.fromString(sequence)
```
Verify each finding against the current code and only fix it if needed.
In `@scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py` around
lines 32 - 59, In peptide_masses, validate the charge parameter before
performing m/z calculations: in the function peptide_masses(sequence: str,
charge: int = 1) check that charge is a positive integer (e.g. if not
isinstance(charge, int) or charge <= 0) and raise a clear ValueError such as
"charge must be a positive integer" to prevent ZeroDivisionError when computing
mz_monoisotopic and mz_average using PROTON.
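The m/z arithmetic plus the guard, isolated from pyopenms; the `PROTON` value below is the standard proton mass in Da and is assumed to match the script's own constant:

```python
PROTON = 1.007276466  # proton mass in Da (assumed to match the script's constant)


def mz_from_mass(mass: float, charge: int) -> float:
    """m/z = (M + z * proton) / z, rejecting non-positive charge states so
    `--charge 0` fails with a clear message instead of ZeroDivisionError."""
    if charge <= 0:
        raise ValueError("charge must be >= 1")
    return (mass + charge * PROTON) / charge


print(round(mz_from_mass(1000.0, 2), 4))  # → 501.0073
```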
```python
def digest_protein(
    sequence: str,
    enzyme: str = "Trypsin",
    missed_cleavages: int = 0,
    min_length: int = 6,
    max_length: int = 40,
) -> list:
```
Missing boundary validation for digestion parameters.
`missed_cleavages`, `min_length`, and `max_length` accept invalid values (e.g., negatives or `min_length > max_length`) and are passed through directly. Add explicit validation before digestion.

Suggested fix:

```diff
 def digest_protein(
 @@
 ) -> list:
 @@
+    if missed_cleavages < 0:
+        raise ValueError("missed_cleavages must be >= 0")
+    if min_length < 1 or max_length < 1:
+        raise ValueError("min_length and max_length must be >= 1")
+    if min_length > max_length:
+        raise ValueError("min_length must be <= max_length")
+
     protein_seq = oms.AASequence.fromString(sequence)
 @@
     args = parser.parse_args()
 @@
     if not args.sequence:
         parser.error("--sequence is required unless --list-enzymes is used.")
+    if args.missed_cleavages < 0:
+        parser.error("--missed-cleavages must be >= 0.")
+    if args.min_length < 1 or args.max_length < 1:
+        parser.error("--min-length and --max-length must be >= 1.")
+    if args.min_length > args.max_length:
+        parser.error("--min-length must be <= --max-length.")
```

Also applies to: 66-73, 101-121, 137-146
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/proteomics/protein_digest/protein_digest.py` around lines 38 - 44,
The digest_protein function currently allows invalid digestion params; add
explicit validation at the top of digest_protein to ensure missed_cleavages is a
non-negative integer and min_length and max_length are positive integers with
min_length <= max_length (raise ValueError with clear message on invalid input),
and clamp or reject values outside sensible bounds; apply the same validation
logic to the other digestion functions that accept the same parameters (any
other functions that take missed_cleavages, min_length, max_length in this
module) so invalid inputs are caught before digestion logic runs.
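The suggested guards can live in one standalone helper (the name is illustrative), which also makes the boundary cases easy to unit-test directly:

```python
def validate_digest_params(missed_cleavages: int, min_length: int, max_length: int) -> None:
    """Raise ValueError for parameter combinations the digester cannot honor."""
    if missed_cleavages < 0:
        raise ValueError("missed_cleavages must be >= 0")
    if min_length < 1 or max_length < 1:
        raise ValueError("min_length and max_length must be >= 1")
    if min_length > max_length:
        raise ValueError("min_length must be <= max_length")


validate_digest_params(2, 6, 40)  # defaults-style input: no exception raised
```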
```python
    list of dict
        Each entry contains ``sequence``, ``start``, ``end``,
        ``monoisotopic_mass``, and ``missed_cleavages``.
```
Return docstring is out of sync with actual payload.
Lines 63–64 document `start`, `end`, and `missed_cleavages`, but the returned dicts only contain `sequence`, `length`, and `monoisotopic_mass`.
Also applies to: 79-83
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/proteomics/protein_digest/protein_digest.py` around lines 62 - 64,
The docstring describing the returned list-of-dicts is out of sync: it mentions
keys `start`, `end`, and `missed_cleavages` while the actual payload contains
`sequence`, `length`, and `monoisotopic_mass`. Update the function's return
docstring (the docstring block that currently lists `sequence`, `start`, `end`,
`monoisotopic_mass`, `missed_cleavages`) to accurately describe the actual
returned keys (`sequence`, `length`, `monoisotopic_mass`) or alternatively
modify the function that builds the dicts so they include `start`, `end`, and
`missed_cleavages`; ensure the docstring and the returned dict structure
(produced by the code that constructs each peptide dict) match exactly.
```
Read an mzML (or mzXML) file and print a summary of its contents:
number of spectra, MS levels, retention time range, m/z range,
```
Input-format claim is broader than the implemented loader.
Line 4 advertises mzML/mzXML, but line 86 only uses `MzMLFile().load(...)`. Please either narrow the docs/help text to mzML-only or add explicit mzXML handling.
Also applies to: 73-87
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/proteomics/spectrum_file_info/spectrum_file_info.py` around lines 4 -
5, The docstring/help claims support for "mzML (or mzXML)" but the loader call
MzMLFile().load(...) only handles mzML; update the code to make the behavior
match the docs by either changing the top-level help text to "mzML only" or
adding explicit mzXML handling: detect the input file extension inside
spectrum_file_info.py and call the appropriate loader (e.g., use
MzXMLFile().load(...) for .mzXML files) so that the loader selection (MzMLFile
vs MzXMLFile) aligns with the advertised formats.
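If mzXML support is added rather than the docs narrowed, the dispatch can key off the file extension. A sketch of that selection logic: `MzMLFile` and `MzXMLFile` are real pyopenms classes, but they are returned as strings here so the example runs without pyopenms installed:

```python
from pathlib import Path


def loader_for(path: str) -> str:
    """Pick the pyopenms loader class name from the file extension
    (case-insensitive), rejecting anything else up front."""
    suffix = Path(path).suffix.lower()
    if suffix == ".mzml":
        return "MzMLFile"
    if suffix == ".mzxml":
        return "MzXMLFile"
    raise ValueError(f"unsupported input format: {suffix or path}")


print(loader_for("run01.mzML"))   # → MzMLFile
print(loader_for("run01.mzXML"))  # → MzXMLFile
```

In the real script, the returned choice would be replaced by calling `oms.MzMLFile().load(...)` or `oms.MzXMLFile().load(...)` directly.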
```python
        mzs, intensities = spec.get_peaks()
        if len(mzs) > 0:
            mz_min = min(mz_min, float(mzs.min()))
            mz_max = max(mz_max, float(mzs.max()))
            tic_values.append(float(intensities.sum()))

    return {
        "n_spectra": len(spectra),
        "ms_levels": ms_levels,
        "rt_range_sec": (rt_min, rt_max),
        "mz_range": (mz_min, mz_max),
        "tic_total": sum(tic_values),
        "tic_max": max(tic_values) if tic_values else 0.0,
        "tic_per_spectrum": tic_values,
    }
```
`summarise_experiment` drops zero-peak spectra and can emit invalid m/z ranges.
At line 60, TIC is appended only when peaks exist, so `tic_per_spectrum` may not align with spectrum count/indexing. Also, if all spectra have no peaks, line 66 returns `inf`/`-inf` for `mz_range`, which is misleading in CLI output.

Suggested fix:

```diff
     mz_min = float("inf")
     mz_max = float("-inf")
+    has_peak_data = False
 @@
         mzs, intensities = spec.get_peaks()
-        if len(mzs) > 0:
+        tic_values.append(float(intensities.sum()))
+        if len(mzs) > 0:
+            has_peak_data = True
             mz_min = min(mz_min, float(mzs.min()))
             mz_max = max(mz_max, float(mzs.max()))
-            tic_values.append(float(intensities.sum()))
 @@
+    mz_range = (mz_min, mz_max) if has_peak_data else (None, None)
     return {
 @@
-        "mz_range": (mz_min, mz_max),
+        "mz_range": mz_range,
 @@
-    mz_lo, mz_hi = summary["mz_range"]
-    print(f"{'m/z range':<22}: {mz_lo:.4f} – {mz_hi:.4f}")
+    mz_lo, mz_hi = summary["mz_range"]
+    if mz_lo is None:
+        print(f"{'m/z range':<22}: n/a (no centroid/profile peaks)")
+    else:
+        print(f"{'m/z range':<22}: {mz_lo:.4f} – {mz_hi:.4f}")
```

Also applies to: 124-126
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/proteomics/spectrum_file_info/spectrum_file_info.py` around lines 56
- 70, summarise_experiment currently skips spectra with zero peaks which
misaligns tic_per_spectrum and can leave mz_min/mz_max as inf/-inf; change the
loop that calls spec.get_peaks() so you always append a TIC value (append 0.0
when intensities is empty) and only update mz_min/mz_max when peaks exist, then
after the loop replace any leftover inf/-inf mz bounds with a safe default
(e.g., (0.0, 0.0) or None) before returning; apply the same fix in the duplicate
block around the second occurrence (lines noted as 124-126) so functions/methods
summarise_experiment, spec.get_peaks(), mz_min, mz_max, and tic_per_spectrum are
updated accordingly.
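The fixed aggregation logic in miniature, with spectra as plain `(mzs, intensities)` tuples instead of pyopenms objects (all names here are illustrative):

```python
def summarize(spectra):
    """One TIC entry per spectrum (0.0 for empty ones) and a (None, None)
    m/z range when no spectrum contributed peaks."""
    mz_min, mz_max = float("inf"), float("-inf")
    has_peak_data = False
    tic_values = []
    for mzs, intensities in spectra:
        tic_values.append(float(sum(intensities)))  # always aligned with spectra
        if mzs:
            has_peak_data = True
            mz_min = min(mz_min, min(mzs))
            mz_max = max(mz_max, max(mzs))
    return {
        "mz_range": (mz_min, mz_max) if has_peak_data else (None, None),
        "tic_per_spectrum": tic_values,
    }


print(summarize([([100.0, 200.0], [1.0, 2.0]), ([], [])]))
# → {'mz_range': (100.0, 200.0), 'tic_per_spectrum': [3.0, 0.0]}
```

The key property is that `len(tic_per_spectrum)` always equals the number of spectra, and an all-empty input yields `(None, None)` rather than `inf`/`-inf`.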
Summary

- Per-script directory structure: each script gets its own `requirements.txt`, `README.md`, `tests/conftest.py`, and test file (28 tests total, all passing)
- Add `ruff.toml` (E/F/W/I rules, line-length 120) and ensure all scripts pass lint
- Add Claude Code skills: `contribute-script` (guided new tool creation) and `validate-script` (isolated venv validation)
- Add `AGENTS.md`: platform-agnostic contributor guide for any AI agent (Copilot, Cursor, Gemini, etc.)
- Add GitHub Actions CI (`.github/workflows/validate.yml`) that detects changed scripts on PR and validates each in a parallel isolated venv
- Update `CLAUDE.md` and `README.md` to reflect the new structure
- Include example scripts `feature_detection_proteomics` and `metabolite_feature_detection`

Test plan

- `ruff check` with zero errors
- `PYTHONPATH=<script> python -m pytest <script>/tests/ -v` for each script under `scripts/`
- `contribute-script` skill works end-to-end by adding a new tool

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Chores