OpenMS · ypriverol · Mar 24, 2026
diff --git a/.claude/skills/contribute-script.md b/.claude/skills/contribute-script.md
@@ -0,0 +1,109 @@
+---
+name: contribute-script
+description: Guide creation of a new pyopenms script contribution — scaffolding through validation
+---
+
+# Contribute Script
+
+Guide an AI agent through creating a new pyopenms CLI tool for the agentomics repo. Follow every step — this is a rigid skill.
+
+## Prerequisites
+
+Read `AGENTS.md` in the repo root for the full contributor guide and code patterns.
+
+## Steps
+
+### 1. Understand the tool
+
+Ask the user:
+- What does this tool do? What pyopenms functionality does it use?
+- What gap in OpenMS/pyopenms does it fill?
+
+### 2. Determine the domain
+
+Ask: Is this a **proteomics** or **metabolomics** tool? If neither fits, discuss whether a new domain directory is needed.
+
+### 3. Pick a name
+
+Choose a descriptive snake_case name for the tool (e.g. `peptide_mass_calculator`, `isotope_pattern_matcher`). Confirm with the user.
+
+### 4. Create a feature branch
+
+```bash
+git checkout -b add/<tool_name>
+```
+
+### 5. Scaffold the directory
+
+```bash
+mkdir -p scripts/<domain>/<tool_name>/tests
+```
+
+Create these files:
+
+**`requirements.txt`:**
+```
+pyopenms
+```
+Add any additional dependencies the script needs (one per line, no version pins).
+
+**`tests/conftest.py`:**
+```python
+import sys
+import os
+
+import pytest
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+
+try:
+    import pyopenms  # noqa: F401
+
+    HAS_PYOPENMS = True
+except ImportError:
+    HAS_PYOPENMS = False
+
+requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
+```
+
+### 6. Write the script
+
+Create `scripts/<domain>/<tool_name>/<tool_name>.py` following these patterns:
+
+- Module-level docstring with description, supported features, and CLI usage examples
+- pyopenms import guard (relies on a module-level `import sys`):
+  ```python
+  try:
+      import pyopenms as oms
+  except ImportError:
+      sys.exit("pyopenms is required. Install it with:  pip install pyopenms")
+  ```
+- `PROTON = 1.007276` constant (proton mass in Da) where mass-to-charge calculations are needed
+- Importable functions as the primary interface (with type hints and numpy-style docstrings)
+- `main()` function with argparse CLI
+- `if __name__ == "__main__": main()` guard
+
+### 7. Write tests
+
+Create `scripts/<domain>/<tool_name>/tests/test_<tool_name>.py`:
+
+- Import `requires_pyopenms` from conftest
+- Decorate test classes with `@requires_pyopenms`
+- Use `from <tool_name> import <function>` inside test methods
+- For file-I/O scripts: generate synthetic data using pyopenms objects in test fixtures, write to `tempfile.TemporaryDirectory()`
+- Cover: basic functionality, edge cases, key parameters
+
+### 8. Write README
+
+Create `scripts/<domain>/<tool_name>/README.md` with a brief description and CLI usage examples.
+
+### 9. Validate
+
+Invoke the `validate-script` skill on the new script directory. Both ruff and pytest must pass.
+
+### 10. Commit
+
+```bash
+git add scripts/<domain>/<tool_name>/
+git commit -m "Add <tool_name>: <brief description>"
+```
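Review note on steps 3 to 5 of `contribute-script`: the naming and scaffolding steps could be sketched in Python roughly as follows. This is an illustration only and is not part of the PR. `scaffold_tool`, `DOMAINS`, `SNAKE_CASE`, and `CONFTEST` are hypothetical names, the snake_case regex is an assumption about what the skill means by "snake_case", and the conftest body is copied from step 5 of the skill.

```python
import re
from pathlib import Path

# Assumption: only the two domains named in the skill are accepted.
DOMAINS = ("proteomics", "metabolomics")

# Assumption: "snake_case" means lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")

# conftest.py body as given in step 5 of the skill.
CONFTEST = '''import sys
import os

import pytest

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

try:
    import pyopenms  # noqa: F401

    HAS_PYOPENMS = True
except ImportError:
    HAS_PYOPENMS = False

requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
'''


def scaffold_tool(repo_root: Path, domain: str, tool_name: str) -> Path:
    """Create scripts/<domain>/<tool_name>/ with requirements.txt and tests/conftest.py."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if not SNAKE_CASE.fullmatch(tool_name):
        raise ValueError(f"tool name is not snake_case: {tool_name!r}")

    tool_dir = repo_root / "scripts" / domain / tool_name
    tests_dir = tool_dir / "tests"
    tests_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`

    (tool_dir / "requirements.txt").write_text("pyopenms\n")  # no version pin
    (tests_dir / "conftest.py").write_text(CONFTEST)
    return tool_dir
```

Calling `scaffold_tool(Path("."), "proteomics", "peptide_mass_calculator")` from the repo root would create the directory layout from step 5. The script, test file, and README from steps 6 to 8 still have to be written by hand.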
diff --git a/.claude/skills/validate-script.md b/.claude/skills/validate-script.md
@@ -0,0 +1,34 @@
+---
+name: validate-script
+description: Validate a pyopenms script in an isolated venv — runs ruff lint and pytest
+---
+
+# Validate Script
+
+Validate any script in the agentomics repo by running ruff and pytest in a fresh isolated venv.
+
+## Steps (follow exactly — rigid skill)
+
+1. **Identify the script directory.** If the user provided a path, use it. Otherwise, ask which script to validate. The path should be `scripts/<domain>/<tool_name>/`.
+
+2. **Verify the directory structure.** Confirm it contains:
+   - `<tool_name>.py`
+   - `requirements.txt`
+   - `tests/` directory with at least one `test_*.py` file
+
+3. **Create a temporary venv and run validation.** Execute these commands:
+
+   ```bash
+   SCRIPT_DIR=<path-to-script-directory>
+   VENV_DIR=$(mktemp -d)
+   python -m venv "$VENV_DIR"
+   "$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
+   "$VENV_DIR/bin/python" -m pip install pytest ruff
+   "$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
+   PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
+   rm -rf "$VENV_DIR"
+   ```
+
+4. **Report results.** Summarize pass/fail for both ruff and pytest. If either fails, show the relevant error output so the user can fix it.
+
+5. **Clean up.** Ensure the temporary venv is removed even if validation fails.
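Review note on step 2 of `validate-script`: the directory-structure check could be expressed as a small Python helper. This is a sketch under stated assumptions, not something the PR ships. `check_script_dir` is a hypothetical name, and it assumes the tool name equals the directory name, as the `scripts/<domain>/<tool_name>/` layout implies.

```python
from pathlib import Path


def check_script_dir(script_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means the checks passed."""
    problems = []
    tool_name = script_dir.name  # assumption: directory name equals the tool name

    if not (script_dir / f"{tool_name}.py").is_file():
        problems.append(f"missing {tool_name}.py")
    if not (script_dir / "requirements.txt").is_file():
        problems.append("missing requirements.txt")

    tests_dir = script_dir / "tests"
    if not tests_dir.is_dir():
        problems.append("missing tests/ directory")
    elif not any(tests_dir.glob("test_*.py")):
        problems.append("tests/ has no test_*.py file")

    return problems
```

Returning a list of problems instead of raising on the first one lets the agent report every missing piece in a single pass, which matches the "Report results" step.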
diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml
@@ -0,0 +1,65 @@
+name: Validate Scripts
+
+on:
+  pull_request:
+    paths:
+      - 'scripts/**'
+
+jobs:
+  detect-changes:
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.detect.outputs.matrix }}
+      has_changes: ${{ steps.detect.outputs.has_changes }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - id: detect
+        name: Detect changed script directories
+        run: |
+          # Note: github.base_ref is only available on pull_request events
+          # Find all script directories that changed in this PR
+          CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
+            | grep -oP 'scripts/[^/]+/[^/]+/' \
+            | sort -u \
+            | jq -R -s -c 'split("\n") | map(select(length > 0))')
+
+          if [ "$CHANGED" = "[]" ] || [ -z "$CHANGED" ]; then
+            echo "has_changes=false" >> "$GITHUB_OUTPUT"
+            echo "matrix=[]" >> "$GITHUB_OUTPUT"
+          else
+            echo "has_changes=true" >> "$GITHUB_OUTPUT"
+            echo "matrix=$CHANGED" >> "$GITHUB_OUTPUT"
+          fi
+
+  validate:
+    needs: detect-changes
+    if: needs.detect-changes.outputs.has_changes == 'true'
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        script_dir: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
+    name: Validate ${{ matrix.script_dir }}
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Create venv and install dependencies
+        run: |
+          python -m venv /tmp/validate_venv
+          /tmp/validate_venv/bin/python -m pip install -r "${{ matrix.script_dir }}requirements.txt"
+          /tmp/validate_venv/bin/python -m pip install pytest ruff
+
+      - name: Lint with ruff
+        run: |
+          /tmp/validate_venv/bin/python -m ruff check "${{ matrix.script_dir }}"
+
+      - name: Run tests
+        run: |
+          PYTHONPATH="${{ matrix.script_dir }}" /tmp/validate_venv/bin/python -m pytest "${{ matrix.script_dir }}tests/" -v
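Review note on the `detect-changes` job: the `git diff | grep | sort -u | jq` pipeline maps changed file paths to a JSON array of unique script directories. The same logic is sketched in Python below to make the behaviour easier to reason about. `changed_script_dirs` and `to_matrix` are hypothetical names. Unlike `grep -o`, the regex is anchored at the start of each path, on the assumption that `git diff --name-only` prints repo-relative paths.

```python
import json
import re

# Same pattern as the grep in the workflow, matched from the start of the path.
SCRIPT_DIR = re.compile(r"scripts/[^/]+/[^/]+/")


def changed_script_dirs(changed_files: list[str]) -> list[str]:
    """Map changed file paths to the unique script directories that contain them."""
    dirs = set()
    for path in changed_files:
        match = SCRIPT_DIR.match(path)
        if match:  # files directly under scripts/ or scripts/<domain>/ are skipped
            dirs.add(match.group(0))
    return sorted(dirs)


def to_matrix(changed_files: list[str]) -> tuple[bool, str]:
    """Return (has_changes, compact JSON array), mirroring the two step outputs."""
    dirs = changed_script_dirs(changed_files)
    return bool(dirs), json.dumps(dirs, separators=(",", ":"))
```

The sketch shows that a PR touching only `scripts/README.md` yields an empty matrix, so the `validate` job is skipped. It also shows that deleted files still contribute their directory, because the diff is not filtered by change type.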
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,106 @@
+# AGENTS.md — AI Contributor Guide
+
+This file instructs AI agents (Claude Code, GitHub Copilot, Cursor, Gemini, etc.) on how to contribute scripts to the agentomics repository.
+
+## Project Purpose
+
+Agentomics is a collection of standalone CLI tools built with [pyopenms](https://pyopenms.readthedocs.io/) for proteomics and metabolomics workflows. These tools fill gaps not yet covered by OpenMS/pyopenms. All code in this repo is written by AI agents.
+
+## Contribution Requirements
+
+Every script must be a **self-contained directory** under `scripts/<domain>/<tool_name>/`:
+
+```
+scripts/<domain>/<tool_name>/
+├── <tool_name>.py        # The tool itself
+├── requirements.txt      # pyopenms + any script-specific deps (no version pins)
+├── README.md             # Brief description + CLI usage examples
+└── tests/
+    ├── conftest.py       # Shared test config (see below)
+    └── test_<tool_name>.py
+```
+
+### Rules
+
+- `<domain>` is `proteomics` or `metabolomics`
+- `requirements.txt` always includes `pyopenms` with no version pin, so each install resolves to the latest release
+- No cross-script imports — each script is fully independent
+- No `__init__.py` files — these are NOT Python packages
+- No scripts that duplicate functionality already in OpenMS/pyopenms
+
+## Code Patterns
+
+### Script structure
+
+Every script must have:
+
+1. **Module docstring** with description, features, and usage examples
+2. **pyopenms import guard:**
+   ```python
+   import sys
+   try:
+       import pyopenms as oms
+   except ImportError:
+       sys.exit("pyopenms is required. Install it with:  pip install pyopenms")
+   ```
+3. **Importable functions** as the primary interface (with type hints and numpy-style docstrings)
+4. **`main()` function** with argparse CLI
+5. **`if __name__ == "__main__": main()`** guard
+6. **`PROTON = 1.007276`** constant (proton mass in Da) where mass-to-charge calculations are needed
+
+### Test structure
+
+Every `tests/conftest.py` must contain:
+
+```python
+import sys
+import os
+
+import pytest
+
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
+
+try:
+    import pyopenms  # noqa: F401
+    HAS_PYOPENMS = True
+except ImportError:
+    HAS_PYOPENMS = False
+
+requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
+```
+
+Test files:
+- Decorate test classes with `@requires_pyopenms` from conftest
+- Import script functions inside test methods: `from <tool_name> import <function>`
+- For file-I/O scripts: generate synthetic data using pyopenms objects, write to `tempfile.TemporaryDirectory()`
+
+## Validation
+
+Every script must pass validation in an **isolated venv** before it can be merged. Run these commands from the repo root:
+
+```bash
+SCRIPT_DIR=scripts/<domain>/<tool_name>
+VENV_DIR=$(mktemp -d)
+python -m venv "$VENV_DIR"
+"$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
+"$VENV_DIR/bin/python" -m pip install pytest ruff
+"$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
+PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
+rm -rf "$VENV_DIR"
+```
+
+Both ruff and pytest must pass with zero errors.
+
+## Linting
+
+Ruff is configured in `ruff.toml` at the repo root:
+- Line length: 120
+- Rules: E (pycodestyle errors), F (pyflakes), W (pycodestyle warnings), I (isort)
+
+## What NOT to Do
+
+- Do not add cross-script imports
+- Do not add dependencies to a shared/root requirements file
+- Do not create scripts that duplicate existing pyopenms CLI tools or OpenMS TOPP tools
+- Do not pin pyopenms to a specific version
+- Do not add `__init__.py` files
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,63 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Purpose
+
+Agentomics is a collection of standalone CLI tools built with [pyopenms](https://pyopenms.readthedocs.io/) for proteomics and metabolomics workflows. These tools fill gaps not yet covered by OpenMS/pyopenms. Development in this repo is agentic-only: all code is written entirely by AI agents.
+
+## Commands
+
+```bash
+# Install dependencies for a specific script
+pip install -r scripts/proteomics/peptide_mass_calculator/requirements.txt
+
+# Lint a specific script
+ruff check scripts/proteomics/peptide_mass_calculator/
+
+# Run tests for a specific script
+PYTHONPATH=scripts/proteomics/peptide_mass_calculator python -m pytest scripts/proteomics/peptide_mass_calculator/tests/ -v
+
+# Lint all scripts
+ruff check scripts/
+
+# Run all tests across all scripts
+for d in scripts/*/*/; do PYTHONPATH="$d" python -m pytest "${d}tests/" -v; done
+
+# Run a script directly
+python scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py --sequence PEPTIDEK --charge 2
+python scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py --formula C6H12O6
+```
+
+## Architecture
+
+### Per-Script Directory Structure
+
+Each script is a self-contained directory under `scripts/<domain>/<tool_name>/`:
+
+```
+scripts/<domain>/<tool_name>/
+├── <tool_name>.py        # The tool (importable functions + argparse CLI)
+├── requirements.txt      # pyopenms + script-specific deps
+├── README.md             # Usage examples
+└── tests/
+    ├── conftest.py       # requires_pyopenms marker + sys.path setup
+    └── test_<tool_name>.py
+```
+
+Domains: `proteomics/`, `metabolomics/`
+
+### Key Patterns
+
+- The pyopenms import is wrapped in try/except with a user-friendly error message
+- Mass-to-charge: `(mass + charge * PROTON) / charge` with `PROTON = 1.007276` (proton mass in Da)
+- Every script has a dual interface: importable functions plus an argparse CLI behind a `__main__` guard
+- Tests use `@requires_pyopenms` skip marker from conftest.py
+- File-I/O scripts use synthetic test data generated with pyopenms objects
+
+## Contributing
+
+See `AGENTS.md` for the full AI contributor guide. Two Claude Code skills are available:
+
+- **`contribute-script`** — guided workflow for adding a new script
+- **`validate-script`** — validate any script in an isolated venv (ruff + pytest)
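Review note on the "Key Patterns" section of `CLAUDE.md`: the mass-to-charge formula and the dual-interface pattern can be illustrated with a short sketch. It is not taken from any script in the PR. `mass_to_mz` and the `--mass`/`--charge` flags are hypothetical, `mass` is assumed to be the neutral mass in Da, and the pyopenms import guard is left out so the snippet runs without pyopenms installed.

```python
import argparse
from typing import Optional, Sequence

PROTON = 1.007276  # proton mass in Da, as given in the key patterns


def mass_to_mz(mass: float, charge: int) -> float:
    """Convert a neutral mass to m/z for a positive charge state.

    Parameters
    ----------
    mass : float
        Neutral mass in Da (assumption: not already protonated).
    charge : int
        Positive charge state.

    Returns
    -------
    float
        (mass + charge * PROTON) / charge
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON) / charge


def main(argv: Optional[Sequence[str]] = None) -> None:
    """Argparse CLI wrapper around mass_to_mz (flag names are illustrative)."""
    parser = argparse.ArgumentParser(description="Convert a neutral mass to m/z.")
    parser.add_argument("--mass", type=float, required=True, help="neutral mass in Da")
    parser.add_argument("--charge", type=int, default=1, help="positive charge state")
    args = parser.parse_args(argv)
    print(f"{mass_to_mz(args.mass, args.charge):.6f}")


# A real script would end with the `__main__` guard listed in the key patterns.
```

For a neutral mass of 1000.0 Da at charge 2, the formula gives (1000.0 + 2 × 1.007276) / 2 ≈ 501.007276. Letting `main` accept an optional `argv` keeps the CLI testable without spawning a subprocess.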