Merged
33 changes: 19 additions & 14 deletions .claude/skills/contribute-script.md
Original file line number Diff line number Diff line change
@@ -23,27 +23,32 @@ Ask the user:

Ask: Is this a **proteomics** or **metabolomics** tool? If neither fits, discuss whether a new domain directory is needed.

### 3. Pick a name
### 3. Pick a topic

Choose the topic directory under the selected domain from the options documented in `AGENTS.md`. Confirm the topic with the user before scaffolding files.

### 4. Pick a name

Choose a descriptive snake_case name for the tool (e.g. `peptide_mass_calculator`, `isotope_pattern_matcher`). Confirm with the user.

### 4. Create a feature branch
### 5. Create a feature branch

```bash
git checkout -b add/<tool_name>
```

### 5. Scaffold the directory
### 6. Scaffold the directory

```bash
mkdir -p scripts/<domain>/<tool_name>/tests
mkdir -p tools/<domain>/<topic>/<tool_name>/tests
```

Create these files:

**`requirements.txt`:**
```
pyopenms
click
```
Add any additional dependencies the script needs (one per line, no version pins).

@@ -66,9 +71,9 @@ except ImportError:
requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not installed")
```

### 6. Write the script
### 7. Write the script

Create `scripts/<domain>/<tool_name>/<tool_name>.py` following these patterns:
Create `tools/<domain>/<topic>/<tool_name>/<tool_name>.py` following these patterns:

- Module-level docstring with description, supported features, and CLI usage examples
- pyopenms import guard:
@@ -80,30 +85,30 @@ Create `scripts/<domain>/<tool_name>/<tool_name>.py` following these patterns:
```
- `PROTON = 1.007276` constant where mass-to-charge calculations are needed
- Importable functions as the primary interface (with type hints and numpy-style docstrings)
- `main()` function with argparse CLI
- `main()` function with click CLI
- `if __name__ == "__main__": main()` guard
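
As a rough illustration of the "importable functions" item in the list above, here is a minimal sketch of a mass-to-charge helper. It uses the `PROTON` constant from the list and the formula `(mass + charge * PROTON) / charge` given under Key Patterns in `CLAUDE.md`. The function name `mass_to_mz` and the charge check are assumptions made for this example, and the pyopenms import guard and click CLI are left out so the snippet runs on the standard library alone.

```python
PROTON = 1.007276  # proton mass in Da, as given in the pattern list


def mass_to_mz(mass: float, charge: int) -> float:
    """Convert a neutral mass to a mass-to-charge ratio.

    Parameters
    ----------
    mass : float
        Neutral mass in Da.
    charge : int
        Positive charge state.

    Returns
    -------
    float
        Mass-to-charge ratio of the protonated ion.
    """
    if charge < 1:
        # Assumption for this sketch: only positive charge states are accepted.
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical stand-in for main(); a real tool would wire this up with click.
    print(mass_to_mz(1000.0, 2))
```

Keeping the calculation in a plain function like this is what lets the tests import it directly with `from <tool_name> import <function>`.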

### 7. Write tests
### 8. Write tests

Create `scripts/<domain>/<tool_name>/tests/test_<tool_name>.py`:
Create `tools/<domain>/<topic>/<tool_name>/tests/test_<tool_name>.py`:

- Import `requires_pyopenms` from conftest
- Decorate test classes with `@requires_pyopenms`
- Use `from <tool_name> import <function>` inside test methods
- For file-I/O scripts: generate synthetic data using pyopenms objects in test fixtures, write to `tempfile.TemporaryDirectory()`
- Cover: basic functionality, edge cases, key parameters

### 8. Write README
### 9. Write README

Create `scripts/<domain>/<tool_name>/README.md` with a brief description and CLI usage examples.
Create `tools/<domain>/<topic>/<tool_name>/README.md` with a brief description and CLI usage examples.

### 9. Validate
### 10. Validate

Invoke the `validate-script` skill on the new script directory. Both ruff and pytest must pass.

### 10. Commit
### 11. Commit

```bash
git add scripts/<domain>/<tool_name>/
git add tools/<domain>/<topic>/<tool_name>/
git commit -m "Add <tool_name>: <brief description>"
```
2 changes: 1 addition & 1 deletion .claude/skills/validate-script.md
@@ -9,7 +9,7 @@ Validate any script in the agentomics repo by running ruff and pytest in a fresh

## Steps (follow exactly — rigid skill)

1. **Identify the script directory.** If the user provided a path, use it. Otherwise, ask which script to validate. The path should be `scripts/<domain>/<tool_name>/`.
1. **Identify the script directory.** If the user provided a path, use it. Otherwise, ask which script to validate. The path should be `tools/<domain>/<topic>/<tool_name>/`.

2. **Verify the directory structure.** Confirm it contains:
- `<tool_name>.py`
48 changes: 30 additions & 18 deletions .github/workflows/validate.yml
@@ -1,9 +1,9 @@
name: Validate Scripts
name: Validate Tools

on:
pull_request:
paths:
- 'scripts/**'
- 'tools/**'

jobs:
detect-changes:
@@ -17,12 +17,12 @@ jobs:
fetch-depth: 0

- id: detect
name: Detect changed script directories
name: Detect changed tool directories
run: |
# Note: github.base_ref is only available on pull_request events
# Find all script directories that changed in this PR
CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'scripts/' \
| grep -oP 'scripts/[^/]+/[^/]+/' \
# Find all tool directories that changed in this PR
CHANGED=$(git diff --name-only origin/${{ github.base_ref }}...HEAD -- 'tools/' \
| grep -oP 'tools/[^/]+/[^/]+/[^/]+/' \
| sort -u \
| jq -R -s -c 'split("\n") | map(select(length > 0))')

@@ -38,28 +38,40 @@ jobs:
needs: detect-changes
if: needs.detect-changes.outputs.has_changes == 'true'
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
script_dir: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
name: Validate ${{ matrix.script_dir }}
steps:
- uses: actions/checkout@v4

- uses: actions/setup-python@v5
with:
python-version: '3.11'

- name: Create venv and install dependencies
- name: Create shared venv and install dependencies
run: |
python -m venv /tmp/validate_venv
/tmp/validate_venv/bin/python -m pip install -r ${{ matrix.script_dir }}requirements.txt
/tmp/validate_venv/bin/python -m pip install pytest ruff
/tmp/validate_venv/bin/python -m pip install pyopenms numpy scipy click pytest ruff
DIRS='${{ needs.detect-changes.outputs.matrix }}'
echo "$DIRS" | jq -r '.[]' | while read -r dir; do
if [ -f "${dir}requirements.txt" ]; then
/tmp/validate_venv/bin/python -m pip install -r "${dir}requirements.txt"
fi
done

- name: Lint with ruff
- name: Lint changed tools
run: |
/tmp/validate_venv/bin/python -m ruff check ${{ matrix.script_dir }}
DIRS='${{ needs.detect-changes.outputs.matrix }}'
echo "$DIRS" | jq -r '.[]' | while read -r dir; do
echo "::group::ruff $dir"
/tmp/validate_venv/bin/python -m ruff check "$dir"
echo "::endgroup::"
done

- name: Run tests
- name: Test changed tools
run: |
PYTHONPATH=${{ matrix.script_dir }} /tmp/validate_venv/bin/python -m pytest ${{ matrix.script_dir }}tests/ -v
DIRS='${{ needs.detect-changes.outputs.matrix }}'
echo "$DIRS" | jq -r '.[]' | while read -r dir; do
if [ -d "${dir}tests/" ]; then
echo "::group::pytest $dir"
PYTHONPATH="$dir" /tmp/validate_venv/bin/python -m pytest "${dir}tests/" -v
echo "::endgroup::"
fi
done
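
To make the change-detection step in this workflow easier to follow, the sketch below reproduces in Python what the `grep -oP 'tools/[^/]+/[^/]+/[^/]+/' | sort -u | jq ...` pipeline appears to compute: the unique `tools/<domain>/<topic>/<tool_name>/` prefixes of the changed files, serialized as a compact JSON array. The function name `changed_tool_dirs` is hypothetical, and the input list stands in for the output of `git diff --name-only`. Only the first match per path is kept, and Python's default string ordering may differ from the locale-dependent ordering of `sort -u`.

```python
import json
import re

# Same pattern the workflow passes to grep: tools/<domain>/<topic>/<tool_name>/
TOOL_DIR = re.compile(r"tools/[^/]+/[^/]+/[^/]+/")


def changed_tool_dirs(changed_files: list[str]) -> str:
    """Return a compact JSON array of the unique tool directories touched."""
    dirs = set()
    for path in changed_files:
        match = TOOL_DIR.search(path)
        if match:
            # Paths that are not three levels deep under tools/ yield no match.
            dirs.add(match.group(0))
    return json.dumps(sorted(dirs), separators=(",", ":"))


if __name__ == "__main__":
    print(changed_tool_dirs([
        "tools/proteomics/peptide_analysis/peptide_mass_calculator/README.md",
        "tools/proteomics/peptide_analysis/peptide_mass_calculator/tests/conftest.py",
    ]))
```

The trailing slash in each entry matters: the later steps build paths such as `"${dir}requirements.txt"` and `"${dir}tests/"` by plain concatenation.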
4 changes: 4 additions & 0 deletions .gitignore
@@ -205,3 +205,7 @@ cython_debug/
marimo/_static/
marimo/_lsp/
__marimo__/


#Ignore vscode AI rules
.github/instructions/codacy.instructions.md
44 changes: 26 additions & 18 deletions AGENTS.md
@@ -1,38 +1,45 @@
# AGENTS.md — AI Contributor Guide

This file instructs AI agents (Claude Code, GitHub Copilot, Cursor, Gemini, etc.) how to contribute scripts to the agentomics repository.
This file instructs AI agents (Claude Code, GitHub Copilot, Cursor, Gemini, etc.) how to contribute tools to the agentomics repository.

## Project Purpose

Agentomics is a collection of standalone CLI tools built with [pyopenms](https://pyopenms.readthedocs.io/) for proteomics and metabolomics workflows. These tools fill gaps not yet covered by OpenMS/pyopenms. All code in this repo is written by AI agents.

## Contribution Requirements

Every script must be a **self-contained directory** under `scripts/<domain>/<tool_name>/`:
Every tool must be a **self-contained directory** under `tools/<domain>/<topic>/<tool_name>/`:

```
scripts/<domain>/<tool_name>/
tools/<domain>/<topic>/<tool_name>/
├── <tool_name>.py # The tool itself
├── requirements.txt # pyopenms + any script-specific deps (no version pins)
├── requirements.txt # pyopenms + any tool-specific deps (no version pins)
├── README.md # Brief description + CLI usage examples
└── tests/
├── conftest.py # Shared test config (see below)
└── test_<tool_name>.py
```
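
The tree above can also be checked mechanically. The following is a minimal sketch, not part of the repository, that reports which of the required files are missing from a tool directory. The function name `missing_layout_entries` is an assumption; the file list is taken directly from the tree.

```python
from pathlib import Path


def missing_layout_entries(tool_dir: Path) -> list[str]:
    """List the required files that are absent from a tool directory."""
    name = tool_dir.name  # the directory name doubles as <tool_name>
    required = [
        f"{name}.py",
        "requirements.txt",
        "README.md",
        "tests/conftest.py",
        f"tests/test_{name}.py",
    ]
    return [rel for rel in required if not (tool_dir / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical path; substitute a real tool directory from the repo.
    example = Path("tools/proteomics/peptide_analysis/peptide_mass_calculator")
    print(missing_layout_entries(example))
```

An empty list means the directory has every file the layout requires; it says nothing about the contents of those files.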

### Topics

**Proteomics topics:** `spectrum_analysis/`, `peptide_analysis/`, `protein_analysis/`, `fasta_utils/`, `file_conversion/`, `quality_control/`, `targeted_proteomics/`, `identification/`, `ptm_analysis/`, `structural_proteomics/`, `specialized/`, `rna/`

**Metabolomics topics:** `formula_tools/`, `feature_processing/`, `spectral_analysis/`, `compound_annotation/`, `drug_metabolism/`, `isotope_labeling/`, `lipidomics/`, `export/`

### Rules

- `<domain>` is `proteomics` or `metabolomics`
- `<topic>` is one of the topic directories listed above
- `requirements.txt` always includes `pyopenms` with no version pin — builds against latest
- No cross-script imports — each script is fully independent
- No cross-tool imports — each tool is fully independent
- No `__init__.py` files — these are NOT Python packages
- No scripts that duplicate functionality already in OpenMS/pyopenms
- No tools that duplicate functionality already in OpenMS/pyopenms TOPP tools
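
The domain and topic rules can be sketched as a small path builder. This is an illustration under stated assumptions rather than code from the repository: the topic sets are copied from the Topics section, while the names `TOPICS`, `SNAKE_CASE`, and `tool_path`, as well as the exact snake_case pattern, are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Topic directories per domain, copied from the Topics section.
TOPICS = {
    "proteomics": {
        "spectrum_analysis", "peptide_analysis", "protein_analysis",
        "fasta_utils", "file_conversion", "quality_control",
        "targeted_proteomics", "identification", "ptm_analysis",
        "structural_proteomics", "specialized", "rna",
    },
    "metabolomics": {
        "formula_tools", "feature_processing", "spectral_analysis",
        "compound_annotation", "drug_metabolism", "isotope_labeling",
        "lipidomics", "export",
    },
}

# Assumed definition of snake_case: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def tool_path(domain: str, topic: str, tool_name: str) -> PurePosixPath:
    """Build tools/<domain>/<topic>/<tool_name>, rejecting invalid parts."""
    if domain not in TOPICS:
        raise ValueError(f"unknown domain: {domain}")
    if topic not in TOPICS[domain]:
        raise ValueError(f"unknown topic for {domain}: {topic}")
    if not SNAKE_CASE.fullmatch(tool_name):
        raise ValueError(f"tool name is not snake_case: {tool_name}")
    return PurePosixPath("tools", domain, topic, tool_name)


if __name__ == "__main__":
    print(tool_path("proteomics", "peptide_analysis", "peptide_mass_calculator"))
```

Note that a topic is only valid within its own domain: `lipidomics` is accepted under `metabolomics` but rejected under `proteomics`.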

## Code Patterns

### Script structure
### Tool structure

Every script must have:
Every tool must have:

1. **Module docstring** with description, features, and usage examples
2. **pyopenms import guard:**
@@ -44,7 +51,7 @@ Every script must have:
sys.exit("pyopenms is required. Install it with: pip install pyopenms")
```
3. **Importable functions** as the primary interface (with type hints and numpy-style docstrings)
4. **`main()` function** with argparse CLI
4. **`main()` function** with click CLI
5. **`if __name__ == "__main__": main()`** guard
6. **`PROTON = 1.007276`** constant where mass-to-charge calculations are needed

@@ -71,21 +78,21 @@ requires_pyopenms = pytest.mark.skipif(not HAS_PYOPENMS, reason="pyopenms not in

Test files:
- Decorate test classes with `@requires_pyopenms` from conftest
- Import script functions inside test methods: `from <tool_name> import <function>`
- For file-I/O scripts: generate synthetic data using pyopenms objects, write to `tempfile.TemporaryDirectory()`
- Import tool functions inside test methods: `from <tool_name> import <function>`
- For file-I/O tools: generate synthetic data using pyopenms objects, write to `tempfile.TemporaryDirectory()`

## Validation

Every script must pass validation in an **isolated venv** before it can be merged. Run these commands from the repo root:
Every tool must pass validation in an **isolated venv** before it can be merged. Run these commands from the repo root:

```bash
SCRIPT_DIR=scripts/<domain>/<tool_name>
TOOL_DIR=tools/<domain>/<topic>/<tool_name>
VENV_DIR=$(mktemp -d)
python -m venv "$VENV_DIR"
"$VENV_DIR/bin/python" -m pip install -r "$SCRIPT_DIR/requirements.txt"
"$VENV_DIR/bin/python" -m pip install -r "$TOOL_DIR/requirements.txt"
"$VENV_DIR/bin/python" -m pip install pytest ruff
"$VENV_DIR/bin/python" -m ruff check "$SCRIPT_DIR/"
PYTHONPATH="$SCRIPT_DIR" "$VENV_DIR/bin/python" -m pytest "$SCRIPT_DIR/tests/" -v
"$VENV_DIR/bin/python" -m ruff check "$TOOL_DIR/"
PYTHONPATH="$TOOL_DIR" "$VENV_DIR/bin/python" -m pytest "$TOOL_DIR/tests/" -v
rm -rf "$VENV_DIR"
```

@@ -99,8 +106,9 @@ Ruff is configured in `ruff.toml` at the repo root:

## What NOT to Do

- Do not add cross-script imports
- Do not add cross-tool imports
- Do not add dependencies to a shared/root requirements file
- Do not create scripts that duplicate existing pyopenms CLI tools or OpenMS TOPP tools
- Do not create tools that duplicate existing pyopenms CLI tools or OpenMS TOPP tools
- Do not pin pyopenms to a specific version
- Do not add `__init__.py` files
- Do not add tools that don't actually use pyopenms (pure stats/math tools belong elsewhere)
38 changes: 21 additions & 17 deletions CLAUDE.md
@@ -9,35 +9,35 @@ Agentomics is a collection of standalone CLI tools built with [pyopenms](https:/
## Commands

```bash
# Install dependencies for a specific script
pip install -r scripts/proteomics/peptide_mass_calculator/requirements.txt
# Install dependencies for a specific tool
pip install -r tools/proteomics/peptide_analysis/peptide_mass_calculator/requirements.txt

# Lint a specific script
ruff check scripts/proteomics/peptide_mass_calculator/
# Lint a specific tool
ruff check tools/proteomics/peptide_analysis/peptide_mass_calculator/

# Run tests for a specific script
PYTHONPATH=scripts/proteomics/peptide_mass_calculator python -m pytest scripts/proteomics/peptide_mass_calculator/tests/ -v
# Run tests for a specific tool
PYTHONPATH=tools/proteomics/peptide_analysis/peptide_mass_calculator python -m pytest tools/proteomics/peptide_analysis/peptide_mass_calculator/tests/ -v

# Lint all scripts
ruff check scripts/
# Lint all tools
ruff check tools/

# Run all tests across all scripts
for d in scripts/*/*/; do PYTHONPATH="$d" python -m pytest "$d/tests/" -v; done
# Run all tests across all tools
for d in tools/*/*/*/; do PYTHONPATH="$d" python -m pytest "$d/tests/" -v; done

# Run a script directly
python scripts/proteomics/peptide_mass_calculator/peptide_mass_calculator.py --sequence PEPTIDEK --charge 2
python scripts/metabolomics/isotope_pattern_matcher/isotope_pattern_matcher.py --formula C6H12O6
python tools/proteomics/peptide_analysis/peptide_mass_calculator/peptide_mass_calculator.py --sequence PEPTIDEK --charge 2
python tools/metabolomics/spectral_analysis/isotope_pattern_matcher/isotope_pattern_matcher.py --formula C6H12O6
```

## Architecture

### Per-Script Directory Structure
### Per-Tool Directory Structure

Each script is a self-contained directory under `scripts/<domain>/<tool_name>/`:
Each tool is a self-contained directory under `tools/<domain>/<topic>/<tool_name>/`:

```
scripts/<domain>/<tool_name>/
├── <tool_name>.py # The tool (importable functions + argparse CLI)
tools/<domain>/<topic>/<tool_name>/
├── <tool_name>.py # The tool (importable functions + click CLI)
├── requirements.txt # pyopenms + script-specific deps
├── README.md # Usage examples
└── tests/
@@ -47,11 +47,15 @@ scripts/<domain>/<tool_name>/

Domains: `proteomics/`, `metabolomics/`

Proteomics topics: `spectrum_analysis/`, `peptide_analysis/`, `protein_analysis/`, `fasta_utils/`, `file_conversion/`, `quality_control/`, `targeted_proteomics/`, `identification/`, `ptm_analysis/`, `structural_proteomics/`, `specialized/`, `rna/`

Metabolomics topics: `formula_tools/`, `feature_processing/`, `spectral_analysis/`, `compound_annotation/`, `drug_metabolism/`, `isotope_labeling/`, `lipidomics/`, `export/`

### Key Patterns

- pyopenms import wrapped in try/except with user-friendly error message
- Mass-to-charge: `(mass + charge * PROTON) / charge` with `PROTON = 1.007276`
- Every script has dual interface: importable functions + argparse CLI + `__main__` guard
- Every script has dual interface: importable functions + click CLI + `__main__` guard
- Tests use `@requires_pyopenms` skip marker from conftest.py
- File-I/O scripts use synthetic test data generated with pyopenms objects
