Add module spec command #6859

Merged: bentsherman merged 1 commit into master from 260220-generate-meta-yml on Apr 6, 2026

jorgee (Contributor) commented Feb 24, 2026

Summary

  • Adds nextflow module generate-meta CLI subcommand that parses a module's main.nf and generates a meta.yml template pre-populated with all inferable values
  • Supports both V1 (classic qualifier syntax: val, path, tuple) and V2 (static type syntax: reads: Path, (meta, reads): Tuple<String, Path>)
  • Implements two-pass static analysis for V2 output type inference: builds a symbol map from input parameters and exec block assignments, then resolves output types via declared type → symbol map lookup → RHS expression inference chain
  • Adds -force flag to overwrite an existing meta.yml and -dry-run flag to print to stdout without writing
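
The resolution chain in the third bullet can be pictured with a small Java sketch. This is not the PR's Groovy implementation: the class name, method names, and the use of plain strings for types are all assumptions made for illustration, and the real code works on AST nodes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the two-pass idea; names are not from the PR.
final class OutputTypeSketch {

    /** Pass 1: collect known types from input parameters, then exec-block assignments. */
    static Map<String, String> buildSymbolMap(Map<String, String> inputs,
                                              Map<String, String> execAssignments) {
        Map<String, String> symbols = new LinkedHashMap<>(inputs);
        symbols.putAll(execAssignments);
        return symbols;
    }

    /** Pass 2: declared type first, then symbol map lookup, then RHS inference. */
    static String resolve(String declaredType, String name,
                          Map<String, String> symbols, String rhsInferredType) {
        if (declaredType != null)
            return declaredType;
        String known = symbols.get(name);
        if (known != null)
            return known;
        // May still be null when nothing can be inferred
        return rhsInferredType;
    }
}
```

Each step is only consulted when the previous one yields nothing, so an explicit declaration always wins over an inferred type.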

Changes

  • CmdModule.groovy — registers the new generate-meta subcommand
  • ModuleGenerateMeta.groovy — CLI entry point with path resolution, overwrite guard, dry-run routing
  • MetaYmlGenerator.groovy — AST extraction (V1 + V2) and YAML rendering via SnakeYAML
  • MetaYmlGeneratorTest.groovy — 14 Spock unit tests covering V1 qualifiers, V2 typed inputs, symbol map inference, tuple sub-entries, YAML rendering
  • ModuleGenerateMetaTest.groovy — 10 Spock CLI tests covering generation, overwrite guard, dry-run, error cases
  • VariableScopeVisitor.java — minor related fix

Test plan

  • Run ./gradlew :nextflow:test --tests "nextflow.module.MetaYmlGeneratorTest" — all 14 tests pass
  • Run ./gradlew :nextflow:test --tests "nextflow.cli.module.ModuleGenerateMetaTest" — CLI tests pass
  • Manual: nextflow module generate-meta <module-dir> generates a valid meta.yml
  • Manual: re-running without -force errors; with -force regenerates; -dry-run prints to stdout
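
The overwrite guard and dry-run routing exercised by the manual checks could look roughly like the Java sketch below. The class and method names are hypothetical, and the choice to evaluate the dry-run flag before the overwrite check is an assumption; the PR description does not say which takes priority.

```java
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not the PR's ModuleGenerateMeta code.
final class WriteGuardSketch {

    static void emit(Path target, String yaml, boolean force, boolean dryRun,
                     PrintStream out) throws IOException {
        if (dryRun) {
            // Dry run: print the rendered YAML and never touch the file system
            out.print(yaml);
            return;
        }
        if (Files.exists(target) && !force)
            throw new FileAlreadyExistsException(target.toString());
        // Creates the file, or truncates and rewrites it when force is set
        Files.writeString(target, yaml);
    }
}
```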

🤖 Generated with Claude Code

jorgee requested a review from a team as a code owner February 24, 2026 19:26
jorgee marked this pull request as draft February 24, 2026 19:27

jorgee (Contributor, Author) commented Mar 2, 2026

Changes since initial PR

MetaYmlGenerator.groovy

path() type resolution (extractPathMethodType)

  • Wildcard patterns (*, ?, {}, []) → list
  • Literal filenames → file
  • arity: '1' → file; any other arity (e.g. '1..*') → list
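
As a rough Java sketch of these three rules (the class and method names are made up, and letting a declared arity take precedence over the pattern check is an assumption, since the comment does not state the order):

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the path() type rules; not extractPathMethodType itself.
final class PathTypeSketch {

    // Glob metacharacters listed in the comment: *, ?, {}, []
    private static final Pattern GLOB = Pattern.compile("[*?{}\\[\\]]");

    /**
     * @param pattern the path() argument, e.g. "*.bam"
     * @param arity   the arity option, or null when none is declared
     */
    static String resolve(String pattern, String arity) {
        if (arity != null)
            return "1".equals(arity) ? "file" : "list";
        return GLOB.matcher(pattern).find() ? "list" : "file";
    }
}
```

Under these assumptions, `resolve("*.bam", null)` gives `list`, `resolve("out.txt", null)` gives `file`, and `resolve("*.bam", "1")` gives `file`.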

Tuple rendering (buildChannelList)

  • Tuples are now rendered as nested YAML lists instead of being flattened into the parent list
  • Named tuples (declared with emit:) are wrapped under their emit identifier key: {emit_name: [sub_entries...]}
  • Anonymous tuples (no emit name) render as a plain nested list: [[sub_entries...]]
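
The two tuple shapes can be sketched in Java with ordinary collections, which is also the kind of structure SnakeYAML would serialize. The class and method names are hypothetical; only the resulting shapes come from the comment above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the nested tuple shapes; not buildChannelList itself.
final class TupleRenderSketch {

    /** Builds the node that one tuple contributes to its parent channel list. */
    static Object tupleNode(String emitName, List<Object> subEntries) {
        List<Object> nested = new ArrayList<>(subEntries);
        if (emitName == null)
            return nested;                 // anonymous: plain nested list
        Map<String, Object> named = new LinkedHashMap<>();
        named.put(emitName, nested);       // named: {emit_name: [sub_entries...]}
        return named;
    }
}
```

Adding the returned node to the parent list, instead of calling `addAll` with the sub-entries, is what keeps the tuple nested rather than flattened.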

V1 emit: support (extractV1Tuple)

  • Reads the emit: named argument from V1 tuple outputs and uses it as the tuple's channel identifier

NamedArgumentListExpression handling

  • extractIdentifierFromArgs now skips a leading named-argument map so qualifiers with options (e.g. path "*.bam", arity: '1') correctly extract the pattern as the identifier
  • Same fix applied in extractTupleSubEntries
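
A minimal Java sketch of the "skip a leading named-argument map" idea follows. It models the argument list as plain objects with an optional leading `Map`, which is a simplification of the real AST nodes, and the names are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the PR operates on Groovy AST expressions instead.
final class NamedArgSketch {

    /** Returns the first positional argument, skipping a leading named-argument map. */
    static Object identifierArg(List<Object> args) {
        int start = (!args.isEmpty() && args.get(0) instanceof Map) ? 1 : 0;
        return start < args.size() ? args.get(start) : null;
    }
}
```

With this, an argument list modelled as `[{arity: '1'}, "*.bam"]` yields `"*.bam"` rather than the options map.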

Tools placeholder removed

  • The tools: section is no longer generated; it must be filled in manually

New RenderOptions class

  • Holds the five required manifest fields: name, version, description, license, authors
  • Any null field falls back to a TODO: placeholder; provided values are written as-is
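
The fallback behaviour might be sketched in Java as below. The five field names come from the comment; the class name, the `toManifest` method, and the exact placeholder wording after `TODO:` are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the null-to-placeholder fallback; not the PR's RenderOptions.
final class RenderOptionsSketch {
    String name;
    String version;
    String description;
    String license;
    List<String> authors;

    // The placeholder text is made up; the PR only says it is a "TODO:" placeholder
    private static Object orTodo(Object value, String field) {
        return value != null ? value : "TODO: " + field;
    }

    Map<String, Object> toManifest() {
        Map<String, Object> manifest = new LinkedHashMap<>();
        manifest.put("name", orTodo(name, "name"));
        manifest.put("version", orTodo(version, "version"));
        manifest.put("description", orTodo(description, "description"));
        manifest.put("license", orTodo(license, "license"));
        manifest.put("authors", orTodo(authors, "authors"));
        return manifest;
    }
}
```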

Updated YAML header

  • Now includes guidance on reviewing auto-detected val types and path patterns

ModuleGenerateMeta.groovy

Five new CLI flags for required manifest fields:

| Flag | Description |
| --- | --- |
| -name | Module name in scope/name format |
| -version | Module version |
| -description | Short description |
| -license | SPDX license identifier |
| -author | Author name (repeatable) |

Tests

MetaYmlGeneratorTest.groovy — expanded from 14 to 37 tests:

  • Full coverage of all V1 input/output qualifiers: val, path, file (deprecated), env, stdin, stdout, tuple
  • All glob wildcard variants (*, ?, {}, []) via parameterised test
  • path arity combinations for both V1 and V2
  • V2 path(), env(), and stdout() output expressions
  • Tuple render tests using SnakeYAML round-trip assertion (parse YAML back and verify structure)
  • RenderOptions tests: provided values used verbatim; null values fall back to TODO placeholders

ModuleGenerateMetaTest.groovy — two new tests:

  • Provided CLI flags appear in the output with no TODO placeholders for those fields
  • Omitted CLI flags produce TODO placeholders

Docs

  • docs/cli.md — added ### Generating module metadata narrative section with flag table and type-detection guidance
  • docs/reference/cli.md — added full generate-meta subcommand reference entry (anchor cli-module-generate-meta) with all flags, type hints, and examples

Comment thread docs/reference/cli.md Outdated

jorgee (Contributor, Author) commented Mar 4, 2026

Update:

  • Rename the generate-meta subcommand to spec
  • Allow a module reference as the primary argument, and add a -scope flag for when the full module name is not available
  • Remove TODOs from the module fields. They remain in input/output descriptions and wherever no type could be inferred.
  • Add input/output validation in ModuleSpec to avoid pushing arguments with TODOs
  • Some refactoring: ModuleGenerateMeta renamed to CmdModuleSpec, and MetaYmlGenerator integrated into ModuleSpec

pditommaso (Member) commented

Nit: I'd likely call this create-meta; that's a more canonical term for file creation (I understand that "generate" feels trendy now...)

pditommaso (Member) commented

Code Review Findings

Critical (Must Fix)

1. Bug: authors guarded by license check in asMap()
ModuleSpec.groovy:293-294 — checks this.license instead of this.authors:

// CURRENT (wrong)
if( this.license )
    manifest['authors'] = this.authors

// FIX
if( this.authors )
    manifest['authors'] = this.authors

Authors will be silently dropped when license is null, or emit null when license is set but authors isn't.
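
To make the failure mode concrete, here is a Java sketch of the corrected guards, written as a standalone helper. It is an illustration only: the real `asMap()` is a Groovy instance method, and the explicit null/empty checks below are an approximation of Groovy truthiness.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the fix; not the actual ModuleSpec.asMap().
final class ManifestSketch {

    static Map<String, Object> asMap(String license, List<String> authors) {
        Map<String, Object> manifest = new LinkedHashMap<>();
        if (license != null && !license.isEmpty())
            manifest.put("license", license);
        // Each field is guarded by its own value, not by its neighbour's
        if (authors != null && !authors.isEmpty())
            manifest.put("authors", authors);
        return manifest;
    }
}
```

With independent guards, authors survive when license is null, and no `authors: null` entry is produced when only license is set.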

2. Potential NPE in extractMethodCallType
ModuleSpec.groovy:776, where methodTarget can be null for unresolved method calls:

// CURRENT (missing null-safe on methodTarget)
return resolveTypeFromDeclaredType(simpleTypeName(methodExpr.methodTarget.returnType?.name))

// FIX
return resolveTypeFromDeclaredType(simpleTypeName(methodExpr.methodTarget?.returnType?.name))
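
For readers less familiar with Groovy's `?.` operator, the Java sketch below shows an equivalent null-safe chain using `Optional`. The three stub classes are hypothetical stand-ins for the AST types involved and are not part of the Groovy or Nextflow APIs.

```java
import java.util.Optional;

// Hypothetical illustration of a null-safe property chain.
final class NullSafeSketch {

    static final class ClassNodeStub {
        final String name;
        ClassNodeStub(String name) { this.name = name; }
    }

    static final class MethodNodeStub {
        final ClassNodeStub returnType;
        MethodNodeStub(ClassNodeStub returnType) { this.returnType = returnType; }
    }

    static final class MethodCallStub {
        final MethodNodeStub methodTarget; // null when the call is unresolved
        MethodCallStub(MethodNodeStub methodTarget) { this.methodTarget = methodTarget; }
    }

    /** Mirrors methodExpr.methodTarget?.returnType?.name: null at any step yields null. */
    static String returnTypeName(MethodCallStub call) {
        return Optional.ofNullable(call.methodTarget)
                .map(target -> target.returnType)
                .map(type -> type.name)
                .orElse(null);
    }
}
```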

Important (Should Fix)

| # | Issue | Location |
| --- | --- | --- |
| 3 | YAML header still says generate-meta instead of spec | ModuleSpec.groovy:58 |
| 4 | moduleName field lacks type declaration in @CompileStatic class | CmdModuleSpec.groovy:71 |
| 5 | validate() checks type.equals(DESCRIPTION_TODO); should be description.equals(DESCRIPTION_TODO) | ModuleSpec.groovy:87 |
| 6 | resolveAsPath uses args[0] instead of module param | CmdModuleSpec.groovy:129 |

Suggestions

  • buildParamSpec always emits DESCRIPTION_TODO even if entry.description is already populated (line 323)
  • load() doesn't close the InputStream from Files.newInputStream (line 141)
  • Log messages reference $args (full list) instead of $module (lines 117, 119, 123)
  • No test covering the authors field rendering in asMap(); one would have caught bug #1
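
On the unclosed stream in load(): one common remedy is a try-with-resources block, sketched here in Java. The class name is hypothetical and the body simply reads the file as text, whereas the real load() presumably parses YAML from the stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; the actual load() in ModuleSpec is not shown in this thread.
final class LoadSketch {

    static String load(Path path) throws IOException {
        // The stream is closed when the block exits, even if reading throws
        try (InputStream in = Files.newInputStream(path)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```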

🤖 Generated with Claude Code

jorgee (Contributor, Author) commented Mar 6, 2026

> Nit: I'd likely call this create-meta; that's a more canonical term for file creation (I understand that "generate" feels trendy now...)

@ben suggested calling it 'spec', which is the name in the current implementation. 'create-meta' and 'create-spec' are also good alternatives to me.

I fixed the other issues and comments. Apart from the name, it is ready for review from my side.

jorgee marked this pull request as ready for review March 6, 2026 19:48
jorgee requested a review from a team as a code owner March 6, 2026 19:48
Base automatically changed from 251117-module-system-implementation to 251117-module-system March 10, 2026 13:22
Base automatically changed from 251117-module-system to master March 17, 2026 09:34
netlify bot commented Mar 26, 2026

Deploy Preview for nextflow-docs-staging canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 1b8ffcc |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nextflow-docs-staging/deploys/69d3dd2f02c313000895d667 |

bentsherman changed the title from "Add nextflow module generate-meta command" to "Add module spec command" Mar 27, 2026

pditommaso (Member) commented

What's the state of this?

bentsherman (Member) commented

Jorge has done his part. I'm going to test it, clean it up, and try to merge this week.

bentsherman added this to the 26.04 milestone Apr 2, 2026
bentsherman mentioned this pull request Apr 3, 2026
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
bentsherman force-pushed the 260220-generate-meta-yml branch from bb5ac5e to 1b8ffcc April 6, 2026 16:19
bentsherman merged commit 049e2a4 into master Apr 6, 2026
23 of 24 checks passed
bentsherman deleted the 260220-generate-meta-yml branch April 6, 2026 17:19
pditommaso pushed a commit that referenced this pull request Apr 7, 2026
Co-authored-by: Ben Sherman <bentshermann@gmail.com>
pditommaso added a commit that referenced this pull request Apr 7, 2026
* Add resourceAllocation field to trace record

Expose scheduler-allocated resources (cpuShares, memoryMiB, accelerators,
time) in the trace record. The value is taken from the last TaskAttempt's
resources, falling back to the TaskState's resourceAllocation if no
attempts exist.

Also bump sched-client to 0.46.0-SNAPSHOT which renames
TaskState.resourceRequirement to resourceAllocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Revert "Add accelerator request to trace record (#6703)"

This reverts commit 00f35b3.

The accelerator and accelerator_type fields in the trace record are
superseded by the resourceAllocation field which carries the actual
scheduler-allocated resources including accelerator info.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Bump sched-client to 0.47.0 (#6987) [ci fast]

* Bump sched-client to 0.47.0 and update prediction model support

- Upgrade sched-client from 0.41.0-SNAPSHOT to 0.47.0
- Add qr/v2 prediction model to supported values description
- Remove client-side prediction model validation (moved to backend)
- Fix getResourceRequirement() -> getResourceAllocation() API change

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Fix test to use renamed resourceAllocation API

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

---------

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Bump org.apache.groovy from 4.0.30 to 4.0.31 (#6985) [ci fast]

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Docs: document Wave support for module resources directory (#6984) [ci skip]

* Docs: document Wave support for module resources directory

Update Wave and module docs to explain that the module `resources/`
directory is automatically included in Wave-provisioned containers,
removing the need for ADD/COPY Dockerfile commands.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Update docs/module.md [ci skip]

Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Update docs/wave.md [ci skip]

Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

---------

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>

* Remove stale reference to soon-to-be EOL'd AWS Linux 2 (#6970) [ci skip]

* Use inline metadata from trace create response (#6976)

---------

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>

* Replace Auth0 with Platform OIDC PKCE for auth login (#6953)

* Document workflow output lineage types (#6972)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* docs: Add output labels use cases and details (#6986)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* docs: Improve process directive docs (#6990)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* Typed workflows (#6881)

* Add `module create` subcommand (#6992)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* Add `module validate` subcommand (#6993)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* Add `module spec` command (#6859)

Co-authored-by: Ben Sherman <bentshermann@gmail.com>

* docs: Add migration guide reference to legacy operators page (#7010)

* Use npr-client API instead of custom ModuleRegistryClient (#7012) [ci fast]

* Use npr-client API instead of custom ModuleRegistryClient

Replace the custom ModuleRegistryClient with the npr-client library,
delegating HTTP registry interactions to the shared client. This removes
~500 lines of duplicated HTTP/retry/auth logic.

- Delete ModuleRegistryClient.groovy
- Update call sites to use new npr-client method names
  (getModule, getModuleRelease, searchModules, downloadModuleRelease,
   publishModuleRelease)
- Update all tests to match new API

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Bump npr-api and npr-client to 0.22.0 [ci fast]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

---------

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Restore accelerator and accelerator_type fields in TraceRecord

Keep the existing trace fields for requested accelerators alongside
the new resourceAllocation field for allocated resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

* Populate accelerator trace fields from task config and update tests

Set accelerator request count and type in the trace record from
the process accelerator directive. Add corresponding test coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

---------

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Chris Hakkaart <chris.hakkaart@seqera.io>
Co-authored-by: Clint Valentine <valentine.clint@gmail.com>
Co-authored-by: Jorge Ejarque <jorgee@users.noreply.github.com>
Co-authored-by: Ben Sherman <bentshermann@gmail.com>
Co-authored-by: Adam Talbot <12817534+adamrtalbot@users.noreply.github.com>
3 participants