feat(tasks): server-side task_metadata filter + create-time hook #218

Merged
declan-scale merged 5 commits into main from declan-scale/metadata-filtering
May 5, 2026

Conversation

Collaborator

@declan-scale declan-scale commented May 4, 2026

Summary

Two related changes to the tasks surface so the egp-annotation /agents/custom UI can stop fetch-then-filtering and stop chaining task/create → event/send → PUT /tasks/{id}:

  • Server-side filtering on GET /tasks. Adds a task_metadata query param (JSON-encoded object, applied as a Postgres JSONB @> containment filter) and a status query param. Adds a GIN (jsonb_path_ops) index on tasks.task_metadata so containment lookups stay fast at scale. Threaded through repository → service → use case → route. Rejects malformed JSON, non-objects, and empty objects with 400.
  • Create-time metadata on the task/create RPC. CreateTaskRequest now accepts task_metadata, persisted on the task row at creation. Only stamped at creation — re-issuing task/create with the same name does not overwrite metadata; use PUT /tasks/{id} for updates.
  • Latent ACP leak fix (uncovered while wiring create-time metadata). Pre-existing code embedded the full TaskEntity into every ACP-bound payload, which would now leak task_metadata to the agent. Added a _task_for_acp(task) helper in agent_acp_service.py that returns a model_copy(update={"task_metadata": None}), applied at all 5 ACP-payload construction sites: create_task, send_message, send_message_stream, cancel_task, send_event.
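
The validation rules in the first bullet can be sketched without any web framework. This is a minimal illustration, not the PR's route code: `BadRequest` and `parse_task_metadata_filter` are hypothetical names, and the real handler presumably raises the framework's own HTTP 400 error.

```python
import json
from typing import Any, Optional


class BadRequest(ValueError):
    """Hypothetical stand-in for an HTTP 400 response."""

    status_code = 400


def parse_task_metadata_filter(raw: Optional[str]) -> Optional[dict[str, Any]]:
    """Decode the task_metadata query param into a containment filter.

    Returns None when the param is absent; raises BadRequest for malformed
    JSON, non-object JSON, and the empty object.
    """
    if raw is None:
        return None
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise BadRequest(f"task_metadata is not valid JSON: {exc.msg}") from exc
    if not isinstance(value, dict):
        raise BadRequest("task_metadata must be a JSON object")
    if not value:
        # '{}' is contained by every JSON object, so it would not narrow the result.
        raise BadRequest("task_metadata must not be empty")
    return value
```

Doing the checks before the filter reaches the repository keeps the `@>` query from ever seeing a value that is not a non-empty object.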

Test plan

  • Unit: test_list_with_join_filters_by_task_metadata (JSONB containment at the repo).
  • Unit: TestTasksUseCaseListTasks::test_list_tasks_forwards_task_metadata_filter (use-case threading).
  • Integration: 6 new GET /tasks cases — task_metadata happy path, malformed JSON → 400, empty {} → 400, non-object JSON → 400, status filter, invalid status enum → 422.
  • Unit: test_handle_task_create_persists_task_metadata (task/create persists metadata, ACP payload omits it, raw user value never appears in the JSON).
  • Unit: test_handle_task_create_ignores_task_metadata_for_existing_task — second task/create with the same name does NOT overwrite the row's metadata.
  • Unit: TestACPPayloadScrubsTaskMetadata — one test per ACP method (create_task, send_message, send_message_stream, cancel_task, send_event) asserting payload.params.task.task_metadata is None and the user value is absent from the serialized payload.
  • All 150 tests pass across the touched suites.
  • Migration applied locally; \d+ tasks confirms ix_tasks_metadata_gin gin (task_metadata jsonb_path_ops).
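
For readers unfamiliar with `@>`, the containment semantics that the repository-level test exercises can be approximated in plain Python. This is an illustrative model only: `jsonb_contains` is a hypothetical helper, the real filtering happens in Postgres, and edge cases may differ from the database's behavior.

```python
from typing import Any


def _scalar_equal(a: Any, b: Any) -> bool:
    # JSON booleans and numbers are distinct types, unlike Python's True == 1.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def jsonb_contains(doc: Any, flt: Any) -> bool:
    """Approximate `doc @> flt` for values already decoded from JSON."""
    if isinstance(flt, dict):
        # Every key in the filter must exist in the document with a contained value.
        return isinstance(doc, dict) and all(
            key in doc and jsonb_contains(doc[key], value)
            for key, value in flt.items()
        )
    if isinstance(flt, list):
        # Every filter element must be contained in some document element;
        # order and duplicates are ignored.
        return isinstance(doc, list) and all(
            any(jsonb_contains(item, wanted) for item in doc) for wanted in flt
        )
    if isinstance(doc, (dict, list)):
        return False
    return _scalar_equal(doc, flt)


row = {"created_by_user_id": "user-123", "tags": ["a", "b"], "source": "ui"}
assert jsonb_contains(row, {"created_by_user_id": "user-123"})
assert jsonb_contains(row, {"created_by_user_id": "user-123", "tags": ["b"]})
assert not jsonb_contains(row, {"created_by_user_id": "someone-else"})
```

The filter matches when the row's metadata is a superset of it, so extra keys on the row (`source` here) do not prevent a match.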

Follow-ups (out of scope here)

  • SDK regen via Stainless once this merges.
  • egp-annotation: drop the client-side task_metadata.created_by_user_id filter in useListCustomAgentTasks; drop the chained PUT /tasks/{id} in useCreateCustomAgentTask.
  • Auth-layer scoping (longer-term — pushes per-user filtering into agentex-auth so callers don't have to ask for it).
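
Once the server-side filter is available, a caller can express the per-user filter in the request itself. A minimal sketch of building that URL follows; the host and the user ID are placeholders, and `build_tasks_url` is a hypothetical helper rather than part of any SDK.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode


def build_tasks_url(
    base_url: str,
    task_metadata: Optional[dict[str, Any]] = None,
    status: Optional[str] = None,
) -> str:
    """Build a GET /tasks URL with the optional filters as query params."""
    params: dict[str, str] = {}
    if task_metadata:
        # The filter travels as one JSON-encoded string; urlencode escapes it.
        params["task_metadata"] = json.dumps(task_metadata, separators=(",", ":"))
    if status is not None:
        params["status"] = status
    query = urlencode(params)
    return f"{base_url}/tasks?{query}" if query else f"{base_url}/tasks"


url = build_tasks_url(
    "https://agentex.example.invalid",  # placeholder host
    task_metadata={"created_by_user_id": "user-123"},  # placeholder value
    status="RUNNING",
)
```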

🤖 Generated with Claude Code

Greptile Summary

  • Adds server-side task_metadata (JSONB @> containment) and status filters to GET /tasks, threaded through route → use case → service → repository, with a new GIN index and proper 400 validation for malformed/empty/non-object inputs. The previously-flagged P1 (status=DELETED silently returning an empty list) is now explicitly rejected with a 400.
  • Adds task_metadata to the task/create RPC, stamped at creation only — re-issuing with the same task name is idempotent and does not overwrite existing metadata.
  • task_metadata is forwarded to ACP agents in all 5 payload sites for backward compatibility; the schema description accurately reflects this, though inline code comments in task_service.py and agents_acp_use_case.py still say "not forwarded" (covered by prior review threads).

Confidence Score: 5/5

Safe to merge — no new P0 or P1 bugs; the previously-flagged P1 (DELETED status silent empty list) is now explicitly fixed.

All previously identified P1 issues are addressed. The filter pipeline is correctly parameterized (no SQL injection risk), the GIN index uses the right operator class for @>, idempotency is tested, and validation rejects all malformed inputs. Remaining comment inaccuracies in task_service.py and agents_acp_use_case.py are P2 and were already surfaced in prior review threads.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| agentex/src/api/routes/tasks.py | Adds status and task_metadata query params with robust validation (malformed JSON, non-object, empty, and DELETED guard); the previously-flagged P1 silent-empty-list bug for status=DELETED is now explicitly rejected with a 400. |
| agentex/src/domain/repositories/task_repository.py | Adds JSONB containment filter (@>) for task_metadata using SQLAlchemy's .contains(), correctly placed before the status != DELETED guard; uses is not None consistently. |
| agentex/src/domain/services/task_service.py | Threads task_metadata through create_task and list_tasks; list_tasks now builds task_filters incrementally (id + status) and uses task_filters or None to match prior behavior when both are absent. |
| agentex/src/domain/use_cases/agents_acp_use_case.py | Passes task_metadata from CreateTaskRequest into _get_or_create_task → create_task; idempotency logic correctly stamps metadata only at first creation and ignores it on subsequent calls with the same name. |
| agentex/database/migrations/alembic/versions/2026_05_04_1111_add_tasks_metadata_gin_index_e9c4ff9e6542.py | Adds GIN (jsonb_path_ops) index on tasks.task_metadata idempotently (CREATE INDEX IF NOT EXISTS); jsonb_path_ops is the correct operator class for @> containment queries. |
| agentex/tests/integration/api/tasks/test_tasks_api.py | 6 new integration tests covering task_metadata containment happy-path, malformed JSON, empty object, non-object, status filter, invalid enum, and DELETED guard. |
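
The migration entry above describes an idempotent GIN index. A sketch of the statements such a migration would run is shown here; the index, table, column, and operator class names come from this PR, while the helper functions are hypothetical and the actual migration file may be structured differently. In an Alembic migration these strings would typically be passed to `op.execute(...)`.

```python
INDEX_NAME = "ix_tasks_metadata_gin"


def upgrade_sql() -> str:
    """DDL for the containment index; IF NOT EXISTS makes re-runs harmless."""
    return (
        f"CREATE INDEX IF NOT EXISTS {INDEX_NAME} "
        "ON tasks USING gin (task_metadata jsonb_path_ops)"
    )


def downgrade_sql() -> str:
    """DDL that removes the index again (assumed; not shown in the PR text)."""
    return f"DROP INDEX IF EXISTS {INDEX_NAME}"
```

`jsonb_path_ops` supports `@>` but not the key-existence operators (`?`, `?|`, `?&`), so this index helps containment lookups only.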

Sequence Diagram

sequenceDiagram
    participant Client
    participant Route as tasks.py (route)
    participant UC as TasksUseCase
    participant Svc as AgentTaskService
    participant Repo as TaskRepository

    Client->>Route: GET /tasks?task_metadata={...}&status=RUNNING
    Route->>Route: json.loads(task_metadata), validate dict non-empty, reject DELETED
    Route->>UC: list_tasks(status=RUNNING, task_metadata={...})
    UC->>Svc: list_tasks(status=RUNNING, task_metadata={...})
    Svc->>Svc: build task_filters {id, status}
    Svc->>Repo: list_with_join(task_filters, task_metadata)
    Repo->>Repo: WHERE task_metadata @> :filter AND status != DELETED AND status = RUNNING
    Repo-->>Client: 200 [TaskResponse, ...]

    Client->>Route: POST /agents/rpc task/create + task_metadata
    Route->>UC: _handle_task_create(params)
    UC->>UC: _get_or_create_task(task_metadata=params.task_metadata)
    alt task does not exist
        UC->>Svc: create_task(task_metadata=...)
        Svc->>Repo: create TaskEntity with task_metadata
        Repo-->>UC: TaskEntity (metadata stamped)
    else task already exists (same name)
        UC-->>UC: return existing task (metadata NOT overwritten)
    end
    UC->>ACP: payload includes task_metadata (backward compat)
    UC-->>Client: TaskEntity
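
The `alt` branch in the diagram is the part that makes create-time metadata idempotent. A minimal in-memory sketch of that rule follows; `Task`, `InMemoryTasks`, and `get_or_create` are hypothetical stand-ins for the entity, the repository, and `_get_or_create_task`, not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Task:
    name: str
    task_metadata: Optional[dict[str, Any]] = None


class InMemoryTasks:
    """Hypothetical store keyed by task name."""

    def __init__(self) -> None:
        self._by_name: dict[str, Task] = {}

    def get_or_create(
        self, name: str, task_metadata: Optional[dict[str, Any]] = None
    ) -> Task:
        existing = self._by_name.get(name)
        if existing is not None:
            # Metadata is stamped at creation only; later calls ignore it.
            return existing
        task = Task(name=name, task_metadata=task_metadata)
        self._by_name[name] = task
        return task


tasks = InMemoryTasks()
first = tasks.get_or_create("demo", {"created_by_user_id": "user-123"})
second = tasks.get_or_create("demo", {"created_by_user_id": "someone-else"})
assert second is first
assert second.task_metadata == {"created_by_user_id": "user-123"}
```

Changing metadata after creation stays a separate, explicit operation (PUT /tasks/{id} in this PR), so a retried task/create cannot clobber it.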

Comments Outside Diff (1)

  1. agentex/src/api/routes/tasks.py, lines 74-77

    P1 DELETED status silently returns an empty list

    TaskStatus (from src.api.schemas.tasks) includes DELETED, so a caller can issue GET /tasks?status=DELETED. Inside list_with_join the hard-coded WHERE status != DELETED is already baked into the query before list() appends WHERE status = DELETED from task_filters. The resulting query has contradictory predicates and always returns 0 rows with a 200 OK — instead of returning DELETED tasks or signalling an invalid filter. Either exclude DELETED from the query-parameter enum type or add an explicit guard that rejects it with a 400.
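
    The contradiction is easy to reproduce outside SQL. The sketch below models the two predicates over an in-memory list and shows the guard option from the comment. The `TaskStatus` enum here is a reduced, assumed version of the real one, and `BadRequest`, `list_tasks`, and `validate_status_filter` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class TaskStatus(str, Enum):
    # Assumed subset of the real enum, for illustration only.
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    DELETED = "DELETED"


class BadRequest(ValueError):
    """Hypothetical stand-in for an HTTP 400 response."""


def list_tasks(rows: list[TaskStatus], status: Optional[TaskStatus]) -> list[TaskStatus]:
    # Mirrors the query shape: status != DELETED is always applied,
    # and the caller's status filter is ANDed on top of it.
    visible = [s for s in rows if s != TaskStatus.DELETED]
    return visible if status is None else [s for s in visible if s == status]


def validate_status_filter(status: Optional[TaskStatus]) -> Optional[TaskStatus]:
    # Guard option: fail loudly instead of returning an always-empty page.
    if status == TaskStatus.DELETED:
        raise BadRequest("status=DELETED cannot be used as a filter")
    return status


rows = [TaskStatus.RUNNING, TaskStatus.DELETED, TaskStatus.COMPLETED]
assert list_tasks(rows, TaskStatus.DELETED) == []  # contradictory predicates
assert list_tasks(rows, TaskStatus.RUNNING) == [TaskStatus.RUNNING]
```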

Reviews (4): Last reviewed commit: "fix(tasks): updated comments to reflect ..."

declan-scale and others added 2 commits May 4, 2026 15:02
…tus)

Adds server-side query parameters to GET /tasks so callers no longer have to
fetch-then-filter:

- task_metadata: JSON-encoded object applied as a JSONB containment filter
  (TaskORM.task_metadata @> :value), threaded through repository → service →
  use case → route. Rejects malformed JSON, non-objects, and empty objects
  with 400.
- status: filter by TaskStatus enum value (RUNNING, COMPLETED, etc.).
- ix_tasks_metadata_gin: GIN index using jsonb_path_ops on tasks.task_metadata
  to keep containment lookups fast at scale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@declan-scale declan-scale requested a review from a team as a code owner May 4, 2026 19:04
Comment thread agentex/src/domain/repositories/task_repository.py Outdated
Collaborator

@danielmillerp danielmillerp left a comment

lgtm just a couple comments

logger = make_logger(__name__)


def _task_for_acp(task: TaskEntity) -> TaskEntity:
Collaborator

doesn't seem like that big of a deal to have this get to the agent, but this is fine!

Collaborator

should be fine, but maybe worth a quick check that no one uses it for their agents, since you could set it via PUT before

Collaborator Author

That's a good point. Will allow the pass through so that nothing breaks.

Collaborator Author

@danielmillerp restored original behavior

Comment thread agentex/src/api/schemas/agents_rpc.py
@declan-scale declan-scale merged commit 67d1901 into main May 5, 2026
30 checks passed
@declan-scale declan-scale deleted the declan-scale/metadata-filtering branch May 5, 2026 15:59