Skip to content

feat: Add opaque pagination#231

Open
madpah wants to merge 5 commits into
mainfrom
feat/opaque-result-pagination
Open

feat: Add opaque pagination#231
madpah wants to merge 5 commits into
mainfrom
feat/opaque-result-pagination

Conversation

@madpah
Copy link
Copy Markdown
Collaborator

@madpah madpah commented Mar 12, 2026

This PR (currently) adds Opaque Pagination to the /products endpoint as described in #179 - for review and comment before rolling it out to the other endpoints.

…iscussion)

Signed-off-by: Paul Horton <phorton@sonatype.com>
@madpah madpah requested a review from oej as a code owner March 12, 2026 08:55
@madpah madpah self-assigned this Mar 12, 2026
@madpah madpah added the Prio 1 label Mar 12, 2026
Signed-off-by: Paul Horton <phorton@sonatype.com>
Comment thread spec/openapi.yaml Outdated
Comment thread spec/openapi.yaml
@madpah madpah requested review from oej and ppkarwasz March 12, 2026 09:38
`reuslts` is now required in Products Response

Signed-off-by: Paul Horton <phorton@sonatype.com>
Signed-off-by: Paul Horton <phorton@sonatype.com>
@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 12, 2026

Pagination has been added to the following endpoints:

  • /product/{uuid}/releases
  • /productReleases
  • /products
  • /component/{uuid}/releases
  • /components
  • /componentReleases
  • /componentRelease/{uuid}/collections
  • /productRelease/{uuid}/collections

FYI @ppkarwasz @oej

Copy link
Copy Markdown
Contributor

@taleodor taleodor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @madpah , great work there.

I believe we should bring back all instances of id-type and id-value filters as those are not filters and not part of pagination. In relation to pagination, the server would need to take into account whether specific filtering was used. If that gets too complex, maybe we need a separate conversation on this.

Also we may need extra nesting due to behaviour of some generators, I'll try to test this over weekend based on your changes - essentially, at some point of TEA iterations, I was getting things like "inlinePaginationClass" without clear attribution to what class exactly was paginated. I'll get back on this but need a little extra time.

Comment thread spec/openapi.yaml
Comment thread spec/openapi.yaml
Comment thread spec/openapi.yaml
Comment thread spec/openapi.yaml
@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 13, 2026

Hi @madpah , great work there.

I believe we should bring back all instances of id-type and id-value filters as those are not filters and not part of pagination. In relation to pagination, the server would need to take into account whether specific filtering was used. If that gets too complex, maybe we need a separate conversation on this.

Also we may need extra nesting due to behaviour of some generators, I'll try to test this over weekend based on your changes - essentially, at some point of TEA iterations, I was getting things like "inlinePaginationClass" without clear attribution to what class exactly was paginated. I'll get back on this but need a little extra time.

Understood - the purpose of those fields was not immediately clear - so I'll read add and aim to make it clear they are about result filtering and NOT pagination.

Signed-off-by: Paul Horton <phorton@sonatype.com>
@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 13, 2026

Hi @madpah , great work there.

I believe we should bring back all instances of id-type and id-value filters as those are not filters and not part of pagination. In relation to pagination, the server would need to take into account whether specific filtering was used. If that gets too complex, maybe we need a separate conversation on this.

Also we may need extra nesting due to behaviour of some generators, I'll try to test this over weekend based on your changes - essentially, at some point of TEA iterations, I was getting things like "inlinePaginationClass" without clear attribution to what class exactly was paginated. I'll get back on this but need a little extra time.

FWIW - the generated libs from this schema seem fine and work for me - if you see issues, let me know - might require a fix upstream in OpenAPI Generator 😉

@taleodor
Copy link
Copy Markdown
Contributor

So the issue I mentioned is happening here, see below list of classes for Java server generator and notice TeaInlineObject ones:

-a----         3/15/2026   1:14 PM           8539 TeaArtifact.java
-a----         3/15/2026   1:14 PM           5807 TeaArtifactFormat.java
-a----         3/15/2026   1:14 PM           1629 TeaArtifactType.java
-a----         3/15/2026   1:14 PM           3141 TeaChecksum.java
-a----         3/15/2026   1:14 PM           1577 TeaChecksumType.java
-a----         3/15/2026   1:14 PM           4234 TeaCle.java
-a----         3/15/2026   1:14 PM           2842 TeaCleDefinitions.java
-a----         3/15/2026   1:14 PM          13734 TeaCleEvent.java
-a----         3/15/2026   1:14 PM           1611 TeaCleEventType.java
-a----         3/15/2026   1:14 PM           3992 TeaCleSupportDefinition.java
-a----         3/15/2026   1:14 PM           3209 TeaCleVersionSpecifier.java
-a----         3/15/2026   1:14 PM           7442 TeaCollection.java
-a----         3/15/2026   1:14 PM           1466 TeaCollectionBelongsToType.java
-a----         3/15/2026   1:14 PM           3275 TeaCollectionUpdateReason.java
-a----         3/15/2026   1:14 PM           1541 TeaCollectionUpdateReasonType.java
-a----         3/15/2026   1:14 PM           1995 TeaComplianceDocumentType.java
-a----         3/15/2026   1:14 PM           4241 TeaComponent.java
-a----         3/15/2026   1:14 PM           3556 TeaComponentRef.java
-a----         3/15/2026   1:14 PM           3878 TeaComponentReleaseWithCollection.java
-a----         3/15/2026   1:14 PM           3970 TeaDiscoveryInfo.java
-a----         3/15/2026   1:14 PM           2530 TeaErrorResponse.java
-a----         3/15/2026   1:14 PM           3137 TeaIdentifier.java
-a----         3/15/2026   1:14 PM           1387 TeaIdentifierType.java
-a----         3/15/2026   1:14 PM           4876 TeaInlineObject.java
-a----         3/15/2026   1:14 PM           4819 TeaInlineObject1.java
-a----         3/15/2026   1:14 PM           4784 TeaInlineObject2.java
-a----         3/15/2026   1:14 PM           4770 TeaInlineObject3.java
-a----         3/15/2026   1:14 PM           4791 TeaInlineObject4.java
-a----         3/15/2026   1:14 PM           3765 TeaPaginationDetails.java
-a----         3/15/2026   1:14 PM           4279 TeaProduct.java
-a----         3/15/2026   1:14 PM          10530 TeaProductRelease.java
-a----         3/15/2026   1:14 PM          10079 TeaRelease.java
-a----         3/15/2026   1:14 PM           7266 TeaReleaseDistribution.java
-a----         3/15/2026   1:14 PM           4745 TeaTeaServerInfo.java
-a----         3/15/2026   1:14 PM           1389 TeaUnknownErrorType.java

Steps to reproduce:

npx @openapitools/openapi-generator-cli generate -i openapi.yaml -g spring -o tea-server/ --additional-properties="useSpringBoot3=true,modelNamePrefix=Tea"

Generally, AI should know how to fix this (it'll create some additional nested structures), but let me know if you need input from me.

@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 17, 2026

What's the issue with the TeaInline* objects being generated @taleodor? This is fine in other languages for sure - so not sure of the issue here.

@taleodor
Copy link
Copy Markdown
Contributor

What's the issue with the TeaInline* objects being generated @taleodor? This is fine in other languages for sure - so not sure of the issue here.

Basically, it makes it very hard to produce server code and then makes code unmaintainable because:

  1. These are real classes we need to reference and they refer to specific objects and judging by name you have no idea what they are.
  2. Numbers (TeaInlineObject1, 2, 3...) may change across updates and I'm not even sure if they are consistent across generation over the same schema file - making it nearly impossible to maintain properly.

@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 26, 2026

Yeah - I see the inline objects - but don't understand what the issue is here @taleodor - is it because the naming is not deterministic, or something else?

@taleodor
Copy link
Copy Markdown
Contributor

Yeah - I see the inline objects - but don't understand what the issue is here @taleodor - is it because the naming is not deterministic, or something else?

Yes, essentially we can't build proper server code like that. Like we need to use these classes to build pagination but we can't do it in a meaningful way - it's like working with obfuscated code.

@madpah
Copy link
Copy Markdown
Collaborator Author

madpah commented Mar 30, 2026

Not sure I agree with the assessment @taleodor - I've used this before in both generated server and client code. The only real issue I've experienced is non-deterministic names of Classes which can be a pain.

Do you have a repo or commit somewhere that illustrates your pain that we can reference?

@taleodor
Copy link
Copy Markdown
Contributor

I believe we shouldn't leave it like that in any case since it's public spec and this creates a lot of friction and uncertainty on the implementation side. I'll propose a fix to this PR - should have time over the weekend.

@taleodor
Copy link
Copy Markdown
Contributor

taleodor commented Apr 5, 2026

@madpah I opened #243 against your PR branch which should fix this.

Comment thread spec/openapi.yaml
name: pageToken
description: |
An opaque token used to retrieve the next page of results.
This should be copied exactly from the `nextPageToken` field of a previous response.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we define how pageToken interacts with the other query parameters?

For cursor/token pagination, clients need to know whether pageToken already encapsulates the original query state, including filters, sortField, sortOrder, and possibly pageSize.

For example, if the first request uses:

sortField=createdDate&sortOrder=desc&pageSize=25

and the next request sends:

pageToken=...&sortField=version&sortOrder=asc&pageSize=100

what should the server do?

Possible options:

  1. The server MUST ignore sortField, sortOrder, filters, and pageSize when pageToken is present.
  2. The server MUST return 400 if parameters conflict with the state encoded in pageToken.
  3. The client MUST repeat the same parameters, and mismatch is an invalid request.

I think the spec should choose one behavior before v1.0 to avoid incompatible implementations. Defining this behavior now will ensure forward compatibility and consistent behavior across different TEA server implementations.

Comment thread spec/openapi.yaml
$ref: "#/components/schemas/component"
It must always be supplied in responses.
required:
- hasNext
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we clarify the expected semantics of nextPageToken when hasNext is false?

Right now nextPageToken is required, non-nullable, and described as always supplied. If there is no next page, it is unclear whether the server should return an empty string, a terminal token, or some other value.

Would it be cleaner to make nextPageToken optional and present only when hasNext=true?

For example:

  • hasNext=true: nextPageToken MUST be present.
  • hasNext=false: nextPageToken MUST be absent.

Alternatively, if the intention is to always return a checkpoint token for future polling/sync, maybe that should be modeled separately from a next-page token, because those are slightly different semantics.

Comment thread spec/openapi.yaml
description: |
A flag (to aid clients) to know whether there is a next page of results to fetch.

`nextPageToken` will always be supplied, hence this hint is included to aid clients.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hasNext description currently says that nextPageToken will always be supplied.

If we decide that nextPageToken should only be present when hasNext=true, this description should be updated accordingly. Otherwise, the spec should clarify what value nextPageToken has when hasNext=false.
Making nextPageToken optional when hasNext is false would simplify client-side logic, as consumers wouldn't need to handle 'empty' or 'dummy' tokens for terminal pages.

Comment thread spec/openapi.yaml
- $ref: "#/components/parameters/page-size"
- $ref: "#/components/parameters/page-token"
- $ref: "#/components/parameters/sort-field-collection"
- $ref: "#/components/parameters/sort-order"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define a deterministic tie-breaker for each sortable resource type?

The addition of sortField is helpful, but fields like createdDate, releaseDate, version, and name are not guaranteed to be unique. Without a stable tie-breaker, cursor pagination can still produce duplicate or missing results when multiple records share the same sort value.

A simple rule could be:

Results MUST be ordered by the selected sortField, then by UUID as a deterministic tie-breaker.

This would make token pagination more reliable and easier to implement consistently.

Comment thread spec/openapi.yaml
- $ref: "#/components/parameters/page-size"
- $ref: "#/components/parameters/page-token"
- $ref: "#/components/parameters/sort-field-collection"
- $ref: "#/components/parameters/sort-order"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define a deterministic tie-breaker for each sortable resource type?

The addition of sortField is helpful, but fields like createdDate, releaseDate, version, and name are not guaranteed to be unique. Without a stable tie-breaker, cursor pagination can still produce duplicate or missing results when multiple records share the same sort value.

A simple rule could be:

Results MUST be ordered by the selected sortField, then by UUID as a deterministic tie-breaker.

This would make token pagination more reliable and easier to implement consistently.

Comment thread spec/openapi.yaml
schema:
type: string
enum:
- name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sort-field-component:
enum:
- name

and

sort-field-product:
enum:
- name

If name is the default sort field for products and components, do we also need to specify whether names are unique within a TEA server?

If names are not guaranteed to be unique, token pagination still needs a deterministic secondary ordering, such as UUID. If names are intended to be unique, that constraint should probably be documented in the corresponding product/component schema descriptions.

Comment thread spec/openapi.yaml
enum:
- asc
- desc
default: asc
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed sortOrder defaults to asc globally. Would it be worth considering resource-specific defaults instead?

For security-related resources like product/component releases (often sorted by createdDate) or collections (sorted by version), consumers typically expect the most recent data first (newest-to-oldest).

If a universal asc default is preferred for simplicity, perhaps we could add a small note in the description to help set expectations for implementers/consumers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants