
IA-4870: Slow forms api (cont.)#2824

Merged
Phil-V merged 4 commits into develop from
IA-4870-slow-forms-api-again
Mar 24, 2026

@Phil-V Phil-V commented Mar 17, 2026

What problem is this PR solving?

Improve performance of the forms API in the forms listing and when called from the submissions list.

Related JIRA tickets

IA-4870

Changes

  • Fix an issue where annotations caused a Cartesian product and unnecessary joins
  • Remove an unnecessary prefetch (thanks to Stéphan's previous optimizations)
    • This should fix the slow query originally reported by Sentry
  • Add a command to create a composite index for improved "latest submission" performance.
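
To illustrate the first point, here is a minimal sketch of how aggregating over two one-to-many joins at once inflates a count, and how a correlated subquery avoids it. It uses an in-memory SQLite database as a stand-in for PostgreSQL; the `form`, `mapping` and `instance` tables are simplified, hypothetical versions of the Iaso tables, not the real schema.

```python
import sqlite3

# Toy schema (hypothetical): one form with 2 mappings and 3 instances.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE form (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mapping (id INTEGER PRIMARY KEY, form_id INTEGER);
    CREATE TABLE instance (id INTEGER PRIMARY KEY, form_id INTEGER, updated_at TEXT);
    INSERT INTO form VALUES (1, 'census');
    INSERT INTO mapping (form_id) VALUES (1), (1);
    INSERT INTO instance (form_id, updated_at) VALUES
        (1, '2026-01-01'), (1, '2026-02-01'), (1, '2026-03-01');
    """
)

# Joining both one-to-many tables pairs every mapping with every instance,
# so COUNT(mapping.id) sees 2 x 3 = 6 rows instead of 2.
inflated = conn.execute(
    """
    SELECT COUNT(mapping.id), MAX(instance.updated_at)
    FROM form
    LEFT JOIN mapping ON mapping.form_id = form.id
    LEFT JOIN instance ON instance.form_id = form.id
    GROUP BY form.id
    """
).fetchone()

# Joining only mapping and fetching the latest instance through a correlated
# subquery keeps the row count per form equal to the number of mappings.
correct = conn.execute(
    """
    SELECT COUNT(mapping.id),
           (SELECT instance.updated_at FROM instance
            WHERE instance.form_id = form.id
            ORDER BY instance.updated_at DESC LIMIT 1)
    FROM form
    LEFT JOIN mapping ON mapping.form_id = form.id
    GROUP BY form.id
    """
).fetchone()

print(inflated)  # (6, '2026-03-01')
print(correct)   # (2, '2026-03-01')
```

Both queries agree on the latest `updated_at`; only the count differs, which is why this kind of issue can go unnoticed until the joined tables grow large.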

How to test

  • Visit the forms page and check that the forms load correctly (also check with all columns set to visible)
  • Check the form submissions page forms dropdown.

Print screen / video

/

Notes

  • I will add a Celery task for WFP in a separate ticket / PR to create the index on Instances

Doc

/


Phil-V commented Mar 17, 2026

Hey @beygorghor, you were right to be skeptical: my previous PR on this issue didn't address the slow performance on the Forms page itself. This PR should be a substantial improvement.

I don't think there's much leeway for more optimisation (see attached query plan), but there's always the possibility of adding an extra index (see PR notes) 👀
queryplanforms.json

SELECT DISTINCT
    "iaso_form"."id", "iaso_form"."deleted_at", "iaso_form"."form_id",
    "iaso_form"."created_at", "iaso_form"."updated_at", "iaso_form"."name",
    "iaso_form"."device_field", "iaso_form"."location_field",
    "iaso_form"."correlation_field", "iaso_form"."correlatable",
    "iaso_form"."possible_fields", "iaso_form"."period_type",
    "iaso_form"."single_per_period", "iaso_form"."periods_before_allowed",
    "iaso_form"."periods_after_allowed", "iaso_form"."derived", "iaso_form"."uuid",
    "iaso_form"."label_keys", "iaso_form"."legend_threshold",
    "iaso_form"."change_request_mode",
    COUNT("iaso_mapping"."id") AS "mapping_count",
    CASE WHEN COUNT("iaso_mapping"."id") > 0 THEN true ELSE false END AS "has_mappings",
    (SELECT U0."updated_at"
     FROM "iaso_instance" U0
     WHERE U0."form_id" = ("iaso_form"."id")
     ORDER BY U0."updated_at" DESC
     LIMIT 1) AS "instance_updated_at",
    (SELECT U0."id"
     FROM "iaso_formversion" U0
     WHERE U0."form_id" = ("iaso_form"."id")
     ORDER BY U0."created_at" DESC
     LIMIT 1) AS "latest_version_id"
FROM "iaso_form"
INNER JOIN "iaso_project_forms" ON ("iaso_form"."id" = "iaso_project_forms"."form_id")
INNER JOIN "iaso_project" ON ("iaso_project_forms"."project_id" = "iaso_project"."id")
INNER JOIN "iaso_project_forms" T5 ON ("iaso_form"."id" = T5."form_id")
LEFT OUTER JOIN "iaso_mapping" ON ("iaso_form"."id" = "iaso_mapping"."form_id")
WHERE ("iaso_form"."deleted_at" IS NULL
       AND "iaso_project"."account_id" = 2
       AND T5."project_id" IN (2))
GROUP BY "iaso_form"."id"
ORDER BY "iaso_form"."name"

CC @madewulf

@Phil-V Phil-V marked this pull request as ready for review March 17, 2026 11:32
@Phil-V Phil-V marked this pull request as draft March 17, 2026 12:45
@Phil-V Phil-V force-pushed the IA-4870-slow-forms-api-again branch from d510be2 to c452694 Compare March 19, 2026 13:50
@Phil-V Phil-V marked this pull request as ready for review March 19, 2026 14:34
@Phil-V Phil-V marked this pull request as draft March 19, 2026 16:04

Phil-V commented Mar 23, 2026

Performance testing for "latest submission"

Trying a few different approaches with 3 million instances:

# Initial approach: aggregate over a join on the instance table.
# Note: this has the side effect of dramatically slowing down the formversion
# prefetch query, because it adds a join on the instance table to the queryset.
from django.db.models import Max

queryset = queryset.annotate(instance_updated_at=Max("instances__updated_at"))

Total request time: 14.959s with 14 queries

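# Correlated subquery: newest instance per form via ORDER BY ... LIMIT 1
from django.db.models import OuterRef, Subquery
latest_instance = Instance.objects.filter(form=OuterRef("pk")).order_by("-updated_at")
queryset = queryset.annotate(instance_updated_at=Subquery(latest_instance.values("updated_at")[:1]))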

Total request time: 23.708s with 14 queries

# Correlated subquery using an aggregate (MAX) instead of ORDER BY ... LIMIT 1
max_updated_sq = (
    Instance.objects.filter(form_id=OuterRef("id"))
    .values("form_id")
    .annotate(max_up=Max("updated_at"))
    .values("max_up")
)
queryset = queryset.annotate(
    instance_updated_at=Subquery(max_updated_sq)
)

Total request time: 13.914s with 14 queries

# CTE (django-cte) computing MAX(updated_at) per form, LEFT OUTER JOINed to the forms queryset.
# Note: decently fast, but performs a sequential scan on the instance table
from django.db.models.sql.constants import LOUTER
from django_cte import With

instance_cte = With(
    Instance.objects.values("form_id").annotate(max_updated=Max("updated_at")),
    name="instance_max_updated_cte",
)
queryset = instance_cte.join(queryset, id=instance_cte.col.form_id, _join_type=LOUTER)
queryset = queryset.with_cte(instance_cte).annotate(instance_updated_at=instance_cte.col.max_updated)

Total request time: 1.677s with 14 queries
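
As a sanity check on the comparison, the four approaches are meant to compute the same value per form and differ only in how the database gets there. Below is a rough sketch of the four SQL shapes on an in-memory SQLite database; the table layout is a simplified, hypothetical stand-in for the real models, and the SQL is hand-written rather than what the Django ORM actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE form (id INTEGER PRIMARY KEY);
    CREATE TABLE instance (id INTEGER PRIMARY KEY, form_id INTEGER, updated_at TEXT);
    INSERT INTO form VALUES (1), (2);  -- form 2 has no instances
    INSERT INTO instance (form_id, updated_at) VALUES
        (1, '2026-01-01'), (1, '2026-03-01'), (1, '2026-02-01');
    """
)

QUERIES = {
    # 1. Aggregate over a join (Max("instances__updated_at"))
    "join_max": """
        SELECT form.id, MAX(instance.updated_at)
        FROM form LEFT JOIN instance ON instance.form_id = form.id
        GROUP BY form.id ORDER BY form.id
    """,
    # 2. Correlated subquery with ORDER BY ... LIMIT 1
    "ordered_subquery": """
        SELECT form.id,
               (SELECT instance.updated_at FROM instance
                WHERE instance.form_id = form.id
                ORDER BY instance.updated_at DESC LIMIT 1)
        FROM form ORDER BY form.id
    """,
    # 3. Correlated subquery with MAX
    "max_subquery": """
        SELECT form.id,
               (SELECT MAX(instance.updated_at) FROM instance
                WHERE instance.form_id = form.id
                GROUP BY instance.form_id)
        FROM form ORDER BY form.id
    """,
    # 4. CTE aggregated once, then LEFT JOINed to the forms
    "cte": """
        WITH instance_max AS (
            SELECT form_id, MAX(updated_at) AS max_updated
            FROM instance GROUP BY form_id
        )
        SELECT form.id, instance_max.max_updated
        FROM form LEFT JOIN instance_max ON instance_max.form_id = form.id
        ORDER BY form.id
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
for name, rows in results.items():
    print(name, rows)  # each prints [(1, '2026-03-01'), (2, None)]
```

Since the results match, the choice between them comes down to the query plan, which is what the timings here measure.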

After adding a composite index

CREATE INDEX iaso_instance_updated_at_form_id_composite ON iaso_instance (form_id, updated_at);

# Correlated subquery: newest instance per form via ORDER BY ... LIMIT 1
latest_instance = Instance.objects.filter(form=OuterRef("pk")).order_by("-updated_at")
queryset = queryset.annotate(instance_updated_at=Subquery(latest_instance.values("updated_at")[:1]))

Total request time: 0.428s with 14 queries
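
A plausible reason the `ORDER BY ... LIMIT 1` subquery benefits most from the index is that, with `(form_id, updated_at)` indexed together, the newest row for a form can be read straight from the index without scanning and sorting. The sketch below shows the effect with SQLite's `EXPLAIN QUERY PLAN`; PostgreSQL's planner and plan output differ, and the table and index names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance (id INTEGER PRIMARY KEY, form_id INTEGER, updated_at TEXT)"
)
# 1000 toy rows spread over 10 forms
conn.executemany(
    "INSERT INTO instance (form_id, updated_at) VALUES (?, ?)",
    [(i % 10, f"2026-01-{(i % 28) + 1:02d}") for i in range(1000)],
)

LATEST = (
    "SELECT updated_at FROM instance "
    "WHERE form_id = ? ORDER BY updated_at DESC LIMIT 1"
)


def plan(sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail strings joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(LATEST, (3,))
latest_before = conn.execute(LATEST, (3,)).fetchone()

# Composite index: equality column first, then the ordering column
conn.execute(
    "CREATE INDEX instance_form_id_updated_at ON instance (form_id, updated_at)"
)

plan_after = plan(LATEST, (3,))
latest_after = conn.execute(LATEST, (3,)).fetchone()

print("before:", plan_before)  # full table scan plus a sort step
print("after: ", plan_after)   # search using instance_form_id_updated_at
```

The column order matters: with `form_id` first, all rows for one form sit next to each other in the index, already sorted by `updated_at`.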

# Initial approach: aggregate over a join on the instance table
queryset = queryset.annotate(instance_updated_at=Max("instances__updated_at"))

Total request time: 1.941s with 14 queries

# Correlated subquery using an aggregate (MAX)
max_updated_sq = (
    Instance.objects.filter(form_id=OuterRef("id"))
    .values("form_id")
    .annotate(max_up=Max("updated_at"))
    .values("max_up")
)
queryset = queryset.annotate(
    instance_updated_at=Subquery(max_updated_sq)
)

Total request time: 1.098s with 14 queries

# CTE computing MAX(updated_at) per form, LEFT OUTER JOINed to the forms queryset
instance_cte = With(
    Instance.objects.values("form_id").annotate(max_updated=Max("updated_at")),
    name="instance_max_updated_cte",
)
queryset = instance_cte.join(queryset, id=instance_cte.col.form_id, _join_type=LOUTER)
queryset = queryset.with_cte(instance_cte).annotate(instance_updated_at=instance_cte.col.max_updated)

Total request time: 0.840s with 14 queries

@Phil-V Phil-V marked this pull request as ready for review March 23, 2026 07:37
@tdethier tdethier self-requested a review March 23, 2026 10:46

@tdethier tdethier left a comment


Before testing this PR, loading forms with all extra columns took ~3.5 seconds (my DB is way smaller - I have only 181k submissions).

I loaded forms again, using this PR:

  • without index: ~4 seconds
  • with index: ~3 seconds

The improvement is not as big as I had expected, but maybe it's related to the fact that I have fewer instances.

Anyway, kudos for the nice work and analysis, I appreciate that 👍

@Phil-V Phil-V merged commit cef8eb7 into develop Mar 24, 2026
13 checks passed
@Phil-V Phil-V deleted the IA-4870-slow-forms-api-again branch March 24, 2026 15:57
@quang-le quang-le added the "ok for release", "user tested" (Has already been tested on staging) and "Released" labels and removed the "ok for release" label Apr 7, 2026

3 participants