feat[next]: GPU profiling by egparedes · Pull Request #2508 · GridTools/gt4py

egparedes · 2026-03-03T12:10:02Z

Add GPU profiling utilities to gt4py.next so program and compiled-program calls can be annotated via cupyx.profiler (e.g., NVTX/ROCTX ranges) and profiling sessions can be started/stopped around a scope.

Copilot

Pull request overview

Adds GPU profiling utilities to GT4Py Next so program and compiled-program calls can be annotated via cupyx.profiler (e.g., NVTX/ROCTX ranges) and profiling sessions can be started/stopped around a scope.

Changes:

Introduces gt4py.next.instrumentation.gpu_profiler with profile_calls() plus hook-based time-range wrappers for program/compiled-program dispatch.
Adds a CompiledProgramsPool.definition convenience property to support per-program customization (e.g., color_id).
Adds unit/integration tests and updates dependency groups (profiling, dev) for profiler-related tooling.
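
To make the hook-based approach concrete, here is a minimal, self-contained sketch of how a context hook could wrap each program call in a named time range. `ContextHook`, `RecordingTimeRange`, `RECORDED_RANGES`, and `call_program` are hypothetical stand-ins written for this illustration. They are not the `gt4py.next.instrumentation.hooks` API, and the recorder only imitates what an NVTX/ROCTX range would mark.

```python
import contextlib
import time


class ContextHook:
    """Hypothetical registry of context-manager factories wrapped around a call."""

    def __init__(self):
        self._factories = []

    def register(self, factory, index=None):
        # index=0 makes the factory the outermost context, so its range
        # spans everything the other registered hooks do.
        if index is None:
            self._factories.append(factory)
        else:
            self._factories.insert(index, factory)

    def remove(self, factory):
        self._factories.remove(factory)

    @contextlib.contextmanager
    def wrap(self, name):
        # Enter every registered context in order, exit them in reverse.
        with contextlib.ExitStack() as stack:
            for factory in self._factories:
                stack.enter_context(factory(name))
            yield


RECORDED_RANGES = []  # (name, elapsed seconds); stand-in for profiler ranges


class RecordingTimeRange:
    """Hypothetical time range that records its duration instead of emitting NVTX."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        RECORDED_RANGES.append((self.name, time.perf_counter() - self._start))
        return False  # never swallow exceptions from the wrapped call


program_call_context = ContextHook()


def call_program(name, fn, *args):
    """Run fn inside whatever contexts are currently registered."""
    with program_call_context.wrap(name):
        return fn(*args)


program_call_context.register(RecordingTimeRange, index=0)
result = call_program("add_fields", lambda a, b: a + b, 1, 2)
program_call_context.remove(RecordingTimeRange)
```

Registering at `index=0` mirrors the `register(..., index=0)` calls quoted later in the review; in this sketch it simply means the profiling range encloses any other hooks.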

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/gt4py/next/instrumentation/gpu_profiler.py` | New GPU profiling API + hook registration for time ranges and profiling session lifecycle. |
| `src/gt4py/next/otf/compiled_program.py` | Adds `definition` property to compiled program pools for profiler metadata/customization. |
| `src/gt4py/next/instrumentation/hook_machinery.py` | Small rename of `__exit__` parameter for clarity (`type_` → `exc_type`). |
| `tests/next_tests/unit_tests/instrumentation_tests/test_gpu_profiler.py` | Unit tests for fallback path and profiler hook classes. |
| `tests/next_tests/integration_tests/feature_tests/instrumentation_tests/test_gpu_profiler.py` | Integration tests for profiling call lifecycle (hook registration + ctx manager). |
| `tests/next_tests/integration_tests/feature_tests/instrumentation_tests/test_hooks.py` | Runs hook integration test under `gpu_profiler.profile_calls()`. |
| `tests/next_tests/integration_tests/multi_feature_tests/ffront_tests/test_ffront_fvm_nabla.py` | Adds an atlas-based integration test demonstrating profiling usage with `cupyx.profiler`. |
| `tests/next_tests/unit_tests/instrumentation_tests/test_metrics.py` | Adds `from __future__ import annotations`. |
| `pyproject.toml` | Adds `profiling` dependency group and includes it in `dev`. |
| `uv.lock` | Updates lockfile with new dependencies/groups. |

Copilot

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-10T14:36:06Z

+            hooks.program_call_context.register(ProgramCallProfiler, index=0)
+            hooks.compiled_program_call_context.register(CompiledProgramCallProfiler, index=0)
+            _profile_ctx_manager = profile()
+            _profile_ctx_manager.__enter__()


start_profiling_calls() sets _profile_ctx_manager and registers hooks before calling __enter__(). If profile().__enter__() raises, the module can be left in a broken state (hooks registered, _profile_ctx_manager non-None) and subsequent calls won’t retry. Consider only assigning _profile_ctx_manager after a successful __enter__(), and unregistering hooks/resetting state on failure.

Suggested change

-            hooks.program_call_context.register(ProgramCallProfiler, index=0)
-            hooks.compiled_program_call_context.register(CompiledProgramCallProfiler, index=0)
-            _profile_ctx_manager = profile()
-            _profile_ctx_manager.__enter__()
+            ctx_manager = profile()
+            program_call_hook_registered = False
+            compiled_program_call_hook_registered = False
+            try:
+                hooks.program_call_context.register(ProgramCallProfiler, index=0)
+                program_call_hook_registered = True
+                hooks.compiled_program_call_context.register(CompiledProgramCallProfiler, index=0)
+                compiled_program_call_hook_registered = True
+                ctx_manager.__enter__()
+            except Exception:
+                if compiled_program_call_hook_registered:
+                    hooks.compiled_program_call_context.remove(CompiledProgramCallProfiler)
+                if program_call_hook_registered:
+                    hooks.program_call_context.remove(ProgramCallProfiler)
+                raise
+            _profile_ctx_manager = ctx_manager

havogt · 2026-04-15T07:15:57Z

Your conclusion was that we can't recover from this situation anyway, right? Because we don't catch any failure and it will lead to program exit?

egparedes · 2026-04-15

Yes, exactly, that's what I had thought. Looking again at Copilot's suggestion, I think it makes sense to at least clean up the callback state in case the user catches the exception and continues execution. I've added a bit of cleanup code to handle this.

edopao

LGTM

havogt

lgtm, a question and a potential problem.

havogt · 2026-04-15T07:12:42Z

+        with mock.patch.object(gpu_profiler, "profile", return_value=fake_profile):
+            gpu_profiler.start_profiling_calls()
+            gpu_profiler.start_profiling_calls()
+


What about proper nesting of contexts? Wouldn't leaving the inner context already stop everything? From this test I understand that basically inner contexts should be ignored, right?

egparedes · 2026-04-15

The profile_calls() context manager is just a convenience utility that calls start_profiling_calls() and stop_profiling_calls() automatically. It is a simple on/off switch: the first call to start_profiling_calls() starts the profiler and the first call to stop_profiling_calls() stops it. I don't think it's worth adding anything more complex, at least for now. Users who need something more elaborate can still implement their own mechanism, calling start_profiling_calls() and stop_profiling_calls() under the hood.

egparedes · 2026-04-15

Thinking about this again, the nesting behavior of the profile_calls() context manager is probably confusing. I guess we should either document it, make it raise at run time when nested, or just delete it and keep only the start_/stop_ calls. What do you think?

edopao · 2026-04-15T11:57:07Z

LGTM

havogt

lgtm, in case you support the nesting, it's also approved.

Copilot · 2026-04-17T14:44:38Z

 import gt4py.next as gtx
 from gt4py.next import common, Dims, gtfn_cpu, typing as gtx_typing
-from gt4py.next.instrumentation import hooks
+from gt4py.next.instrumentation import gpu_profiler, hooks


gpu_profiler is imported but never used in this test module. This will fail unused-import linting if enabled for tests; please remove the import or use it in an assertion.

Suggested change

-from gt4py.next.instrumentation import gpu_profiler, hooks
+from gt4py.next.instrumentation import hooks

Copilot · 2026-04-17T14:44:39Z

+from gt4py._core import definitions as core_definitions
+from gt4py.next.instrumentation import gpu_profiler, hooks
+
+
+HAS_CUPY = core_definitions.CUPY_DEVICE_TYPE is not None
+
+


core_definitions/HAS_CUPY are declared but never used in this test module. If this is not intended for future skip logic, please remove them to avoid unused-import/unused-variable lint failures.

Suggested change

-from gt4py._core import definitions as core_definitions
-from gt4py.next.instrumentation import gpu_profiler, hooks
-HAS_CUPY = core_definitions.CUPY_DEVICE_TYPE is not None
+from gt4py.next.instrumentation import gpu_profiler, hooks
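
If a flag like `HAS_CUPY` were kept for skip logic, it could gate GPU-only tests roughly as follows. This is a sketch under assumptions: it detects CuPy with `importlib.util.find_spec` instead of `core_definitions.CUPY_DEVICE_TYPE`, uses the standard-library `unittest` skip decorator, and `GpuProfilerSmokeTest` is a hypothetical test case, not one from the PR.

```python
import importlib.util
import unittest

# Assumption: CuPy availability is detected by looking for the package,
# without importing it (so no GPU is needed to evaluate the flag).
HAS_CUPY = importlib.util.find_spec("cupy") is not None


class GpuProfilerSmokeTest(unittest.TestCase):
    @unittest.skipUnless(HAS_CUPY, "CuPy is not installed")
    def test_cupyx_is_available(self):
        # cupyx ships with CuPy, so it should be discoverable when the flag is set.
        self.assertIsNotNone(importlib.util.find_spec("cupyx"))


# Run the test case programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GpuProfilerSmokeTest)
result = unittest.TestResult()
suite.run(result)
```

Without such a use, the flag and its import are dead code, which is the point of Copilot's comment.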

Copilot · 2026-04-17T14:44:39Z

+    """Context manager that enables GPU profiling of GT4Py program calls within its scope."""
+    if not start_profiling_calls():
+        warnings.warn(
+            "GPU profiling of GT4Py program calls is already active."


The warning message concatenates two string literals without whitespace, producing "...already active.Nested...". Please add an explicit space/newline so the emitted UserWarning is readable (and keep the message text stable for tests).

Suggested change

-            "GPU profiling of GT4Py program calls is already active."
+            "GPU profiling of GT4Py program calls is already active. "
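
The effect is easy to reproduce in isolation, because Python joins adjacent string literals with nothing in between. In this small sketch the second literal, "Nested contexts are ignored.", is a hypothetical continuation; only the first sentence and the leading word "Nested" come from the review.

```python
import warnings

# Adjacent literals are concatenated as-is, so the sentences run together.
broken = (
    "GPU profiling of GT4Py program calls is already active."
    "Nested contexts are ignored."  # hypothetical continuation
)

# A trailing space in the first literal keeps the message readable.
fixed = (
    "GPU profiling of GT4Py program calls is already active. "
    "Nested contexts are ignored."  # hypothetical continuation
)

# Capture the warning to inspect the exact text a user would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(fixed, UserWarning)

emitted = str(caught[0].message)
```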

egparedes and others added 11 commits March 3, 2026 13:04

Add profiling with cupy (7fdabf0)
Merge branch 'main' into feature/gpu-traces (897c6e9)
Fix format (561cf38)
WIP debugging (f5250de)
More WIP fixes (3c9fd64)
tt (e56a03d)
Merge branch 'main' into feature/gpu-traces (e8720e6)
QA fixes (64f4eec)
Fix start/stop profile functionality (5a29ad9)
Merge branch 'main' into feature/gpu-traces (0048fd5)
Add tests and documentation (14c441b)

egparedes marked this pull request as ready for review April 10, 2026 13:20

Copilot AI review requested due to automatic review settings April 10, 2026 13:20

Copilot started reviewing on behalf of egparedes April 10, 2026 13:20

Remove debugging test (f99df21)

Copilot AI reviewed Apr 10, 2026

4 comment threads on src/gt4py/next/instrumentation/gpu_profiler.py (all outdated)

egparedes added 2 commits April 10, 2026 16:23

Fixes (3a1f1b7)
Clenaups (1bc09c8)

egparedes requested review from Copilot and havogt April 10, 2026 14:30

Copilot AI reviewed Apr 10, 2026

egparedes added 3 commits April 10, 2026 16:49

More cleanups (27170c0)
Missing renames (2dae766)
Fix tests for python >=3.13 (8efb11f)

egparedes requested a review from edopao April 13, 2026 15:19

edopao reviewed Apr 14, 2026

3 comment threads on src/gt4py/next/instrumentation/gpu_profiler.py (1 outdated)

havogt reviewed Apr 15, 2026

Address review comments (33f9e27)

egparedes requested review from edopao and havogt April 15, 2026 09:57

egparedes added 2 commits April 15, 2026 18:28

Forbid nested profile_calls context (9b3eb77)
Merge branch 'main' into feature/gpu-traces (63fab28)

egparedes requested a review from Copilot April 15, 2026 16:31

Copilot started reviewing on behalf of egparedes April 15, 2026 16:33

Copilot AI reviewed Apr 15, 2026

3 comment threads on src/gt4py/next/instrumentation/gpu_profiler.py (all outdated)

Final cleanups (dc17ec3)

havogt approved these changes Apr 17, 2026

egparedes added 4 commits April 17, 2026 12:30

Allow nested start/stop profiling calls (4095e9f)
Fix docstrings (844f662)
Fix tests (948cd61)
Merge branch 'main' into feature/gpu-traces (4987495)

egparedes requested a review from Copilot April 17, 2026 14:39

Copilot started reviewing on behalf of egparedes April 17, 2026 14:40

Copilot AI reviewed Apr 17, 2026

egparedes commented Apr 17, 2026

1 comment thread on src/gt4py/next/instrumentation/gpu_profiler.py (outdated)

Apply suggestion from @egparedes (da1ef1b)

egparedes merged commit 75cdbdf into GridTools:main Apr 17, 2026
32 checks passed
