show expected and problematic output produced by deviceQuery in GPU docs by boegel · Pull Request #139 · EESSI/docs

boegel · 2023-12-21T19:58:19Z

showing output in case it doesn't work is useful for searching purposes...

ocaisa · 2023-12-21T20:55:15Z

docs/gpu.md

+...
+```
+
+If the `deviceQuery` command can not access your GPU, you will see an error message like:


This shouldn't actually happen, though. Because of the Lmod guards, the only scenario I can see where you would reach this is when you are using a container and the system drivers are too old.

I triggered it by cleaning out the host_injections directory after loading the module.

I agree it's very unlikely that it happens, but we should mention it in the docs regardless, if only to let people easily find this page when searching for error messages.

My concern is that the placement makes it seem likely that the command will fail, when reaching this message is actually very unlikely.

Maybe a little box titled "What does it look like if the command fails?"
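For readers who would rather check the result in a script than read it by eye, here is a minimal sketch that classifies saved `deviceQuery` output. The `Result = FAIL` line comes from the error output quoted in this PR. The `Result = PASS` line is an assumption about what the CUDA sample prints on success, and `classify_devicequery` is a hypothetical helper name.

```shell
# Sketch only: classify deviceQuery output read from stdin.
# Prints "pass", "fail", or "unknown".
# "Result = PASS" is assumed to be the success marker.
classify_devicequery() {
    output=$(cat)
    case "$output" in
        *"Result = PASS"*) echo "pass" ;;
        *"Result = FAIL"*) echo "fail" ;;
        *) echo "unknown" ;;
    esac
}

# Example call (requires deviceQuery on your PATH):
# deviceQuery 2>&1 | classify_devicequery
```

Matching on the final `Result = ...` line rather than on the full error text keeps the check independent of which CUDA error was returned.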

casparvl · 2023-12-22T13:20:17Z

docs/gpu.md

@@ -152,10 +152,32 @@ The only scenario where this would be required is if `$LD_LIBRARY_PATH` is modif

 ### Testing the GPU support {: #gpu_cuda_testing }


Currently, this only covers testing whether you can run CUDA-enabled software from EESSI. Maybe we can also include a short instruction for testing whether building new CUDA software on top of EESSI works properly. Something like this:
First, create a file `hello_cuda.cu` with the contents

    #include <stdio.h>

    // Kernel: runs on the GPU and prints a greeting.
    __global__ void helloCUDA() {
        printf("Hello, CUDA!\n");
    }

    int main() {
        helloCUDA<<<1, 1>>>();    // launch one block with one thread
        cudaDeviceSynchronize();  // wait for the kernel so its output is flushed
        return 0;
    }

Then

    module load CUDA/<some_version>
    nvcc hello_cuda.cu -o hello_cuda
    chmod u+x hello_cuda
    ./hello_cuda

And mention they should test this for each version of CUDA they installed in `host_injections`.
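The "test each installed CUDA version" step could be scripted roughly as below. This is a sketch under assumptions, not part of the proposed docs: it assumes the Lmod `module` command and `nvcc` are available, that `hello_cuda.cu` is in the current directory, and that the caller passes the version strings. `test_cuda_versions` is a hypothetical helper name.

```shell
# Sketch only: build and run hello_cuda.cu once per CUDA version.
# Assumes `module` (Lmod) and `nvcc` are available and that
# hello_cuda.cu is in the current directory.
# test_cuda_versions is a hypothetical helper, not part of EESSI.
test_cuda_versions() {
    failed=0
    for version in "$@"; do
        if module load "CUDA/${version}" &&
            nvcc hello_cuda.cu -o "hello_cuda_${version}" &&
            "./hello_cuda_${version}"; then
            echo "CUDA/${version}: OK"
        else
            echo "CUDA/${version}: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example call; replace the placeholders with the versions you installed:
# test_cuda_versions <version_1> <version_2>
```

The function returns a non-zero status if any version fails to load, compile, or run, so it can be used as a single check in a larger script.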

Makes sense, but that should be done in a separate PR?

If you want, sure. I won't block this one over it :) Although I would consider it to be an integral part of "Testing the GPU support" to be honest :)

I don't see it as that integral if we are focused on software consumers; it's only integral if you want to do development-type work.

ocaisa · 2023-12-28T18:49:50Z

docs/gpu.md

+If the `deviceQuery` command can not access your GPU, you will see an error message like:
+```
+cudaGetDeviceCount returned 35
+-> CUDA driver version is insufficient for CUDA runtime version
+Result = FAIL
+```
+```


Suggested change

If the `deviceQuery` command can not access your GPU, you will see an error message like:
```
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
```
```

!!! note "What if the `deviceQuery` command fails?"
    If the `deviceQuery` command cannot access your GPU, you will see an error message like:
    ```
    cudaGetDeviceCount returned 35
    -> CUDA driver version is insufficient for CUDA runtime version
    Result = FAIL
    ```

show expected and problematic output produced by deviceQuery in GPU docs

9ba59fa

boegel added the `documentation` (Improvements or additions to documentation) and `enhancement` (New feature or request) labels on Dec 21, 2023

ocaisa reviewed Dec 21, 2023

casparvl reviewed Dec 22, 2023

ocaisa reviewed Dec 28, 2023

boegel wants to merge 1 commit into EESSI:main from boegel:gpu_deviceQuery_output
