Conversation
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace |
|
New job on instance
edit: job exceeded its 1-day time limit. According to https://gist.github.com/boegelbot/b64a7290ab9a66973b6aed13ec38a1dd, this could take ~2 days. |
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2 |
|
New job on instance
|
|
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2 |
|
New job on instance
|
I'm seeing lots of errors that look like the following one (i.e. with [...]). @Flamefire, have you seen such errors before by any chance? |
|
Maybe this is due to having a too-new glibc, according to conda-forge/pytorch-cpu-feedstock#350 (comment)? |
Is /tmp mounted with [...]?
What is really strange: the dlopen error leads to a tensor comparison error, which doesn't make sense to me.
You could also try with 2.7.1: easybuilders/easybuild-easyconfigs#23923 |
I checked in our build container, but it doesn't seem to use that mount option |
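For reference, the options a mount point carries can be read from /proc/mounts on Linux. The following is a minimal sketch, not the check that was actually run in the build container; the function name `mount_options` and the sample text in the usage note are hypothetical.

```python
def mount_options(mounts_text, mount_point):
    """Return the list of mount options for mount_point, or None if it is not a mount.

    mounts_text is expected to be in /proc/mounts format:
    device, mount point, filesystem type, comma-separated options, dump, pass.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point shadows an earlier one.
            options = fields[3].split(",")
    return options
```

On a live system one could call `mount_options(open("/proc/mounts").read(), "/tmp")`. A `None` result means /tmp is not a separate mount, in which case it inherits the options of its parent filesystem.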
|
Also see a lot of these: |
That will require some more work, e.g. #1278 needs to be deployed first. We don't have a CPU-only version of 2.7.1 as far as I can see? |
This could be the culprit: I see some Google results suggesting "cannot enable executable stack as shared object requires: Invalid argument" can be fixed with
But isn't |
I stopped creating a CPU-only version after users complained that the "strong(ly) GPU-accelerated" module doesn't support GPUs at all. |
I also found similar results, and when searching in the PyTorch repo I also found a commit that adds that flag here:
We filter binutils in EESSI (https://github.com/EESSI/software-layer-scripts/blob/main/EESSI-extend-easybuild.eb#L48), so in that sense it's correct that it's picking up this
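As background for the "cannot enable executable stack" error quoted above: whether a shared object requests an executable stack is recorded in the flags of its PT_GNU_STACK program header. The following is a minimal sketch for inspecting that header, assuming a 64-bit ELF file; the function names are hypothetical and this is not a tool used in this PR.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that records stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def gnu_stack_flags(data):
    """Return p_flags of the PT_GNU_STACK header in a 64-bit ELF image, or None if absent."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2:  # EI_CLASS: 2 means ELFCLASS64
        raise ValueError("only 64-bit ELF is handled in this sketch")
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 means little-endian
    (e_phoff,) = struct.unpack_from(endian + "Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
    for i in range(e_phnum):
        # In ELF64 program headers, p_type and p_flags are the first two 32-bit fields.
        p_type, p_flags = struct.unpack_from(endian + "II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return p_flags
    return None


def requests_executable_stack(data):
    """True/False if PT_GNU_STACK is present, None if the header is missing."""
    flags = gnu_stack_flags(data)
    return None if flags is None else bool(flags & PF_X)
```

A usage sketch: `requests_executable_stack(open(path, "rb").read())`, where `path` is a hypothetical path to one of the shared objects named in the error. `True` means the object asks for an executable stack; `None` means it carries no PT_GNU_STACK header at all.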
That definitely makes sense! I wanted to try the CPU-only version first, as I imagined it would cause fewer build issues 😅. But then I'll wait until the CUDA for 2025.06 is ingested, and will then give 2.7.1 a try. |
|
I'm closing this one, will try to work towards 2.7.1 instead (see #1374). |
This will be fun.