Hermetic CUDA, CUDNN, NCCL and NVSHMEM overview

Hermetic CUDA/CUDNN/NCCL/NVSHMEM builds use specific downloadable redistribution versions instead of the user’s locally installed packages. Bazel downloads the CUDA, CUDNN, NCCL and NVSHMEM redistributions and then uses their libraries and tools as dependencies in various Bazel targets. This enables more reproducible builds for Google ML projects across the supported CUDA versions.

There are three types of hermetic toolkit configuration:

  1. Recommended: Repository rules use redistributions loaded from NVIDIA repositories.

    For full CUDA toolkit hermeticity, use CUDA User Mode Driver libraries loaded from NVIDIA repositories by setting --@cuda_driver//:include_cuda_umd_libs=true (see instructions).

  2. Repository rules use redistributions loaded from custom remote locations or local files.

    This option is recommended for testing custom/unreleased redistributions, or redistributions previously downloaded locally.

  3. Not recommended: Repository rules use locally-installed toolkits.

1) Standard redistributions loaded from NVIDIA repositories

Supported hermetic CUDA, CUDNN, NCCL, NVSHMEM versions

The supported CUDA versions are specified in the CUDA_REDIST_JSON_DICT dictionary in gpu/cuda/cuda_redist_versions.bzl.

The supported CUDNN versions are specified in the CUDNN_REDIST_JSON_DICT dictionary in gpu/cuda/cuda_redist_versions.bzl.

The supported NVSHMEM versions are specified in the NVSHMEM_REDIST_JSON_DICT dictionary in gpu/cuda/cuda_redist_versions.bzl.

The .bazelrc files of individual projects set the HERMETIC_CUDA_VERSION, HERMETIC_CUDNN_VERSION, HERMETIC_NCCL_VERSION and HERMETIC_NVSHMEM_VERSION environment variables to the versions used by default when --config=cuda is specified in the Bazel command options.

Environment variables controlling the hermetic CUDA/CUDNN/NCCL/NVSHMEM versions

The HERMETIC_CUDA_VERSION, HERMETIC_CUDNN_VERSION, HERMETIC_NVSHMEM_VERSION and HERMETIC_NCCL_VERSION environment variables should contain the major, minor and patch redistribution version, e.g. 12.8.0.
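The expected format can be sanity-checked before passing a value to Bazel. The helper below is a hypothetical sketch, not part of the rules:

```shell
# Hypothetical helper: verify a version string has the major.minor.patch shape
# expected by the HERMETIC_* environment variables.
is_full_version() {
  printf '%s' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

is_full_version "12.8.0" && echo "12.8.0: ok"
is_full_version "12.8" || echo "12.8: missing patch component"
```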

Three ways to set the environment variables for Bazel commands:

# Add an entry to your `.bazelrc` file
build:cuda --repo_env=HERMETIC_CUDA_VERSION="12.8.0"
build:cuda --repo_env=HERMETIC_CUDNN_VERSION="9.8.0"
build:cuda --repo_env=HERMETIC_NCCL_VERSION="2.27.7"
build:cuda --repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"

# OR pass it directly to your specific build command
bazel build --config=cuda <target> \
--repo_env=HERMETIC_CUDA_VERSION="12.8.0" \
--repo_env=HERMETIC_CUDNN_VERSION="9.8.0" \
--repo_env=HERMETIC_NCCL_VERSION="2.27.7" \
--repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"

# If .bazelrc doesn't have corresponding entries and the environment variables
# are not passed to bazel command, you can set them globally in your shell:
export HERMETIC_CUDA_VERSION="12.8.0"
export HERMETIC_CUDNN_VERSION="9.8.0"
export HERMETIC_NCCL_VERSION="2.27.7"
export HERMETIC_NVSHMEM_VERSION="3.2.5"

If HERMETIC_CUDA_VERSION and HERMETIC_CUDNN_VERSION are not present, the hermetic CUDA/CUDNN repository rules will look up the TF_CUDA_VERSION and TF_CUDNN_VERSION environment variable values. This is done for backward compatibility with the non-hermetic CUDA/CUDNN repository rules.
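That lookup order can be sketched in shell. The `resolve_version` helper below is purely illustrative (it is not the actual repository-rule code), but it mirrors the documented precedence: the HERMETIC_* variable wins, with the legacy TF_* variable as a fallback:

```shell
# Illustrative only: HERMETIC_* value takes precedence, legacy TF_* is the fallback.
resolve_version() {
  hermetic="$1"; legacy="$2"
  if [ -n "$hermetic" ]; then
    echo "$hermetic"
  else
    echo "$legacy"
  fi
}

resolve_version "12.8.0" ""    # HERMETIC_* set: prints 12.8.0
resolve_version "" "12.3.2"    # HERMETIC_* unset: falls back, prints 12.3.2
```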

The mapping between the CUDA version and the NCCL redistribution version to be downloaded is specified in gpu/cuda/cuda_redist_versions.bzl.

Configure hermetic CUDA, CUDNN and NCCL

  1. In the downstream project dependent on rules_ml_toolchain, add the following lines to the WORKSPACE file:

    register_toolchains("@rules_ml_toolchain//cc:linux_x86_64_linux_x86_64_cuda")
    register_toolchains("@rules_ml_toolchain//cc:linux_aarch64_linux_aarch64_cuda")
    
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_json_init_repository.bzl",
       "cuda_json_init_repository",
    )
    cuda_json_init_repository()
    
    load(
       "@cuda_redist_json//:distributions.bzl",
       "CUDA_REDISTRIBUTIONS",
       "CUDNN_REDISTRIBUTIONS",
    )
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
       "cuda_redist_init_repositories",
       "cudnn_redist_init_repository",
    )
    cuda_redist_init_repositories(
       cuda_redistributions = CUDA_REDISTRIBUTIONS,
    )
    cudnn_redist_init_repository(
       cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
    )
    
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_configure.bzl",
       "cuda_configure",
    )
    cuda_configure(name = "local_config_cuda")
    
    load(
       "@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
       "nccl_redist_init_repository",
    )
    nccl_redist_init_repository()
    
    load(
       "@rules_ml_toolchain//gpu/nccl:nccl_configure.bzl",
       "nccl_configure",
    )
    nccl_configure(name = "local_config_nccl")
    
  2. To enable CUDA, set the TF_NEED_CUDA environment variable and enable the flag --@rules_ml_toolchain//common:enable_cuda:

    build:cuda --repo_env TF_NEED_CUDA=1
    build:cuda --@rules_ml_toolchain//common:enable_cuda
    

    To use the Clang compiler for CUDA targets, set --@local_config_cuda//:cuda_compiler=clang; for the NVCC compiler, set --@local_config_cuda//:cuda_compiler=nvcc and the TF_NVCC_CLANG environment variable.

    build:build_cuda_with_clang --@local_config_cuda//:cuda_compiler=clang
    
    build:build_cuda_with_nvcc --action_env=TF_NVCC_CLANG="1"
    build:build_cuda_with_nvcc --@local_config_cuda//:cuda_compiler=nvcc
    
  3. To select specific versions of hermetic CUDA and CUDNN, set the HERMETIC_CUDA_VERSION and HERMETIC_CUDNN_VERSION environment variables respectively. Use only supported versions. You also need to specify the CUDA compute capabilities in HERMETIC_CUDA_COMPUTE_CAPABILITIES; these define the hardware features and supported instructions for each GPU architecture.

    You may set the environment variables directly in your shell or in the .bazelrc file as shown below:

    build:cuda --repo_env=HERMETIC_CUDA_VERSION="12.8.0"
    build:cuda --repo_env=HERMETIC_CUDNN_VERSION="9.8.0"
    build:cuda --repo_env=HERMETIC_CUDA_COMPUTE_CAPABILITIES="sm_50,sm_60,sm_70,sm_80,compute_90"
    
  4. To enable hermetic CUDA and NVSHMEM during test execution, or when running a binary via Bazel, make sure to add the --@local_config_cuda//cuda:include_cuda_libs=true flag to your Bazel command. It is recommended to enable this flag in all cases except when you release a binary or a wheel. You can provide it either directly in a shell or in .bazelrc:

    build:cuda --@local_config_cuda//cuda:include_cuda_libs=true
    

    The flag is needed to make sure that CUDA dependencies are properly provided to test executables. The flag is false by default to avoid unwanted coupling of Google-released Python wheels to CUDA binaries.

Configure hermetic CUDA User Mode Driver

The NVIDIA driver contains both the user mode CUDA driver (UMD) and the kernel mode driver (KMD) necessary to run applications. The hermetic CUDA toolchain includes hermetic UMD libraries.

The recommended approach is to enable complete CUDA hermeticity, including the CUDA UMD libraries. If this is not enabled, the linker will use the system-wide CUDA UMD libraries.

To enforce complete hermeticity and link in hermetic CUDA UMD, use the flag --@cuda_driver//:include_cuda_umd_libs. The default flag value is false.

You can provide it either directly in a shell or in .bazelrc:

test:cuda --@cuda_driver//:include_cuda_umd_libs=true

The version of the User Mode Driver is controlled by the environment variable HERMETIC_CUDA_UMD_VERSION. If it is not set, the version of the User Mode Driver will be the same as specified in HERMETIC_CUDA_VERSION.

For example, the combination of parameters below links NVIDIA driver version 13.0.0 when building the target with CUDA toolkit 12.9.0.

bazel build --repo_env=HERMETIC_CUDA_VERSION=12.9.0 \
  --repo_env=HERMETIC_CUDA_UMD_VERSION=13.0.0 \
  --@cuda_driver//:include_cuda_umd_libs=true \
  ... \
  -- \
  <target>

The UMD version should be compatible with the KMD and CUDA Runtime versions.

  • Supported Kernel Mode Driver and User Mode Driver version combinations:

    Driver versions combination | Is supported
    KMD > UMD                   | no
    KMD <= UMD                  | yes
  • UMD and CUDA Runtime versions compatibility is described in NVIDIA documentation.
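The table above boils down to a version comparison. The helper below is a hypothetical sketch (it assumes both versions use the same numbering scheme and relies on `sort -V` for version ordering); it is not part of the rules:

```shell
# Supported iff KMD <= UMD, per the compatibility table above.
umd_supported() {
  kmd="$1"; umd="$2"
  # The smaller of the two versions (version-aware sort) must be the KMD.
  [ "$(printf '%s\n%s\n' "$kmd" "$umd" | sort -V | head -n 1)" = "$kmd" ]
}

umd_supported "12.9.0" "13.0.0" && echo "KMD 12.9.0 + UMD 13.0.0: supported"
umd_supported "13.0.0" "12.9.0" || echo "KMD 13.0.0 + UMD 12.9.0: unsupported"
```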

Configure hermetic NVSHMEM

  1. In the downstream project dependent on rules_ml_toolchain, add the following lines to the WORKSPACE file:

    load(
     "@rules_ml_toolchain//gpu/nvshmem:nvshmem_json_init_repository.bzl",
     "nvshmem_json_init_repository",
    )
    nvshmem_json_init_repository()
    
    load(
       "@nvshmem_redist_json//:distributions.bzl",
       "NVSHMEM_REDISTRIBUTIONS",
    )
    load(
       "@rules_ml_toolchain//gpu/nvshmem:nvshmem_redist_init_repository.bzl",
       "nvshmem_redist_init_repository",
    )
    nvshmem_redist_init_repository(
       nvshmem_redistributions = NVSHMEM_REDISTRIBUTIONS,
    )
    
    
  2. To select a specific version of hermetic NCCL, set the HERMETIC_NCCL_VERSION environment variable. Use only supported versions. You may set the environment variable directly in your shell or in the .bazelrc file as shown below:

    build:cuda --repo_env=HERMETIC_NCCL_VERSION="2.27.7"
    
  3. To select a specific version of hermetic NVSHMEM, set the HERMETIC_NVSHMEM_VERSION environment variable. Use only supported versions. You may set the environment variable directly in your shell or in the .bazelrc file as shown below:

    build:cuda --repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"
    

Upgrade hermetic CUDA/CUDNN/NCCL/NVSHMEM version

  1. Create and submit a pull request with updated CUDA_REDIST_JSON_DICT, CUDNN_REDIST_JSON_DICT, NVSHMEM_REDIST_JSON_DICT dictionaries in gpu/cuda/cuda_redist_versions.bzl.

    Update CUDA_NCCL_WHEELS in gpu/cuda/cuda_redist_versions.bzl if needed.

    Update REDIST_VERSIONS_TO_BUILD_TEMPLATES in gpu/cuda/cuda_redist_versions.bzl if needed.

    Update PTX_VERSION_DICT in gpu/cuda/cuda_redist_versions.bzl if needed.

  2. For each Google ML project create a separate pull request with updated HERMETIC_CUDA_VERSION, HERMETIC_CUDNN_VERSION, HERMETIC_NCCL_VERSION, HERMETIC_NVSHMEM_VERSION in .bazelrc file.

    The PR presubmit job executions will launch bazel tests and download hermetic CUDA/CUDNN/NVSHMEM distributions. Verify that the presubmit jobs passed before submitting the PR.

  3. For build-time optimization, some build/test configurations utilize mirrored .tar redistributions. The JSON files with information about the mirrored .tar redistributions are uploaded some time after CUDA_REDIST_JSON_DICT, CUDNN_REDIST_JSON_DICT and NVSHMEM_REDIST_JSON_DICT are updated. You can download these files as follows:

    # CUDA
    wget "https://storage.googleapis.com/mirror.tensorflow.org/developer.download.nvidia.com/compute/cuda/redist/redistrib_<cuda_version>_tar.json"

    # CUDNN
    wget "https://storage.googleapis.com/mirror.tensorflow.org/developer.download.nvidia.com/compute/cudnn/redist/redistrib_<cudnn_version>_tar.json"

    # NVSHMEM
    wget "https://developer.download.nvidia.com/compute/nvshmem/redist/redistrib_<nvshmem_version>_tar.json"

    After that, create and submit a pull request with updated MIRRORED_TARS_CUDA_REDIST_JSON_DICT, MIRRORED_TARS_CUDNN_REDIST_JSON_DICT and MIRRORED_TARS_NVSHMEM_REDIST_JSON_DICT dictionaries in gpu/cuda/cuda_redist_versions.bzl.

2) Custom CUDA/CUDNN/NVSHMEM archives and NCCL wheels

There are three options that allow the use of custom distributions.

Custom CUDA/CUDNN/NVSHMEM redistribution JSON files

This option allows using custom distributions for all CUDA/CUDNN/NVSHMEM dependencies in Google ML projects.

The JSON files contain paths to individual redistributions for different OS architectures.

  1. Create cuda_redist.json and/or cudnn_redist.json and/or nvshmem_redist.json files.

    cuda_redist.json should follow the format below:

    {
       "cuda_cccl": {
          "linux-x86_64": {
             "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
             "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
             "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3"
          },
          "linux-sbsa": {
             "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
             "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
             "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3"
          }
       }
    }
    

    cudnn_redist.json should follow the format below:

    {
       "cudnn": {
          "linux-x86_64": {
             "cuda12": {
                "relative_path": "cudnn/linux-x86_64/cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz"
             }
          },
          "linux-sbsa": {
             "cuda12": {
                "relative_path": "cudnn/linux-sbsa/cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz"
             }
          }
       }
    }
    

    nvshmem_redist.json should follow the format below:

    {
       "libnvshmem": {
          "linux-x86_64": {
             "cuda12": {
                "relative_path": "libnvshmem/linux-x86_64/libnvshmem-linux-x86_64-3.2.5_cuda12-archive.tar.xz"
             }
          },
          "linux-sbsa": {
             "cuda12": {
                "relative_path": "libnvshmem/linux-sbsa/libnvshmem-linux-sbsa-3.2.5_cuda12-archive.tar.xz"
             }
          }
       }
    }
    

    Note that sha256 and strip_prefix are optional.

    full_path should be used for full URLs and for absolute local paths starting with file:///.

  2. In the downstream project dependent on rules_ml_toolchain, update the hermetic CUDA JSON repository call in the WORKSPACE file. Both web links and local file paths are allowed. Example:

    _CUDA_JSON_DICT = {
       "12.4.0": [
          "file:///home/user/Downloads/redistrib_12.4.0_updated.json",
       ],
    }
    
    _CUDNN_JSON_DICT = {
       "9.0.0": [
          "https://developer.download.nvidia.com/compute/cudnn/redist/redistrib_9.0.0.json",
       ],
    }
    
    cuda_json_init_repository(
       cuda_json_dict = _CUDA_JSON_DICT,
       cudnn_json_dict = _CUDNN_JSON_DICT,
    )
    
    _NVSHMEM_JSON_DICT = {
       "3.2.5": [
          "file:///home/user/Downloads/redistrib_3.2.5.json",
       ],
    }
    
    nvshmem_json_init_repository(
       nvshmem_json_dict = _NVSHMEM_JSON_DICT,
    )
    

    If JSON files contain relative paths to distributions, the path prefix should be updated in cuda_redist_init_repositories(), cudnn_redist_init_repository(), nvshmem_redist_init_repository() calls. Example:

    cuda_redist_init_repositories(
       cuda_redistributions = CUDA_REDISTRIBUTIONS,
       cuda_redist_path_prefix = "file:///usr/Downloads/dists/",
    )
    
    nvshmem_redist_init_repository(
       nvshmem_redistributions = NVSHMEM_REDISTRIBUTIONS,
       nvshmem_redist_path_prefix = "file:///usr/Downloads/dists/",
    )
    

Custom CUDA/CUDNN/NVSHMEM distributions

This option allows using custom distributions for some CUDA/CUDNN/NVSHMEM dependencies in Google ML projects.

  1. In the downstream project dependent on rules_ml_toolchain, create dictionaries with distribution paths. The dictionary with CUDA distributions should follow the format below:

    _CUSTOM_CUDA_REDISTRIBUTIONS = {
       "cuda_cccl": {
           "linux-x86_64": {
               "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
               "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
               "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
           },
           "linux-sbsa": {
               "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
               "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
               "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
           },
       },
    }
    

    The dictionary with CUDNN distributions should follow the format below:

    _CUSTOM_CUDNN_REDISTRIBUTIONS = {
       "cudnn": {
          "linux-x86_64": {
             "cuda12": {
             "relative_path": "cudnn/linux-x86_64/cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz",
             }
          },
          "linux-sbsa": {
             "cuda12": {
             "relative_path": "cudnn/linux-sbsa/cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz",
             }
          }
       }
    }
    

    The dictionary with NVSHMEM distributions should follow the format below:

    _CUSTOM_NVSHMEM_REDISTRIBUTIONS = {
       "libnvshmem": {
          "linux-x86_64": {
             "cuda12": {
             "relative_path": "libnvshmem/linux-x86_64/libnvshmem-linux-x86_64-3.2.5_cuda12-archive.tar.xz",
             }
          },
          "linux-sbsa": {
             "cuda12": {
             "relative_path": "libnvshmem/linux-sbsa/libnvshmem-linux-sbsa-3.2.5_cuda12-archive.tar.xz",
             }
          }
       }
    }
    

    Note that sha256 and strip_prefix are optional.

    full_path should be used for full URLs and for absolute local paths starting with file:///.

  2. In the same WORKSPACE file, pass the created dictionaries to the repository rule.

    If the dictionaries contain relative paths to distributions, the path prefix should be updated in cuda_redist_init_repositories(), cudnn_redist_init_repository() and nvshmem_redist_init_repository() calls.

    There is an option to customize the BUILD templates when the custom redistributions have a different folder structure than the default ones. Note that source_dirs is mandatory; it is used for the scenarios described here.

    If the templates for the scenarios above are different, you need to provide them in version_to_template under the local key.

    register_toolchains("@rules_ml_toolchain//cc:linux_x86_64_linux_x86_64_cuda")
    register_toolchains("@rules_ml_toolchain//cc:linux_aarch64_linux_aarch64_cuda")
    
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_json_init_repository.bzl",
       "cuda_json_init_repository",
    )
    cuda_json_init_repository()
    
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
       "cuda_redist_init_repositories",
       "cudnn_redist_init_repository",
    )
    
    _CCCL_BUILD_TEMPLATES = {
         "cuda_cccl": {
             "repo_name": "cuda_cccl",
             "version_to_template": {
                 "any": "@rules_ml_toolchain//gpu/cuda/build_templates:cuda_cccl_github.BUILD.tpl",
             },
             "local": {
                 "source_dirs": ["include", "lib"],
                 "local_path_env_var": "LOCAL_CCCL_PATH",
                 "version_to_template": {
                     "any": "@rules_ml_toolchain//gpu/cuda/build_templates:cuda_cccl.BUILD.tpl",
                 },
             },
         },
    }
    
    cuda_redist_init_repositories(
       cuda_redistributions = _CUSTOM_CUDA_REDISTRIBUTIONS,
       cuda_redist_path_prefix = "file:///home/usr/Downloads/dists/",
       redist_versions_to_build_templates = _CCCL_BUILD_TEMPLATES,
    )
    cudnn_redist_init_repository(
       cudnn_redistributions = _CUSTOM_CUDNN_REDISTRIBUTIONS,
       cudnn_redist_path_prefix = "file:///home/usr/Downloads/dists/cudnn/"
    )
    
    load(
       "@rules_ml_toolchain//gpu/cuda:cuda_configure.bzl",
       "cuda_configure",
    )
    cuda_configure(name = "local_config_cuda")
    
    load(
       "@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
       "nccl_redist_init_repository",
    )
    nccl_redist_init_repository()
    
    load(
       "@rules_ml_toolchain//gpu/nccl:nccl_configure.bzl",
       "nccl_configure",
    )
    nccl_configure(name = "local_config_nccl")
    
    load(
     "@rules_ml_toolchain//gpu/nvshmem:nvshmem_json_init_repository.bzl",
     "nvshmem_json_init_repository",
    )
    nvshmem_json_init_repository()
    
    load(
       "@rules_ml_toolchain//gpu/nvshmem:nvshmem_redist_init_repository.bzl",
       "nvshmem_redist_init_repository",
    )
    nvshmem_redist_init_repository(
       nvshmem_redistributions = _CUSTOM_NVSHMEM_REDISTRIBUTIONS,
       nvshmem_redist_path_prefix = "file:///home/usr/Downloads/dists/nvshmem/"
    )
    
    

Combination of the options above

In the example below, CUDA_REDIST_JSON_DICT is merged with custom JSON data in _CUDA_JSON_DICT, and CUDNN_REDIST_JSON_DICT is merged with _CUDNN_JSON_DICT.

The distribution data in _CUDA_DIST_DICT overrides the content of the resulting CUDA JSON file, and the distribution data in _CUDNN_DIST_DICT overrides the content of the resulting CUDNN JSON file. The NCCL wheel data is merged from CUDA_NCCL_WHEELS and _NCCL_WHEEL_DICT.

_CUDA_JSON_DICT = {
   "12.4.0": [
      "file:///usr/Downloads/redistrib_12.4.0_updated.json",
   ],
}

_CUDNN_JSON_DICT = {
   "9.0.0": [
      "https://developer.download.nvidia.com/compute/cudnn/redist/redistrib_9.0.0.json",
   ],
}

_CUDA_DIST_DICT = {
   "cuda_cccl": {
        "linux-x86_64": {
            "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
            "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
            "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
        },
        "linux-sbsa": {
            "full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
            "sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
            "strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
        },
    },
   "libcusolver": {
      "linux-x86_64": {
            "full_path": "file:///usr/Downloads/dists/libcusolver-linux-x86_64-11.6.0.99-archive.tar.xz",
      },
      "linux-sbsa": {
         "relative_path": "libcusolver-linux-sbsa-11.6.0.99-archive.tar.xz",
      },
   },
}

_CCCL_BUILD_TEMPLATES = {
    "cuda_cccl": {
        "repo_name": "cuda_cccl",
        "version_to_template": {
            "any": "@rules_ml_toolchain//gpu/cuda/build_templates:cuda_cccl_github.BUILD.tpl",
        },
        "local": {
            "source_dirs": ["include", "lib"],
            "local_path_env_var": "LOCAL_CCCL_PATH",
            "version_to_template": {
                "any": "@rules_ml_toolchain//gpu/cuda/build_templates:cuda_cccl.BUILD.tpl",
             },
        },
    },
}

_CUDNN_DIST_DICT = {
   "cudnn": {
      "linux-x86_64": {
            "cuda12": {
               "relative_path": "cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz",
            },
      },
      "linux-sbsa": {
            "cuda12": {
               "relative_path": "cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz",
            },
      },
   },
}

_NCCL_WHEEL_DICT = {
    "14": {
        "x86_64-unknown-linux-gnu": {
            "2.21.5": {
                "url": "https://files.pythonhosted.org/packages/ac/9a/8b6a28b3b87d5fddab0e92cd835339eb8fbddaa71ae67518c8c1b3d05bae/nvidia_nccl_cu11-2.21.5-py3-none-manylinux2014_x86_64.whl",
            },
        },
    },
}

load(
    "@rules_ml_toolchain//gpu/cuda:cuda_redist_versions.bzl",
    "CUDA_REDIST_PATH_PREFIX",
    "CUDA_NCCL_WHEELS",
    "CUDA_REDIST_JSON_DICT",
    "CUDNN_REDIST_PATH_PREFIX",
    "CUDNN_REDIST_JSON_DICT",
)
cuda_json_init_repository(
   cuda_json_dict = CUDA_REDIST_JSON_DICT | _CUDA_JSON_DICT,
   cudnn_json_dict = CUDNN_REDIST_JSON_DICT | _CUDNN_JSON_DICT,
)

load(
   "@cuda_redist_json//:distributions.bzl",
   "CUDA_REDISTRIBUTIONS",
   "CUDNN_REDISTRIBUTIONS",
)
load(
   "@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
   "cuda_redist_init_repositories",
   "cudnn_redist_init_repository",
)
load(
    "@rules_ml_toolchain//gpu/cuda:cuda_redist_versions.bzl",
    "REDIST_VERSIONS_TO_BUILD_TEMPLATES",
)
cuda_redist_init_repositories(
   cuda_redistributions = CUDA_REDISTRIBUTIONS | _CUDA_DIST_DICT,
   cuda_redist_path_prefix = "file:///usr/Downloads/dists/",
   redist_versions_to_build_templates = REDIST_VERSIONS_TO_BUILD_TEMPLATES | _CCCL_BUILD_TEMPLATES,
)
cudnn_redist_init_repository(
   cudnn_redistributions = CUDNN_REDISTRIBUTIONS | _CUDNN_DIST_DICT,
   cudnn_redist_path_prefix = "file:///usr/Downloads/dists/cudnn/"
)

load(
    "@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
    "nccl_redist_init_repository",
)
nccl_redist_init_repository(
   cuda_nccl_wheels = CUDA_NCCL_WHEELS | _NCCL_WHEEL_DICT,
)

3) Local toolkit installations used as sources for hermetic repositories

Warning

This feature is exclusively for developers, typically on NVIDIA teams, building both XLA/JAX and CUDA binaries. Other users should not use it.

You can use the local CUDA/CUDNN/NCCL/NVSHMEM paths as a source of redistributions. The following additional environment variables are required:

LOCAL_CUDA_PATH
LOCAL_CUDNN_PATH
LOCAL_NCCL_PATH
LOCAL_NVSHMEM_PATH

Example:

# Add an entry to your `.bazelrc` file
build:cuda --repo_env=LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda"
build:cuda --repo_env=LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn"
build:cuda --repo_env=LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl"
build:cuda --repo_env=LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"

# OR pass it directly to your specific build command
bazel build --config=cuda <target> \
--repo_env=LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda" \
--repo_env=LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn" \
--repo_env=LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl" \
--repo_env=LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"

# If .bazelrc doesn't have corresponding entries and the environment variables
# are not passed to bazel command, you can set them globally in your shell:
export LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda"
export LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn"
export LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl"
export LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"

The structure of the folders inside the CUDA dir should be the following (as if the archived redistributions were unpacked into one place):

<LOCAL_CUDA_PATH>/
    include/
    bin/
    lib/
    nvvm/

The structure of the folders inside CUDNN dir should be the following:

<LOCAL_CUDNN_PATH>
    include/
    lib/

The structure of the folders inside NCCL dir should be the following:

<LOCAL_NCCL_PATH>
    include/
    lib/

The structure of the folders inside NVSHMEM dir should be the following:

<LOCAL_NVSHMEM_PATH>
    include/
    lib/
    bin/
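As a quick sanity check before pointing the LOCAL_* variables at a directory, a sketch like the following verifies that it has the expected subfolders. The `check_layout` helper and the /tmp paths are hypothetical, for demonstration only:

```shell
# Hypothetical helper: verify that a local toolkit dir has the expected layout.
check_layout() {
  root="$1"; shift
  for d in "$@"; do
    [ -d "$root/$d" ] || { echo "missing: $root/$d"; return 1; }
  done
  echo "layout ok: $root"
}

# Demonstration only: build an empty skeleton matching the documented CUDA layout.
mkdir -p /tmp/demo_cuda/include /tmp/demo_cuda/bin /tmp/demo_cuda/lib /tmp/demo_cuda/nvvm
check_layout /tmp/demo_cuda include bin lib nvvm
# CUDNN and NCCL dirs need only include/ and lib/; NVSHMEM additionally needs bin/.
```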