How to fix this error: errno: 97 - Address family not supported by protocol #49

@tianhao909


python /app/software1/vidur/vidur/profiling/collectives/main.py \
    --num_workers_per_node_combinations 2 \
    --collective send_recv
2025-02-14 07:59:21,469 INFO worker.py:1821 -- Started a local Ray instance.
0%| | 0/994 [00:00<?, ?it/s]
(BenchmarkRunner pid=10922) INFO 02-14 07:59:30 benchmark_runner.py:83] Initializing gpu id: 1, Rank: 1, num_workers: 2, comm_id: 50704, devices_per_node: 2, max_devices_per_node: 8, ip_addr: 33.254.158.199, CUDA_VISIBLE_DEVICES: 1
(BenchmarkRunner pid=10922) [W socket.cpp:697] [c10d] The client socket cannot be initialized to connect to [x08j01357.cloud.sqa.na131.tbsite.net]:50704 (errno: 97 - Address family not supported by protocol).
(BenchmarkRunner pid=10896) [W socket.cpp:464] [c10d] The server socket cannot be initialized on [::]:50704 (errno: 97 - Address family not supported by protocol).
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 994/994 [02:25<00:00, 6.84it/s]
(BenchmarkRunner pid=10896) INFO 02-14 07:59:30 benchmark_runner.py:83] Initializing gpu id: 0, Rank: 0, num_workers: 2, comm_id: 50704, devices_per_node: 2, max_devices_per_node: 8, ip_addr: 33.254.158.199, CUDA_VISIBLE_DEVICES: 0
(BenchmarkRunner pid=10896) [W socket.cpp:697] [c10d] The client socket cannot be initialized to connect to [x08j01357.cloud.sqa.na131.tbsite.net]:50704 (errno: 97 - Address family not supported by protocol).
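
On Linux, errno 97 is EAFNOSUPPORT, and the server-side message refers to the IPv6 wildcard address `[::]`, so one thing worth checking is whether this host can create IPv6 sockets at all. The lines above are tagged `[W` (warnings), and the progress bar still reaches 994/994. The probe below is only a diagnostic sketch under that assumption, not part of vidur or PyTorch; the helper name `probe_address_family` is hypothetical.

```python
import errno
import socket


def probe_address_family(family, host, port=0):
    """Try to create and bind a TCP socket of the given address family.

    Returns (True, None) on success, or (False, errno_name) on failure.
    Port 0 lets the OS pick any free port, so the probe does not collide
    with a port that is already in use.
    """
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True, None
    except OSError as exc:
        # Map the numeric errno (e.g. 97) to its symbolic name when known.
        return False, errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    print("IPv4:", probe_address_family(socket.AF_INET, "0.0.0.0"))
    print("IPv6:", probe_address_family(socket.AF_INET6, "::"))
```

If the IPv6 line reports `EAFNOSUPPORT` while the IPv4 line succeeds, IPv6 is unavailable in this environment (for example, disabled in the kernel or the container), which would be consistent with the warnings above.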
