Skip to content

Tango: post-sanitisation device-class-name collision when ids differ only in '-' vs '_' #371

@gilesknap

Description

@gilesknap

Found while reviewing PR #360 commit 207d68d8 (slice #357).

validate_tango_id accepts [A-Za-z0-9_-]+ — both hyphens and underscores are allowed. tango_dev_class_name then collapses hyphens to underscores so the result is a valid Python identifier:

# transports/tango/util.py
def tango_dev_class_name(id: str) -> str:
    sanitized = id.replace("-", "_")
    if sanitized[0].isdigit():
        sanitized = "X" + sanitized
    return sanitized

The problem: two distinct, individually-valid ids "DEV-1" and "DEV_1" both sanitise to Python class name "DEV_1". If a user declares both in controllers:, TangoDSR._create_device is called twice, each type() call returns a distinct Python class object — but both share __name__ == "DEV_1". tango.server.run(tuple(self._devices), [FASTCS_TANGO_SERVER_NAME, ...]) then registers two Tango classes under the same name; whichever is created second silently overrides the first in any class-keyed lookup at the Tango DB.

The launch-framework's controllers: dict can't catch this — the dict keys "DEV-1" and "DEV_1" are distinct strings, so ruamel's duplicate-key detection doesn't fire. The collision only emerges after the Tango-specific sanitisation step, so it must be detected at the transport boundary.

The existing two-controller test in tests/test_multi_controller.py::test_tango_transport_builds_one_device_per_controller_with_id_in_name uses "ALPHA"/"BETA" which sanitise distinctly, so this case isn't covered.

Suggested fix

Detect the post-sanitisation clash inside TangoTransport.connect (or TangoDSR.__init__) and raise with the colliding inputs called out:

seen: dict[str, str] = {}
for api in controller_apis:
    id = api.path[0]
    validate_tango_id(id)
    cls_name = tango_dev_class_name(id)
    if cls_name in seen:
        raise ValueError(
            f"Controller ids {seen[cls_name]!r} and {id!r} both sanitise to "
            f"Tango device-class name {cls_name!r}; pick one variant "
            f"(hyphens and underscores are not distinguishable in Tango class names)"
        )
    seen[cls_name] = id

Add a regression test in tests/test_multi_controller.py that declares ["DEV-1", "DEV_1"] and asserts the fail-fast message.

Alternative

Tighten validate_tango_id to reject - outright. That makes Tango a slightly more restrictive lowest-common-denominator alongside GraphQL (which already disallows hyphens), and the per-transport charset table in multiple-transports.md would need updating. This is a more aggressive cut — the post-sanitisation check is preferred unless we want the docs/charset table to be cleaner.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions