Integration test test_dirty_pages_force_purge fails in Altinity CI environment #1369

@CarlosFelipeOR

I checked the Altinity Stable Builds lifecycle table, and the Altinity Stable Build version I'm using is still supported.

Type of problem

Bug report – CI/Test environment issue


Describe the situation

The integration test test_dirty_pages_force_purge/test.py::test_dirty_pages_force_purge
was introduced in 25.8.16 as part of upstream PR ClickHouse#93500 (“Threshold for dirty allocator pages”).

The test passes in upstream ClickHouse CI but fails on every run in Altinity CI, including reruns. This strongly suggests that the failure is specific to the Altinity CI environment rather than to the code under test.

First observed in: PR #1364 (25.8.16 merge)


How to reproduce the behavior

Run the integration test in Altinity CI:

test_dirty_pages_force_purge/test.py::test_dirty_pages_force_purge

The test fails consistently with:

RuntimeError: Failed to find peak memory counter

Expected behavior

The test should either:

  1. Pass if cgroup memory peak counters are available
  2. Skip gracefully if cgroup memory peak counters are not available in the environment

Actual behavior

The test fails at line 64 with:

RuntimeError: Failed to find peak memory counter

The test attempts to read peak memory usage from one of these cgroup files:

  • /sys/fs/cgroup/memory/memory.max_usage_in_bytes (cgroup v1)
  • /sys/fs/cgroup/memory.peak (cgroup v2)

Neither file exists in the Altinity CI Docker container environment.
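To confirm which counters a given container exposes, a small probe like the following could be run inside it. This is a minimal sketch, not part of the test suite: the helper name probe_peak_counters is hypothetical, and the two paths are copied from the list above. The cgroup.controllers check relies on that file existing only under a cgroup v2 mount.

```python
import os

# Paths the test probes, as listed above.
PEAK_MEMORY_COUNTER_PATHS = [
    "/sys/fs/cgroup/memory/memory.max_usage_in_bytes",  # cgroup v1
    "/sys/fs/cgroup/memory.peak",  # cgroup v2
]


def probe_peak_counters(paths=PEAK_MEMORY_COUNTER_PATHS, exists=os.path.exists):
    """Map each candidate path to True/False depending on whether it exists.

    `exists` is injectable so the function can be exercised without a real cgroup mount.
    """
    return {path: exists(path) for path in paths}


if __name__ == "__main__":
    # cgroup.controllers is present only when /sys/fs/cgroup is a cgroup v2 mount.
    print("cgroup v2 mounted:", os.path.exists("/sys/fs/cgroup/cgroup.controllers"))
    for path, found in probe_peak_counters().items():
        print(f"{path}: {'present' if found else 'missing'}")
```

If both paths are reported missing, the container hits the failure described in this issue.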


Root cause analysis

The cgroup v2 file memory.peak was introduced in Linux kernel 5.19 (released July 31, 2022), so it is only available on kernel 5.19 and later.
If the Altinity CI host kernel is older, or if Docker is not exposing the memory controller correctly, these files will not be available.
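A quick way to check the kernel hypothesis is to compare the running kernel release against 5.19. The sketch below is illustrative only: kernel_at_least is a hypothetical helper, and the version number is just a hint, since distribution kernels may backport features. Containers share the host kernel, so running it inside the container reports the host's release.

```python
import platform
import re


def kernel_at_least(release: str, minimum=(5, 19)) -> bool:
    """Compare the leading major.minor of a kernel release string against `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"Unrecognized kernel release string: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    release = platform.release()
    print(f"kernel {release}: new enough for memory.peak = {kernel_at_least(release)}")
```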

The test code treats a missing counter as a hard failure: when neither cgroup version provides a peak memory counter, the loop falls through to its else branch and raises:

for path in PEAK_MEMORY_COUNTER_PATHS:
    try:
        peak_memory = int(node.exec_in_container(["cat", path]))
        break
    except Exception as ex:
        if not str(ex).lower().strip().endswith("no such file or directory"):
            raise
else:
    raise RuntimeError("Failed to find peak memory counter")

As a result, the test fails instead of skipping.
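One way to separate "counter not available" from a real error is to move the lookup into a helper that returns None when every path is missing, leaving the skip decision to the caller. The sketch below is an assumption about how that could look, not the upstream code: read_peak_memory and its read_file parameter are hypothetical names, and the error-message check is copied from the loop above.

```python
from typing import Callable, Iterable, Optional

PEAK_MEMORY_COUNTER_PATHS = [
    "/sys/fs/cgroup/memory/memory.max_usage_in_bytes",  # cgroup v1
    "/sys/fs/cgroup/memory.peak",  # cgroup v2
]


def read_peak_memory(
    read_file: Callable[[str], str],
    paths: Iterable[str] = PEAK_MEMORY_COUNTER_PATHS,
) -> Optional[int]:
    """Return the first readable peak counter as an int, or None if no path exists.

    Errors other than a missing file are re-raised, matching the original loop.
    """
    for path in paths:
        try:
            return int(read_file(path))
        except Exception as ex:
            if not str(ex).lower().strip().endswith("no such file or directory"):
                raise
    return None


# Intended use inside the test (node and pytest come from the test module):
#
#     peak_memory = read_peak_memory(lambda p: node.exec_in_container(["cat", p]))
#     if peak_memory is None:
#         pytest.skip("Peak memory counter not available in this environment")
```

Passing the reader as a callable keeps the helper independent of the cluster fixture, so the skip path can be checked without a running container.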


Logs, error messages, stacktraces

From CI job Integration tests (amd_binary, 5/5):

_________________________ test_dirty_pages_force_purge _________________________
[gw2] linux -- Python 3.10.12 /usr/bin/python3
start_cluster = <helpers.cluster.ClickHouseCluster object at 0x...>

    def test_dirty_pages_force_purge(start_cluster):
        if node.is_built_with_sanitizer():
            pytest.skip("Jemalloc disabled in sanitizer builds")

        purges = ""
        for _ in range(100):
            node.query("""
                SELECT arrayMap(x -> randomPrintableASCII(40), range(4096))
                FROM numbers(2048)
                FORMAT Null
            """)

            purges = node.query(
                "SELECT value from system.events where event = 'MemoryAllocatorPurge'"
            )
            if purges:
                break

            time.sleep(0.2)

        if not purges:
            raise TimeoutError("Timed out waiting for MemoryAllocatorPurge event")

        for path in PEAK_MEMORY_COUNTER_PATHS:
            try:
                peak_memory = int(node.exec_in_container(["cat", path]))
                break
            except Exception as ex:
                if not str(ex).lower().strip().endswith("no such file or directory"):
                    raise
        else:
>           raise RuntimeError("Failed to find peak memory counter")
E           RuntimeError: Failed to find peak memory counter

test_dirty_pages_force_purge/test.py:64

Possible solutions

  1. Update Altinity CI environment
    Ensure the host kernel and container configuration expose cgroup memory peak counters
    (kernel 5.19+ required for cgroup v2 memory.peak)

  2. Modify the test upstream to skip gracefully when counters are unavailable:

    else:
        pytest.skip("Peak memory counter not available in this environment")

  3. Temporarily skip the test in Altinity CI until the environment is updated


Additional context
