Add Darwin/macOS support for SSS and Tang pins #546
sini wants to merge 6 commits into latchset:master
clevis-encrypt-sss.c includes sys/epoll.h but never uses any epoll functions. This unused include prevents compilation on platforms that lack epoll (e.g., macOS/Darwin).
clevis-decrypt-sss.c used epoll to monitor child process output file descriptors. epoll is Linux-specific and prevents compilation on other POSIX platforms such as macOS/Darwin. Replace epoll with poll(), which is POSIX standard and functionally equivalent for the small number of file descriptors monitored here. The pollfds array is dynamically allocated to match the number of child processes.
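As a minimal sketch of the approach this commit describes (not the PR's actual code), the following shows a pollfd array sized to the number of child descriptors and a blocking wait on it. The helper names make_pollfds and wait_ready are hypothetical and do not appear in clevis.

```c
#include <errno.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical helper: allocate one pollfd per child output descriptor.
 * The caller owns the returned array and must free() it. */
static struct pollfd *
make_pollfds(const int *fds, size_t n)
{
    struct pollfd *pfds = calloc(n, sizeof(*pfds));
    if (!pfds)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN; /* wake up when the child writes output */
    }
    return pfds;
}

/* Hypothetical helper: block until some descriptor reports an event.
 * Returns the index of the first such descriptor, or -1 on error. */
static int
wait_ready(struct pollfd *pfds, size_t n)
{
    int r;

    do {
        r = poll(pfds, (nfds_t)n, -1); /* -1 means no timeout */
    } while (r < 0 && errno == EINTR);

    if (r <= 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (pfds[i].revents != 0)
            return (int)i;
    }
    return -1;
}
```

Unlike epoll, poll() takes the whole descriptor set on every call, which is a reasonable trade-off when only a handful of child pipes are being watched.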
pipe2() is a Linux-specific extension (requires _GNU_SOURCE) that atomically creates a pipe with flags. Replace it with the POSIX equivalent: pipe() followed by fcntl(F_SETFD, FD_CLOEXEC). The atomicity difference is irrelevant here since the program is single-threaded — there is no risk of a concurrent fork leaking file descriptors between the pipe() and fcntl() calls. This enables compilation on platforms that lack pipe2(), such as macOS/Darwin.
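The replacement can be sketched roughly as below. The helper name pipe_cloexec is hypothetical, and this version reads the existing descriptor flags with F_GETFD before setting FD_CLOEXEC, which is a small variation on the direct fcntl(F_SETFD, FD_CLOEXEC) call the commit message mentions.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: portable stand-in for pipe2(fds, O_CLOEXEC).
 * Not atomic: a fork() in another thread between pipe() and fcntl()
 * could inherit the descriptors, so this assumes a single thread. */
static int
pipe_cloexec(int fds[2])
{
    if (pipe(fds) < 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFD);
        if (flags < 0 || fcntl(fds[i], F_SETFD, flags | FD_CLOEXEC) < 0) {
            /* Do not leak the pipe if the flag cannot be set. */
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
}
```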
The LUKS test directory is included unconditionally by the parent
meson.build (only gated by cross-compilation), but all LUKS tests
require the cryptsetup binary. When cryptsetup is unavailable (e.g.,
on macOS/Darwin or minimal build environments), the build fails at
configure time.
Make cryptsetup optional and use subdir_done() to skip the entire
test directory when it is not found. This also avoids the
luksmeta_data.get('OLD_CRYPTSETUP') error that occurs when
libcryptsetup was not detected.
Move the jq find_program() call after the cryptsetup guard since
it is only used within LUKS tests.
I'll take a look and see if I can reproduce the timeout test failure locally. EDIT: I reproduced it and think I have the root cause -- working on the fix, moving this to draft until I've resolved it.
When a pin's file descriptor is closed after reading, it must be removed from the poll set. Unlike epoll (which automatically removes closed fds), poll() will continue to return events for closed fds.
Additionally, poll() can return POLLHUP/POLLERR/POLLNVAL when a child process exits or encounters errors. These events must be handled to avoid infinite loops, but only when there's no data to read (POLLIN not set) - otherwise we might discard valid data from a process that wrote output then exited.
This fix:
- Sets closed fds to -1 in the pollfds array (poll ignores negative fds)
- Handles error/hangup events by cleaning up failed pins, but only when POLLIN is not set, ensuring we read all available data first
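The fd lifecycle described in this commit message can be illustrated with a minimal read loop. This is a sketch under assumptions, not the code from clevis-decrypt-sss.c: the helper name drain_all is hypothetical, and it only counts bytes rather than parsing pin output.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: read from every descriptor in pfds until each one
 * is finished. Returns the total bytes read, or -1 if poll() fails. */
static long
drain_all(struct pollfd *pfds, size_t n)
{
    char buf[4096];
    long total = 0;
    size_t open_fds = 0;

    for (size_t i = 0; i < n; i++) {
        if (pfds[i].fd >= 0)
            open_fds++;
    }

    while (open_fds > 0) {
        if (poll(pfds, (nfds_t)n, -1) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        for (size_t i = 0; i < n; i++) {
            short re = pfds[i].revents;

            if (pfds[i].fd < 0 || re == 0)
                continue;

            /* Prefer POLLIN over hangup/error so that output written
             * just before the child exited is not discarded. */
            if (re & POLLIN) {
                ssize_t r = read(pfds[i].fd, buf, sizeof(buf));
                if (r > 0) {
                    total += r;
                    continue; /* more data may follow; poll again */
                }
                if (r < 0 && errno == EINTR)
                    continue;
            }

            /* Reached on EOF, a read error, or POLLHUP/POLLERR/POLLNVAL
             * without POLLIN: retire the descriptor. */
            if (!(re & POLLNVAL))
                close(pfds[i].fd);
            pfds[i].fd = -1; /* poll() ignores negative descriptors */
            open_fds--;
        }
    }
    return total;
}
```

Setting the entry to -1 keeps the array indices stable, so any per-pin bookkeeping indexed the same way stays valid without compacting the array.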
The epoll to poll conversion had a subtle bug with fd lifecycle management. When epoll monitors a closed fd, it automatically removes it from the set. Poll doesn't - it keeps returning POLLHUP indefinitely, causing an infinite loop.
Why it wasn't caught initially: I only tested on Darwin (which was the whole point of the port), where the Tang/SSS functionality worked fine for my use case. That was an oversight on my part, I've now verified the whole suite works on all tested platforms. The bug only surfaced when running the full Linux test suite, which exercised edge cases like child process failures and invalid configurations.
The fix:
Summary
This adds macOS/Darwin compatibility to the SSS and Tang pins by replacing
Linux-specific APIs with POSIX equivalents and making the LUKS test suite
gracefully skip when cryptsetup is unavailable.
Tested on macOS 26 (aarch64-darwin) — Tang and SSS-based encryption and
decryption both work correctly.
This may also help with support for #541 and #504
Changes
sss: remove unused sys/epoll.h include from clevis-encrypt-sss.c
This header was included but never used. Removing it fixes a compile error on Darwin where sys/epoll.h does not exist.
sss: replace Linux epoll with POSIX poll in clevis-decrypt-sss.c
epoll is Linux-specific. The SSS decrypt pin monitors a small number of child process file descriptors, so poll() is functionally equivalent and is available on all POSIX platforms.
sss: replace pipe2 with portable pipe + fcntl in sss.c
pipe2(fd, O_CLOEXEC) is Linux-specific. The replacement uses pipe() followed by fcntl(F_SETFD, FD_CLOEXEC) on both descriptors. This is safe because the call() function operates in a single-threaded context (the fork follows immediately), so there is no window for a descriptor leak.
luks: make cryptsetup optional in test suite
The LUKS test meson.build is included unconditionally by the parent build, but all tests require cryptsetup, which is unavailable on macOS. Changed find_program('cryptsetup', required: true) to required: false with an early subdir_done() so the build completes without cryptsetup. No test logic is altered: when cryptsetup is present, all tests run as before.
Motivation
Clevis is used in NixOS disk encryption workflows (Tang + SSS) to generate and
decrypt JWE-wrapped keys. Being able to run clevis encrypt tang and
clevis encrypt sss on macOS enables Darwin-based workstations to provision
NixOS hosts with encrypted disks without requiring a Linux VM.
Test plan
- clevis encrypt tang / clevis decrypt round-trip on macOS
- clevis encrypt sss / clevis decrypt round-trip on macOS
- cryptsetup is present)