Skip to content

fix(cloud-manager-client): replace adm-zip with zip-lib to preserve symlinks#1451

Open
rpapani wants to merge 4 commits intomainfrom
fix/cloud-manager-client-zip-lib-symlinks
Open

fix(cloud-manager-client): replace adm-zip with zip-lib to preserve symlinks#1451
rpapani wants to merge 4 commits intomainfrom
fix/cloud-manager-client-zip-lib-symlinks

Conversation

@rpapani
Copy link
Contributor

@rpapani rpapani commented Mar 19, 2026

Summary

  • Replace adm-zip with zip-lib for zip/unzip operations in the Cloud Manager client to preserve symlinks during code import

Problem

AEM Dispatcher repositories use symlinks in enabled_farms/ and enabled_vhosts/ directories to reference configs in available_farms/ and available_vhosts/. adm-zip's addLocalFolder internally uses fs.statSync() which follows symlinks -- when a symlink target can't be resolved (e.g. on Lambda's filesystem), it throws ENOENT and the entire code import fails. Even when symlinks do resolve, adm-zip dereferences them and stores the target file content, destroying the symlink structure. On extraction, symlinks are never restored -- adm-zip has no symlink support in either direction.

This breaks dispatcher functionality since the symlink structure (enabled_farms/*.farm -> ../available_farms/*.farm) is required for AEM dispatcher to work properly.

Solution

Replace adm-zip with zip-lib, which is the only Node.js ZIP library that properly handles symlinks in both directions:

  • Zipping: archiveFolder(path, zipFile, { followSymlinks: false }) stores symlinks as standard ZIP symlink entries (Unix external attributes) instead of dereferencing them
  • Unzipping: extract() restores symlinks as actual filesystem symlinks, and includes zip-slip path traversal protection via yauzl.validateFileName() + realpath checks

Since zip-lib is file-path-based (not buffer-based like adm-zip), both methods use a temp file as an intermediary -- the zip is written to disk, read into a Buffer, then the temp file is cleaned up. The public API (zipRepository returns Promise<Buffer>, unzipRepository takes Buffer) is unchanged.

Test plan

  • All 49 existing tests pass with updated mocks
  • 100% code coverage maintained
  • Lint passes
  • Smoke-tested against actual customer repo (Lexmark) with 21+ symlinks -- symlinks survive zip/unzip round-trip
  • Validated in spacecat-dev with a repository with symlinks, the zipping (s3://spacecat-dev-importer/code/85735c5a-37d0-4ef8-93b5-10260f2d2e5d/standard/138954/163968/main/repository.zip) worked fine as expected.

…ymlinks

adm-zip's addLocalFolder uses fs.statSync which follows symlinks and crashes
with ENOENT when targets can't be resolved on Lambda. Even when symlinks resolve,
adm-zip dereferences them, destroying the symlink structure required by AEM
dispatcher configs (enabled_farms/*.farm -> ../available_farms/*.farm).

zip-lib properly handles symlinks in both directions: archiveFolder with
followSymlinks: false stores them as standard ZIP symlink entries, and extract
restores them as actual filesystem symlinks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rpapani rpapani requested review from ramboz and solaris007 March 19, 2026 04:26
@solaris007 solaris007 added the bug Something isn't working label Mar 19, 2026
Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rpapani,

Strengths

  • Well-motivated fix for a real customer-impacting defect - AEM Dispatcher repos depend on symlink structure (enabled_farms/*.farm -> ../available_farms/*.farm), and adm-zip's fs.statSync() follows symlinks, causing ENOENT on Lambda and destroying symlink structure. Smoke-tested against Lexmark with 21+ symlinks.
  • Public API unchanged - zipRepository still returns Promise<Buffer>, unzipRepository still accepts Buffer. Transparent to all downstream consumers.
  • zip-lib (built on yauzl/yazl) provides stronger zip-slip protection than adm-zip - two-layer defense via validateFileName() + isOutsideTargetFolder() realpath checks. This is a security improvement.
  • Explicit { followSymlinks: false } (src/index.js:327) documents intent and protects against future default changes in the library.
  • Clean try/finally cleanup patterns in both methods, with tests verifying temp directory cleanup on both success and failure paths.
  • Tests explicitly verify the followSymlinks: false option is passed (test:~line 798), and the error cleanup test now verifies both temp directories are removed - an improvement over the old test which didn't check cleanup at all.

Issues

Important (Should Fix)

1. Temp directory leak if second mkdtempSync fails in unzipRepository
src/index.js:490-491

Two mkdtempSync calls happen before the try block:

const extractPath = mkdtempSync(...); // succeeds, creates dir on disk
const zipDir = mkdtempSync(...);      // if this throws, extractPath leaks

If /tmp is nearly full (plausible on Lambda with 512 MB default), the second call fails after the first succeeds. The error propagates without entering the try/catch, so neither cleanup path runs. This is a new resource leak path not present in the old code.

Suggested fix - move zipDir inside the try block:

async unzipRepository(zipBuffer) {
  const extractPath = mkdtempSync(path.join(os.tmpdir(), CLONE_DIR_PREFIX));
  let zipDir;
  try {
    zipDir = mkdtempSync(path.join(os.tmpdir(), ZIP_DIR_PREFIX));
    const zipFile = path.join(zipDir, 'repo.zip');
    writeFileSync(zipFile, zipBuffer);
    await extract(zipFile, extractPath);
    this.log.info(`Repository extracted to ${extractPath}`);
    return extractPath;
  } catch (error) {
    rmSync(extractPath, { recursive: true, force: true });
    throw new Error(`Failed to unzip repository: ${error.message}`);
  } finally {
    if (zipDir) rmSync(zipDir, { recursive: true, force: true });
  }
}

2. Symlink targets not validated after extraction
src/index.js:496

zip-lib's extract() restores symlinks but does not validate symlink targets at creation time. A symlink entry like evil -> /etc/shadow would be created unconditionally. While zip-lib validates that no subsequent file write traverses through a previously extracted symlink, the symlink itself can point outside the extraction directory.

Exploitability is low-medium - the zip comes from S3 where it was written by zipRepository() from a Cloud Manager git clone, so the attacker would need to be a repo contributor who plants a malicious symlink among the legitimate dispatcher symlinks. On Lambda, blast radius is limited to /tmp contents and environment variables.

Suggestion: Add a post-extraction validation that checks all symlinks resolve within the extraction directory. This could also be tracked as a fast-follow if you want to ship the functional fix first.

Minor (Nice to Have)

3. Missing /tmp disk usage logging for zip operations
The class already calls #logTmpDiskUsage() after clone/push/pull, but neither zipRepository nor unzipRepository calls it. Since the new implementation writes intermediate files to /tmp (unlike the old in-memory approach), adding logging after archiveFolder and extract would provide operational visibility into the changed resource profile.

4. extract stub arguments not verified in unzip test
The test checks extractStub was called once but doesn't verify the source zip path or destination path it was called with. Compare with the zip test which properly checks all three archiveFolderStub arguments. A path construction bug would go undetected.

5. Unrelated package-lock.json changes
The diff adds "peer": true to 15+ unrelated packages and bumps versions of spacecat-shared-data-access and spacecat-shared-utils. These appear to be side effects of npm install on a different base. Not a correctness issue, but inflates the diff.

6. Inconsistent error wrapping between zip/unzip
unzipRepository wraps errors in a user-friendly message (Failed to unzip repository: ...), but zipRepository lets raw errors propagate. Minor inconsistency.

Recommendations

  • Add a brief inline comment in both methods explaining why the temp file intermediary is needed (zip-lib is path-based, not buffer-based). Two years from now, someone will wonder why the code writes to disk just to read back into a buffer.
  • Consider filing an issue on zip-lib requesting symlink target validation in its extract() path.

Assessment

Ready to merge? With fixes

The dependency swap is well-justified, the implementation is clean, and the new library is a security improvement over adm-zip. The temp directory leak (issue #1) is a small structural fix worth addressing before merge. The symlink target validation (issue #2) is recommended defense-in-depth that could be addressed here or as a fast-follow. Everything else is polish.

Next Steps

  1. Fix the temp dir leak in unzipRepository (issue #1) - small structural change.
  2. Consider symlink target validation (issue #2) - this PR or fast-follow.
  3. Minor items are optional improvements.

*/
async unzipRepository(zipBuffer) {
const extractPath = mkdtempSync(path.join(os.tmpdir(), CLONE_DIR_PREFIX));
const zipDir = mkdtempSync(path.join(os.tmpdir(), ZIP_DIR_PREFIX));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temp dir leak: If this mkdtempSync throws (e.g., /tmp full), extractPath created on the line above leaks - the error propagates without entering the try/catch. Consider moving this inside the try block with let zipDir declared above.

const zip = new AdmZip(zipBuffer);
zip.extractAllTo(extractPath, true);
writeFileSync(zipFile, zipBuffer);
await extract(zipFile, extractPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symlink targets not validated: extract() restores symlinks but does not validate their targets. A malicious symlink like evil -> /etc/shadow would be created unconditionally. Low-medium risk given the trusted S3 source, but worth a post-extraction validation or a fast-follow.

…temp dir leak, and hardening

- Add #validateSymlinks() that recursively checks all symlinks resolve within
  the repository root, called in both zipRepository (before archiving) and
  unzipRepository (after extraction) for defense-in-depth
- Fix temp directory leak in unzipRepository when second mkdtempSync fails
  by moving zipDir creation inside the try block
- Add #logTmpDiskUsage() calls to zip/unzip for operational visibility
- Wrap zipRepository errors consistently with unzipRepository pattern
- Verify extract stub arguments in tests
- Add inline comments explaining why temp file intermediary is needed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

This PR will trigger a patch release when merged.

@rpapani rpapani requested a review from solaris007 March 19, 2026 14:03
@rpapani
Copy link
Contributor Author

rpapani commented Mar 19, 2026

@solaris007 I've incorporated all the feedback, can you please review again. Thank you!!

Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rpapani,

Strengths

Previously flagged issues now addressed (all 6/6):

  • Temp dir leak - Fixed. zipDir moved inside the try block with let zipDir pattern and if (zipDir) guard in finally. New ENOSPC test verifies extractPath cleanup when the second mkdtempSync fails.
  • Symlink target validation - Fixed. New #validateSymlinks method recursively checks all symlinks resolve within the repository root. Called pre-zip and post-extract for defense-in-depth. The containment check is sound - uses readlinkSync (raw target, no follow) + path.resolve (lexical normalization) + path.relative (containment). Correctly handles absolute targets, relative .. traversal, dangling symlinks, and chained symlinks.
  • Disk usage logging - Fixed. #logTmpDiskUsage('zip') and #logTmpDiskUsage('unzip') added.
  • Extract stub args not verified - Fixed. Tests now assert both source and destination paths.
  • Inconsistent error wrapping - Fixed. zipRepository now wraps errors as Failed to zip repository: ..., matching unzipRepository.
  • Inline comments - Fixed. Both methods document why the temp file round-trip is needed.

Test coverage is thorough - four new tests cover ENOSPC partial failure, absolute symlink escape, relative dispatcher-style symlinks (the real-world case), and pre-zip escape detection.

Assessment

Ready to merge? Yes

All prior findings are resolved cleanly. The #validateSymlinks implementation is correct, well-placed, and well-tested. The containment check conservatively uses lexical resolution (safe direction - false positives, not false negatives). Temp file lifecycle is tight. CI is green. Ship it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants