proposal: bytecode-only script execution via vm module (sourceless mode)

## Summary

Add an opt-in `sourceless` mode to `vm.Script`, `vm.compileFunction()`, and `vm.SourceTextModule` that allows executing V8 bytecode without requiring the original JavaScript source code. This enables deployment scenarios where only pre-compiled bytecode is shipped, reducing bundle sizes and improving startup performance for packaged applications.

The feature would be gated behind `--experimental-vm-sourceless` and exposed as a `sourceless` option across the `vm` module APIs.

## Motivation

### The problem today

Several real-world deployment scenarios benefit from shipping pre-compiled V8 bytecode instead of (or alongside) source code:

1. **Single Executable Applications (SEA):** Node.js already supports embedding V8 code cache in SEAs via `useCodeCache: true`. However, the source must still be embedded alongside the bytecode, doubling the payload size for no runtime benefit when the code cache is valid.

2. **Edge/IoT deployments:** Devices with constrained storage and slow I/O benefit from shipping only bytecode — smaller binaries, no parse overhead, faster cold starts.

3. **Startup performance at scale:** Bundlers like `esbuild` and `ncc` already solve single-artifact distribution, but the resulting bundle still needs to be parsed and compiled by V8 on every cold start. For large applications (e.g., serverless functions, CLI tools, microservices), parse time can dominate startup. Pre-compiled bytecode eliminates the parse and compile phases entirely — V8 deserializes bytecode directly into memory, which is significantly faster than processing equivalent source.

4. **Application packagers:** Tools like [`@yao-pkg/pkg`](https://github.com/yao-pkg/pkg) currently maintain ~100 lines of custom V8 modifications across 8 files per Node.js version to enable bytecode-only execution. This is the single largest maintenance burden for the project and affects every Node.js release. Tracking issue: [yao-pkg/pkg#231](https://github.com/yao-pkg/pkg/issues/231).

### The ecosystem is already doing this — and growing fast

The [`bytenode`](https://www.npmjs.com/package/bytenode) package has ~27,000 weekly npm downloads and ~3,000 GitHub stars. It works by exploiting `vm.Script`'s `cachedData` option with a dummy source, but this approach is fragile, undocumented, and breaks across Node.js versions because V8's sanity checks were not designed for this use case.

[`@yao-pkg/pkg`](https://www.npmjs.com/package/@yao-pkg/pkg) (the actively maintained fork of Vercel's now-deprecated `pkg`) reaches **~136,000 weekly downloads** as of April 2026 — an 8x increase in daily downloads over 18 months (from ~3,200/day in Oct 2024 to ~26,500/day in Mar 2026). It is already at 50% of the deprecated `pkg`'s download volume.

Combined with `bytenode`'s 27K weekly downloads, these tools demonstrate sustained ecosystem demand for bytecode execution in Node.js. Both maintain fragile, version-specific workarounds for the lack of native support. If Node.js provided this capability natively, these workarounds would be eliminated entirely.

### What we're NOT proposing

This proposal is **not about**:
- Stabilizing V8's bytecode format across versions (bytecode remains version-locked)
- Obfuscation or DRM (bytecode is decompilable with freely available tools — see [Security Analysis §2](#2-malware-could-use-this-to-evade-source-level-scanning))
- A sandboxing mechanism for untrusted bytecode (bytecode requires the same trust level as source code)
- Cross-version or cross-architecture bytecode portability

## Proposed API

### CLI flag

```
--experimental-vm-sourceless
```

Required to unlock the `sourceless` option. Without it, passing `sourceless: true` throws `ERR_VM_SOURCELESS_MISSING_FLAG`.

### Operating modes

The `sourceless` option has two operating modes, distinguished by the presence or absence of source:

| Mode | Inputs | Behavior |
|------|--------|----------|
| **Build mode** | `sourceless: true` + source code (string) | Compiles all functions eagerly (V8's `lazy` flag set to `false`) so the resulting code cache is self-contained — every function has bytecode, not just the ones called during construction. `createCachedData()` then produces a complete bytecode blob. Without `sourceless: true`, `createCachedData()` only caches functions that V8 has compiled so far (top-level code and functions already called), leaving unvisited inner functions without bytecode — making the cache incomplete for sourceless consumption. |
| **Load mode** | `sourceless: true` + `cachedData` (Buffer) + empty source (`''`) | Deserializes and executes bytecode. Source-hash and flags-hash checks are relaxed. |

In load mode, the `source`/`code` parameter must be an empty string (`''`). This preserves the existing validation contracts of `vm.Script` (which coerces to string via `` `${code}` ``) and `vm.compileFunction()` (which calls `validateString(code)`) without any changes to their validation logic.
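
The gap that build mode closes can be observed with today's `vm.Script` API, without any experimental flag. The sketch below is an illustration only: it uses existing `node:vm` calls, not the proposed `sourceless` option, and the function names inside the script are made up for the example. It takes one code cache before the script runs and one after. Cache sizes vary by V8 version, so none are stated here.

```js
const vm = require('node:vm');

const script = new vm.Script(`
function add(a, b) { return a + b; }
function neverCalled(x) { return x * 2; }
add(1, 2);
`);

// Taken before execution: inner functions have not been invoked yet,
// so V8 may have left them for lazy compilation.
const cacheBeforeRun = script.createCachedData();

// The completion value of the script is the last expression, add(1, 2).
const result = script.runInThisContext();

// Taken after execution: add() has now run; neverCalled() still has not.
const cacheAfterRun = script.createCachedData();

console.log('result:', result);
console.log('cache before run:', cacheBeforeRun.length, 'bytes');
console.log('cache after run: ', cacheAfterRun.length, 'bytes');
```

A cache taken this way still depends on the source for any function that never ran, which is why build mode compiles everything eagerly.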

### `vm.Script` changes

```js
const vm = require('node:vm');
const fs = require('node:fs');

// === Build mode (build-time script): compile and produce bytecode ===
const source = fs.readFileSync('app.js', 'utf8');
const compiled = new vm.Script(source, { sourceless: true });
const bytecode = compiled.createCachedData();
fs.writeFileSync('app.jsc', bytecode);

// === Load mode (separate process / runtime): execute from bytecode only ===
const loadedBytecode = fs.readFileSync('app.jsc');
const script = new vm.Script('', {
  sourceless: true,
  cachedData: loadedBytecode,
});
// script.cachedDataRejected === false if bytecode is valid
script.runInThisContext();
```

### `vm.compileFunction()` changes

`vm.compileFunction()` already supports `cachedData`. This proposal adds `sourceless` support following the same pattern:

```js
const vm = require('node:vm');

// === Build mode ===
const fn = vm.compileFunction('module.exports = 42;', ['module', 'exports'], {
  sourceless: true,
  produceCachedData: true,
});
const bytecode = fn.cachedData;

// === Load mode ===
const loaded = vm.compileFunction('', ['module', 'exports'], {
  sourceless: true,
  cachedData: bytecode,
});
loaded(module, module.exports);
```

Note: unlike `vm.Script` and `vm.SourceTextModule`, the function returned by `vm.compileFunction()` does not currently have a `createCachedData()` method. The build mode example above uses `produceCachedData: true` and reads `fn.cachedData` instead. A follow-up improvement could add `createCachedData()` to compiled functions for API consistency, but this is not required for the initial implementation.

This is essential for CJS module loading, where `require()` wraps module source in a function wrapper. Without `vm.compileFunction()` support, bytecode execution would only work in script mode — a small fraction of real-world use cases.

**Important:** In load mode, the `params` array must match exactly what was used at build time. A mismatch would cause V8 to deserialize bytecode with incorrect parameter bindings, leading to crashes or incorrect behavior.
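
One way a packager could guard against such a mismatch is to record the parameter list next to the bytecode and compare it before loading. The sketch below is a hypothetical helper, not part of this proposal: the artifact shape and the names `buildArtifact` and `loadArtifact` are assumptions. It runs on current Node.js, so it still passes the real source at load time. Under the proposal, that call would pass `''` together with `sourceless: true`.

```js
const vm = require('node:vm');

// Hypothetical artifact shape: bytecode plus the exact params it was built with.
function buildArtifact(source, params) {
  const fn = vm.compileFunction(source, params, { produceCachedData: true });
  return { params: [...params], bytecode: fn.cachedData };
}

// Refuses to load when the caller's parameter list differs from the recorded one.
function loadArtifact(artifact, source, params) {
  const same = artifact.params.length === params.length &&
    artifact.params.every((name, i) => name === params[i]);
  if (!same) {
    throw new TypeError(
      `params mismatch: built with [${artifact.params}], loading with [${params}]`);
  }
  // Today the original source is still required here.
  return vm.compileFunction(source, params, { cachedData: artifact.bytecode });
}

const source = 'module.exports = 42;';
const artifact = buildArtifact(source, ['module', 'exports']);

const mod = { exports: {} };
loadArtifact(artifact, source, ['module', 'exports'])(mod, mod.exports);
console.log(mod.exports);
```

Comparing the recorded list turns a potential crash into an ordinary `TypeError` at load time.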

Packagers are expected to use existing Node.js hooks to wire `vm.compileFunction()` into the module loading pipeline: `Module._compile` for CJS, or `module.register()` / `--loader` for ESM. This keeps `require()` / `import` integration in userland, where packagers already operate, while Node.js provides the primitive.

### `vm.SourceTextModule` changes

This proposal also extends `vm.SourceTextModule` (behind `--experimental-vm-modules`) with `sourceless` support for ESM bytecode:

```js
const vm = require('node:vm');

// === Build mode ===
const m = new vm.SourceTextModule('export const x = 42;', {
  sourceless: true,
  identifier: 'app.mjs',
});
const bytecode = m.createCachedData();

// === Load mode ===
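const loaded = new vm.SourceTextModule('', {
  sourceless: true,
  cachedData: bytecode,
  identifier: 'app.mjs',
});
// `linker` is the application's link callback; `await` assumes an
// enclosing async function.
await loaded.link(linker);  // import resolution still required
await loaded.evaluate();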
```

`vm.SourceTextModule` already supports `cachedData` in its constructor and exposes `createCachedData()` on instances. V8's module code caching via `ScriptCompiler::CreateCodeCache(Local<UnboundModuleScript>)` is already used internally by Node.js for compile caching, built-in module caching, and SEA code cache. This proposal adds `sourceless` support on top of these existing capabilities — no new prerequisites are needed for ESM.

Module linking (`link()`) and evaluation (`evaluate()`) remain unchanged — they operate on the import/export bindings, which are separate from compilation. The module's import/export declarations are preserved in V8's code cache as part of the `ScopeInfo → SourceTextModuleInfo` chain (containing `module_requests`, `regular_imports`, `regular_exports`, and `special_exports`). This means `link()` can resolve dependencies without source — no re-parsing is needed. The module graph is still resolved at runtime.

**Timing constraint:** `createCachedData()` must be called before `evaluate()`. V8's `Module::GetUnboundModuleScript()` requires the module to be unevaluated (status must not be `kEvaluating`, `kEvaluated`, or `kErrored`). The build mode example above shows the correct order.

### API design alternative considered

An alternative to reusing the constructors is a set of static factory methods:

```js
const script = vm.Script.fromBytecode(bytecode, options);
const fn = vm.loadBytecodeFunction(bytecode, params, options);
const m = vm.SourceTextModule.fromBytecode(bytecode, options);
```

This avoids semantic overloading of the `source` parameter and makes the intent unambiguous. (Note: `vm.loadBytecodeFunction` is a standalone function rather than a method on `vm.compileFunction`, since functions in JavaScript do not conventionally carry static methods.) The tradeoff is a larger API surface and divergence from the existing `cachedData` pattern. We present the constructor approach as the primary proposal because it extends the existing `cachedData` pattern naturally and is consistent with how `bytenode` and `pkg` users already work. The TSC may prefer the factory method approach.

### Behavioral changes under `sourceless: true`

| Aspect | Normal mode | Sourceless mode |
|--------|------------|----------------|
| Source required | Yes | Only at compile time (build mode) |
| `Function.prototype.toString()` | Returns source text | Returns `"function () { [native code] }"` (or `"class {}"` for classes) |
| Stack traces | Full source positions | Line/column numbers preserved, no source preview |
| Source maps | Stored in `ScriptOrigin` | **Still functional** — source maps are independent of source code. Ship `.jsc` + `.map` for production debugging. |
| Code cache validation | Source hash + flags hash + version hash (+ payload checksum in debug builds) | Flags hash relaxed, source hash relaxed. Version hash preserved. Payload checksum behavior unchanged from normal mode. |
| Lazy compilation | Enabled | Disabled (all functions eagerly compiled in build mode) |
| Bytecode GC flushing | Enabled | Disabled (bytecode cannot be regenerated) |
| `eval()` / `new Function()` | Normal | **Unaffected** — dynamic code compilation is independent of the calling script's source. |
| Debugging/inspection | Full | Limited — inspector's `Debugger.getScriptSource` returns empty string. Source-mapped debugging still works if `.map` files are provided. |

#### Spec compliance: `Function.prototype.toString()`

The ECMAScript specification (ES2024 §20.2.3.5, `Function.prototype.toString`) consults the `HostHasSourceTextAvailable(func)` host hook. When this hook returns `false`, `toString()` is permitted to return the `"function () { [native code] }"` format even for user-defined functions. Under this proposal, the hook returns `false` for functions from sourceless scripts.

V8 already handles this correctly: `JSFunction::ToString` (`deps/v8/src/objects/js-function.cc:1432`) returns the native code fallback when `!shared_info->HasSourceCode()`. This is **spec-compliant behavior**, not a spec deviation.

Libraries that rely on `Function.prototype.toString()` for serialization or reflection (e.g., dependency injection frameworks that parse function signatures) will see `[native code]` instead of source. This is an inherent limitation of sourceless mode and is documented as a tradeoff.
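
To make the limitation concrete, here is a minimal sketch of the kind of signature parsing such libraries do, and how it degrades when only the native-code form is available. The helper names are hypothetical, and a built-in (`Math.max`) stands in for a function loaded from sourceless bytecode, since both report the `[native code]` form.

```js
// Naive parameter-name extraction, as some dependency injection helpers do.
function paramNames(fn) {
  const text = Function.prototype.toString.call(fn);
  const match = /^[^(]*\(([^)]*)\)/.exec(text);
  if (!match) return [];
  return match[1].split(',').map((s) => s.trim()).filter(Boolean);
}

// Detects the native-code form, which carries no usable parameter list or body.
function hasNativeCodeForm(fn) {
  const text = Function.prototype.toString.call(fn);
  return /\{\s*\[native code\]\s*\}\s*$/.test(text);
}

function createUser(db, logger) { return { db, logger }; }

console.log(paramNames(createUser));        // [ 'db', 'logger' ]
console.log(hasNativeCodeForm(createUser)); // false

// Stand-in for a sourceless function: no parameter names can be recovered.
console.log(paramNames(Math.max));          // []
console.log(hasNativeCodeForm(Math.max));   // true
```

A library could use a check like `hasNativeCodeForm` to fail with a clear message instead of silently injecting nothing.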

### Tradeoffs of sourceless mode

**Eager compilation is required, not optional.** In normal mode, V8 lazily compiles functions on first call — the source is available to compile from at any time. In sourceless mode, the source is absent at runtime, so every function must be compiled to bytecode upfront at build time. This means:

- **Bytecode blobs are larger** than a code cache produced with lazy compilation, because they contain bytecode for *all* functions, including rarely-used ones.
- **Deserialization at startup loads all bytecode into memory**, though this is still faster than parsing + compiling source, because V8's bytecode deserialization is a memcpy-like operation with no parsing or AST construction.

For the target use cases (SEA, packaged apps, edge deployments), the net effect is positive: the elimination of parse time dominates the cost of deserializing extra bytecode.

**Bytecode is never garbage-collected.** V8 normally flushes bytecode for infrequently-used functions and regenerates it from source when needed. In sourceless mode, there is no source to regenerate from, so bytecode must remain in memory for the process lifetime. For long-running servers with large codebases, this increases baseline memory usage. For the target use cases (packaged applications, short-lived CLI tools, edge functions), this is an acceptable tradeoff.

### Error semantics

| Condition | Behavior |
|-----------|----------|
| `sourceless: true` without `--experimental-vm-sourceless` | Throws `ERR_VM_SOURCELESS_MISSING_FLAG` |
| `sourceless: true` with `cachedData` from wrong V8 version | Sets `cachedDataRejected = true`. V8 version hash mismatch is caught during deserialization. |
| `sourceless: true` with `cachedData` and empty source (load mode) | Normal execution path — this is the intended use case. |
| `sourceless: true` with source and `cachedData` both present | **Build mode**: the source is compiled eagerly, the provided `cachedData` is ignored, and `createCachedData()` produces fresh bytecode. The presence of non-empty source always selects build mode regardless of `cachedData`. |
| `sourceless: true` without `cachedData` and without source (`''`) | Throws `ERR_VM_SOURCELESS_NO_DATA` — there is nothing to execute. |
| `sourceless: true` with structurally corrupted bytecode that passes header checks | **Undefined behavior** — same as with existing `vm.Script({ cachedData })` today. V8 does not validate bytecode instructions (see [Security Analysis §1](#1-bytecode-could-crash-v8-or-enable-memory-corruption)). This is not a regression; it is pre-existing V8 behavior. |
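
The mode-selection rows of the table above can be read as a small decision function. The sketch below is a hypothetical model of that logic for discussion, not the actual implementation: `selectSourcelessMode` and its `flagEnabled` parameter are illustrative names, and only the two error codes come from this proposal.

```js
// Hypothetical model of the proposed option validation; not Node.js internals.
function makeError(code, message) {
  return Object.assign(new Error(message), { code });
}

function selectSourcelessMode({ source, cachedData, flagEnabled }) {
  if (!flagEnabled) {
    throw makeError('ERR_VM_SOURCELESS_MISSING_FLAG',
      'sourceless requires --experimental-vm-sourceless');
  }
  // Non-empty source always selects build mode; any cachedData is ignored.
  if (source !== '') return 'build';
  if (cachedData === undefined) {
    throw makeError('ERR_VM_SOURCELESS_NO_DATA',
      'sourceless needs either source or cachedData');
  }
  return 'load';
}

const data = Buffer.from([0]); // placeholder bytes, never deserialized here

console.log(selectSourcelessMode({ source: 'x = 1', flagEnabled: true }));
console.log(selectSourcelessMode({ source: 'x = 1', cachedData: data, flagEnabled: true }));
console.log(selectSourcelessMode({ source: '', cachedData: data, flagEnabled: true }));
```

Rejected or corrupted `cachedData` is deliberately outside this function: those outcomes are only known once V8 attempts deserialization.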

## Security Analysis

Security is the core concern with this proposal. Here is a thorough analysis of every angle we anticipate the TSC evaluating.

### 1. "Bytecode could crash V8 or enable memory corruption"

**This concern applies equally to the existing code cache feature.**

V8's code cache (`vm.Script({ cachedData })`) already contains serialized bytecode. V8 does not structurally verify the bytecode instructions in a code cache before execution — unlike the JVM's bytecode verifier or .NET's IL verification. The validation that V8 performs on code cache is limited to header checks:

- Magic number
- V8 version hash
- Source length (not content hash)
- Compiler flags hash
- Payload checksum (when `--verify-snapshot-checksum` is enabled; defaults to `true` in debug builds, `false` in release builds)

**None of these checks validate the bytecode instructions themselves.** A crafted code cache with valid headers but corrupted bytecode would pass all sanity checks and execute in today's Node.js — no patches needed.

Of the checks that sourceless mode relaxes, the source check is the one most likely to be mistaken for a security boundary (the flags hash is covered in [§3](#3-this-relaxes-v8s-sanity-checks)). This check validates `length`, not content: an attacker who controls the `cachedData` buffer can trivially match any expected source length by passing a string of the right length. **The source-length check is not a security boundary**, and removing it does not meaningfully reduce V8's security posture.
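
The behavior of the source check can be probed on current Node.js with the existing `cachedData` option. The sketch below is an experiment, not a proof: it uses only existing `node:vm` APIs and prints `cachedDataRejected` for three inputs. The third case (same length, different content) is the one the argument above depends on. Its result is printed rather than assumed, so readers can confirm it on their own Node.js version.

```js
const vm = require('node:vm');

const source = '(function greet() { return "hello"; })';
const cachedData = new vm.Script(source).createCachedData();

// Same source: the cache matches.
const same = new vm.Script(source, { cachedData });
console.log('same source, rejected:', same.cachedDataRejected);

// Source of a different length: the header check fails and V8 recompiles.
const longer = new vm.Script(source + ' ', { cachedData });
console.log('different length, rejected:', longer.cachedDataRejected);

// Different content, identical length: the case discussed above.
const sameLength = source.replace('hello', 'howdy');
const swapped = new vm.Script(sameLength, { cachedData });
console.log('same length, different content, rejected:', swapped.cachedDataRejected);
```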

### 2. "Malware could use this to evade source-level scanning"

This is a valid concern, but it's important to understand the current state:

- **Bytecode evasion already happens today.** Check Point Research [documented](https://research.checkpoint.com/2024/exploring-compiled-v8-javascript-usage-in-malware/) thousands of malicious compiled V8 applications in the wild (RATs, stealers, miners), all using the existing `bytenode` package or similar techniques. This proposal doesn't enable anything new — it provides a supported path for what the ecosystem already does.

- **The flag is opt-in and visible.** `--experimental-vm-sourceless` must be explicitly passed at the command line. Security scanners can check for this flag in launch scripts, just as they can check for `--allow-child-process` or `--allow-addons`. Package managers could warn when dependencies require this flag.

- **Bytecode is not opaque.** Tools like [View8](https://nicolo.dev/en/blog/view8-v8-bytecode-decompiler/) can decompile V8 bytecode back to readable JavaScript. Security vendors are already building detection capabilities for `.jsc` files.

- **Source ≠ safety.** Obfuscated JavaScript (webpack bundles, terser output, eval chains) is already effectively unreadable to humans and static analyzers. Source availability is a convenience for auditing, not a security guarantee.

### 3. "This relaxes V8's sanity checks"

Sourceless mode relaxes two specific checks:

| Check | Why relaxed | Impact |
|-------|------------|--------|
| Source hash | Source is absent at runtime | Low — only validates source length, not content. An attacker who controls the `cachedData` buffer can trivially satisfy this check by providing a source string of the right length. Not a security boundary. |
| Flags hash | Bytecode compiled with `lazy=false` but consumed with normal flags | Low — V8 version hash still enforced. Flag mismatch causes functional issues (potential crashes), not security exploits. |

The following checks **remain enforced** (unchanged from normal `cachedData` behavior):
- **V8 version hash** — bytecode from a different Node.js version is rejected
- **Read-only snapshot checksum** — ensures V8 heap layout consistency
- **Payload checksum** — validates bytecode integrity when `--verify-snapshot-checksum` is enabled (on by default in debug builds; off in release builds). This is the same behavior as existing `cachedData` — sourceless mode does not weaken it further

### 4. "What if V8 adds a bytecode verifier later?"

This proposal is designed to be forward-compatible with a future V8 bytecode verifier. If V8 adds structural validation of bytecode instructions, sourceless mode would automatically benefit from it since it uses the same deserialization path as code cache.

### 5. Comparison with existing trusted-bytecode features

| Feature | Ships bytecode? | Source required? | Bytecode verified? | Already in Node.js? |
|---------|----------------|-----------------|-------------------|---------------------|
| `vm.Script({ cachedData })` | Yes | Yes (length-checked) | No | Yes |
| SEA `useCodeCache` | Yes | Yes (embedded) | No | Yes |
| SEA `useSnapshot` | Yes (as heap snapshot) | No | No | Yes |
| **Proposed `sourceless`** | **Yes** | **No** | **No** | **This proposal** |

The security posture of `sourceless` mode is comparable to SEA's `useSnapshot` — both execute pre-compiled artifacts without source. The difference is that `useSnapshot` captures the entire heap state while `sourceless` operates at the script level. A notable distinction: SEA snapshots are typically built locally from trusted source, while sourceless bytecode files could be distributed as standalone `.jsc` files. However, both require the same level of trust in the artifact — running an untrusted SEA binary is no safer than running untrusted bytecode.

### 6. Interaction with the Permission Model

Node.js's `--permission` flag provides granular access control (`--allow-fs-read`, `--allow-addons`, etc.). This proposal does **not** introduce a new permission.

Rationale: loading bytecode via `vm.Script` requires reading a file (gated by `--allow-fs-read`) and executing it in V8 (which `vm.Script` already does with source code). No new capability is introduced that isn't already gated. The `--experimental-vm-sourceless` flag itself acts as an additional gate — the feature cannot be activated without it, regardless of permission model settings.

If the TSC prefers a dedicated permission (e.g., `--allow-vm-sourceless`), the implementation can accommodate this. We propose starting without one and adding it if real-world usage demonstrates a need for finer-grained control.

### 7. Mitigations

To minimize risk, this proposal includes:

- **Explicit opt-in flag** (`--experimental-vm-sourceless`) — cannot be triggered accidentally
- **Same-version enforcement** — V8 version hash check is preserved; bytecode from a different Node.js version is rejected
- **Process-level warning** — similar to `--experimental-vm-modules`, emits a warning on first use
- **No `require()` / `import` integration** — bytecode can only be loaded via `vm.Script`, `vm.compileFunction()`, and `vm.SourceTextModule` APIs. Packagers wire these into module loading via existing hooks (e.g., `module.register()`). This keeps the feature scoped and prevents accidental bytecode loading from `node_modules`
- **Documentation** — clear documentation that bytecode is version-locked, architecture-specific, and not a security boundary

## Implementation Plan

The implementation touches V8 internals and Node.js's `vm` module. Here's the scope:

### V8 changes (~60 lines across 7 files)

These changes are applied directly to the V8 source in `deps/v8/`, similar to existing cherry-picks and backports. Unlike those, these are *feature additions* rather than bug fixes, which represents a higher maintenance commitment. To minimize rebase friction across V8 upgrades, all changes are guarded behind per-isolate runtime checks and scoped to narrow, well-defined code paths. See [Maintenance burden](#maintenance-burden-of-v8-changes) below.

1. **`v8-isolate.h` / `api.cc`** — Add per-isolate sourceless mode control:
   - `Isolate::SetSourcelessMode(bool enabled)` — sets per-isolate `lazy=false` for eager compilation. Only `lazy` is toggled; the `predictable` flag is **not** set, avoiding its unrelated side effects (single-threaded GC, disabled concurrent recompilation, disabled memory reducer). The call sequence `SetSourcelessMode(true)` → compile → `SetSourcelessMode(false)` is atomic from JavaScript's perspective because `vm.Script` construction is synchronous and each isolate runs on a single JS thread.
   - `Isolate::FixSourcelessScript(Local<UnboundScript> script)` — replaces source with `undefined` after bytecode deserialization

2. **`objects.cc`** — `Script::SetSource()` treats empty string as signal to store `undefined`

3. **`parsing.cc`** — Two guards in `ParseProgram()` and `ParseFunction()` to return early when source is `undefined` (prevents crash from parsing non-existent source)

4. **`marking-visitor-inl.h`** — Prevent GC from flushing bytecode for sourceless scripts (bytecode cannot be regenerated from source)

5. **`compiler.cc`** — Bypass compilation cache for empty source to force the code-cache consumption path

6. **`code-serializer.cc`** — Skip source-hash and flags-hash checks when deserializing sourceless bytecode (version hash still enforced; payload checksum behavior unchanged)

7. **`js-function.cc`** — `Function.prototype.toString()` fallback for classes in sourceless scripts (returns `"class {}"`)

### Maintenance burden of V8 changes

Previous attempts to upstream these changes to V8 ([#26026](https://github.com/nodejs/node/issues/26026)) stalled — the V8 team has expressed reluctance to support external bytecode consumption. We acknowledge this means Node.js would carry these V8 modifications for the foreseeable future, with rebase work on each V8 upgrade.

To minimize this cost:
- All changes are guarded behind per-isolate runtime checks, so they don't affect V8's default code paths
- The total diff is ~60 lines across 7 files — small enough to rebase manually in minutes
- The `pkg` project has maintained equivalent changes across 20+ Node.js versions over 5 years, demonstrating that the rebase burden is manageable in practice
- If V8 upstream ever adds native sourceless support, the Node.js-specific changes can be replaced with upstream APIs

**Security-critical V8 upgrades:** If a V8 security update conflicts with the sourceless changes, the security update takes priority. The sourceless feature can be temporarily disabled (the experimental flag makes this low-impact) while the changes are rebased. This is the same approach used for any Node.js-specific V8 modification.

### Node.js changes (~60 lines across 4 files)

1. **`lib/vm.js`** — Accept `sourceless` option in `vm.Script` constructor and `vm.compileFunction()`, validate flag presence, pass to C++ layer

2. **`lib/internal/vm/module.js`** — Add `sourceless` option to `vm.SourceTextModule`. The `cachedData` and `createCachedData()` support already exists — only the sourceless mode logic needs to be added.

3. **`src/node_contextify.cc`** — Orchestrate the per-isolate enable/disable/fix sequence around compilation and deserialization

4. **`lib/internal/errors.js`** — Define `ERR_VM_SOURCELESS_MISSING_FLAG` and `ERR_VM_SOURCELESS_NO_DATA` error codes

### Tests

- Unit tests for `vm.Script` with `sourceless: true` (compile, serialize, deserialize, execute)
- Unit tests for `vm.compileFunction()` with `sourceless: true`
- Unit tests for `vm.SourceTextModule` with `sourceless: true` (including `link()` + `evaluate()` flow)
- Test that `vm.SourceTextModule` with `sourceless: true` produces and consumes valid bytecode via `createCachedData()`
- Test that bytecode from a different dummy source works (source hash relaxed)
- Test that `Function.prototype.toString()` returns expected fallback
- Test that bytecode GC flushing is prevented
- Test that `cachedDataRejected` is set correctly for valid and invalid bytecode
- Test that the feature requires `--experimental-vm-sourceless`
- Test that cross-version bytecode is still rejected (version hash enforced)
- Test error when `sourceless: true` is passed with empty source and no `cachedData`
- Test that `sourceless: true` with both source and `cachedData` compiles from source (build mode)
- Test interaction with worker threads (per-isolate state isolation)
- Test that `eval()` and `new Function()` work inside sourceless scripts
- Test source maps with sourceless scripts (`--enable-source-maps`)
- Test inspector protocol returns empty source for sourceless scripts

### Documentation

- `doc/api/vm.md` — document `sourceless` option on `vm.Script`, `vm.compileFunction()`, and `vm.SourceTextModule`, including tradeoffs and mode definitions
- `doc/api/cli.md` — document `--experimental-vm-sourceless` flag

## Prior Art and Discussion

- **Node.js Issue [#11842](https://github.com/nodejs/node/issues/11842)** (2017) — Yang Guo (V8 team) discussed shipping bytecode via `vm.Script`. Identified version-locking and `toString()` issues but no proposal materialized.
- **Node.js Issue [#26026](https://github.com/nodejs/node/issues/26026)** (2019) — Feature request for sourceless `vm.Script`. Ben Noordhuis directed changes upstream to V8. The attempt stalled due to V8's contribution process friction. Closed as stale in 2022.
- **SEA `useCodeCache`** ([PR #48191](https://github.com/nodejs/node/pull/48191), 2023) — Added code cache to SEA, establishing the precedent for shipping pre-compiled bytecode in Node.js.
- **[bytenode](https://github.com/bytenode/bytenode)** — ~27k weekly downloads, ~3k stars. Fragile userland implementation of the same concept, demonstrating persistent ecosystem demand.
- **[@yao-pkg/pkg](https://github.com/yao-pkg/pkg)** — ~136K weekly downloads. Maintains custom V8 modifications per Node.js version to enable bytecode execution. This proposal would eliminate that maintenance burden entirely. Upstream tracking: [yao-pkg/pkg#231](https://github.com/yao-pkg/pkg/issues/231).

## FAQ

### Q: Will bytecode work across Node.js versions?

No. V8's bytecode format is internal and changes between versions. The V8 version hash in the bytecode header is checked and mismatches are rejected. Users must recompile bytecode when upgrading Node.js, just as they must rebuild SEA binaries.

### Q: Will bytecode work across architectures (x64 vs ARM64)?

No. Bytecode contains architecture-specific details (pointer sizes, alignment). Cross-architecture execution is not supported and will be rejected by V8's sanity checks.
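
Because bytecode is locked to both version and architecture, a packager may want to fail fast with a readable error before handing a blob to V8. The sketch below shows one hypothetical approach: store `process.versions.v8` and `process.arch` next to the bytecode at build time and compare them at load time. The metadata shape and helper names are assumptions for illustration. V8's own checks, surfaced through `cachedDataRejected`, remain the authoritative signal.

```js
const vm = require('node:vm');

// Hypothetical sidecar metadata; the field names are illustrative only.
function describeRuntime() {
  return { v8: process.versions.v8, arch: process.arch };
}

// True only when the artifact was built by the same V8 version and architecture.
function isCompatible(meta, runtime = describeRuntime()) {
  return meta.v8 === runtime.v8 && meta.arch === runtime.arch;
}

// Build step: capture the code cache together with the runtime description.
function build(source) {
  const bytecode = new vm.Script(source).createCachedData();
  return { meta: describeRuntime(), bytecode };
}

const artifact = build('1 + 1');

console.log(isCompatible(artifact.meta));                      // true
console.log(isCompatible({ ...artifact.meta, v8: '0.0.0' }));  // false
```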

### Q: Does this protect source code from reverse engineering?

No. V8 bytecode can be decompiled to readable JavaScript using freely available tools like [View8](https://nicolo.dev/en/blog/view8-v8-bytecode-decompiler/). This feature is about **deployment optimization** (smaller bundles, no parse overhead, single-artifact distribution), not code protection. The relationship to source is analogous to Java `.class` files — a compiled deployment format, not a security boundary.

### Q: Why not just use SEA with `useSnapshot`?

SEA snapshots capture the entire isolate heap state, which is a heavier mechanism. Sourceless `vm.Script` operates at the individual script level and integrates with the existing `vm` API, making it suitable for tools that need fine-grained control over script loading (packagers, bundlers, module loaders).

### Q: Could this be used for malicious purposes?

The capability already exists in the ecosystem via `bytenode` and similar tools. Check Point Research has [documented](https://research.checkpoint.com/2024/exploring-compiled-v8-javascript-usage-in-malware/) thousands of malicious compiled V8 applications in the wild. This proposal moves bytecode execution into core where it can be properly maintained, gated behind a flag, documented, and subject to Node.js's security processes. Pushing this functionality to userland hasn't prevented misuse — it has only made the legitimate use case harder and more fragile.

### Q: Why not contribute this to V8 directly?

Previous attempts to upstream sourceless support to V8 (referenced in [#26026](https://github.com/nodejs/node/issues/26026)) stalled. The V8 team has expressed reluctance to stabilize the bytecode format or add features for external bytecode consumption. The changes required are minimal (~60 lines) and scoped behind per-isolate conditionals that don't affect V8's default behavior. Carrying these as Node.js-specific V8 changes is a conscious maintenance tradeoff — see [Maintenance burden of V8 changes](#maintenance-burden-of-v8-changes) for the full analysis.

### Q: What about worker threads?

Sourceless mode state is per-isolate, not per-process. Each worker thread has its own V8 isolate and its own sourceless mode flag. Compiling or loading sourceless bytecode in one worker does not affect any other worker or the main thread.

### Q: Does `eval()` / `new Function()` work inside sourceless scripts?

Yes. Dynamic code compilation (`eval()`, `new Function()`) parses and compiles the provided string argument independently of the calling script's source. A sourceless script can call `eval('1 + 1')` and it works normally — the eval'd code is compiled from its own source string.

### Q: Can I debug sourceless applications?

Partially. The Chrome DevTools inspector reports empty source for sourceless scripts (`Debugger.getScriptSource` returns `''`). However, **source maps are fully supported** — they are stored in `ScriptOrigin` independently of source code. Shipping `.jsc` files alongside `.map` files enables source-mapped stack traces and debugger navigation.

## Graduation Path

This feature starts as experimental (`--experimental-vm-sourceless`). The conditions for graduation to stable are:

1. **At least 2 major Node.js versions** with the experimental flag, with no breaking changes to the API surface
2. **Real-world adoption** by at least one major packager (`pkg`, `bytenode`, or equivalent) confirming the API meets production needs
3. **V8 change stability** demonstrated across at least 2 V8 major version upgrades without requiring significant rework
4. **No unresolved security concerns** raised by the Node.js security team during the experimental period
5. **TSC consensus** that the maintenance burden is acceptable given adoption levels

If V8 upstream adds native sourceless support during the experimental period, the Node.js-specific changes can be replaced with upstream APIs, simplifying long-term maintenance.

## Open Questions

1. **Flag naming:** `--experimental-vm-sourceless` vs `--experimental-bytecode` vs another name?
2. **Should SEA's `useCodeCache` gain a `sourceless` variant?** This would allow SEA to embed only bytecode, roughly halving the embedded payload size. This could be a natural follow-up once the base `vm` support lands.

---

/cc @joyeecheung (SEA, compile cache, vm module), @mcollina (TSC), @jasnell (TSC, vm module), @targos (V8 upgrades), @RaisinTen (SEA code cache)

| Mode | Inputs | Behavior |
| --- | --- | --- |
| Build mode | `sourceless: true` + source code (string) | Compiles all functions eagerly (V8's `lazy` flag set to `false`) so the resulting code cache is self-contained: every function has bytecode, not just the ones called during construction. `createCachedData()` then produces a complete bytecode blob. Without `sourceless: true`, `createCachedData()` only caches functions that V8 has compiled so far (top-level code and functions already called), leaving unvisited inner functions without bytecode and making the cache incomplete for sourceless consumption. |
| Load mode | `sourceless: true` + `cachedData` (Buffer) + empty source (`''`) | Deserializes and executes bytecode. Source-hash and flags-hash checks are relaxed. |
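
The two-step build/load flow can be approximated with the API that exists today. The sketch below is an illustration under stated assumptions, not the proposed behavior: it never passes `sourceless`, the `.jsc` file name and the `used`/`unused` functions are made up for the example, and the load step still supplies the source. It also creates the cache once before and once after running the script, since the set of functions V8 has compiled so far affects what `createCachedData()` can include.

```javascript
const vm = require('node:vm');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical application source; `unused` is never called during the run.
const source = `
  function used(a, b) { return a + b; }
  function unused(a, b) { return a * b; }
  used(1, 2);
`;

// "Build" step with today's API: compile from source, then serialize.
const script = new vm.Script(source, { filename: 'app.js' });
const cacheBeforeRun = script.createCachedData();

// Running the script lets V8 lazily compile `used`, so a cache created
// afterwards can include it. `unused` is still never called.
const firstResult = script.runInNewContext();
const cacheAfterRun = script.createCachedData();
console.log('cache sizes (bytes):', cacheBeforeRun.length, cacheAfterRun.length);

// Persist the blob the way a packager might (the .jsc name is illustrative).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'sourceless-sketch-'));
const blobPath = path.join(dir, 'app.jsc');
fs.writeFileSync(blobPath, cacheAfterRun);

// "Load" step with today's API: the matching source must still be supplied.
// Under the proposal, load mode would pass '' plus `sourceless: true` instead.
const loaded = new vm.Script(source, {
  filename: 'app.js',
  cachedData: fs.readFileSync(blobPath),
});
const loadedResult = loaded.runInNewContext();
console.log('rejected:', loaded.cachedDataRejected, 'result:', loadedResult);

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```

The exact cache sizes depend on the V8 version, so the sketch prints them rather than assuming particular values.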

| Condition | Behavior |
| --- | --- |
| `sourceless: true` without `--experimental-vm-sourceless` | Throws `ERR_VM_SOURCELESS_MISSING_FLAG` |
| `sourceless: true` with `cachedData` from wrong V8 version | Sets `cachedDataRejected = true`. V8 version hash mismatch is caught during deserialization. |
| `sourceless: true` with `cachedData` and empty source (load mode) | Normal execution path; this is the intended use case. |
| `sourceless: true` with source and `cachedData` both present | Build mode: the source is compiled eagerly, the provided `cachedData` is ignored, and `createCachedData()` produces fresh bytecode. The presence of non-empty source always selects build mode regardless of `cachedData`. |
| `sourceless: true` without `cachedData` and without source (`''`) | Throws `ERR_VM_SOURCELESS_NO_DATA`, since there is nothing to execute. |
| `sourceless: true` with structurally corrupted bytecode that passes header checks | Undefined behavior, same as with existing `vm.Script({ cachedData })` today. V8 does not validate bytecode instructions (see Security Analysis §1). This is not a regression; it is pre-existing V8 behavior. |

| Aspect | Normal mode | Sourceless mode |
| --- | --- | --- |
| Source required | Yes | Only at compile time (build mode) |
| `Function.prototype.toString()` | Returns source text | Returns `"function () { [native code] }"` (or `"class {}"` for classes) |
| Stack traces | Full source positions | Line/column numbers preserved, no source preview |
| Source maps | Stored in `ScriptOrigin` | Still functional: source maps are independent of source code. Ship `.jsc` + `.map` for production debugging. |
| Code cache validation | Source hash + flags hash + version hash (+ payload checksum in debug builds) | Flags hash relaxed, source hash relaxed. Version hash preserved. Payload checksum behavior unchanged from normal mode. |
| Lazy compilation | Enabled | Disabled (all functions eagerly compiled in build mode) |
| Bytecode GC flushing | Enabled | Disabled (bytecode cannot be regenerated) |
| `eval()` / `new Function()` | Normal | Unaffected: dynamic code compilation is independent of the calling script's source. |
| Debugging/inspection | Full | Limited: inspector's `Debugger.getScriptSource` returns empty string. Source-mapped debugging still works if `.map` files are provided. |

| Check | Why relaxed | Impact |
| --- | --- | --- |
| Source hash | Source is absent at runtime | Low: only validates source length, not content. An attacker who controls the `cachedData` buffer can trivially satisfy this check by providing a source string of the right length. Not a security boundary. |
| Flags hash | Bytecode compiled with `lazy=false` but consumed with normal flags | Low: V8 version hash still enforced. Flag mismatch causes functional issues (potential crashes), not security exploits. |

| Feature | Ships bytecode? | Source required? | Bytecode verified? | Already in Node.js? |
| --- | --- | --- | --- | --- |
| `vm.Script({ cachedData })` | Yes | Yes (length-checked) | No | Yes |
| SEA `useCodeCache` | Yes | Yes (embedded) | No | Yes |
| SEA `useSnapshot` | Yes (as heap snapshot) | No | No | Yes |
| Proposed `sourceless` | Yes | No | No | This proposal |

proposal: bytecode-only script execution via vm module (sourceless mode) #62670
