
[Bug]: Percentage-based minContextLimit / maxContextLimit thresholds calculated against effective input context instead of full model context window #522

@unomaas

Bug Description

When using percentage-based values for minContextLimit or maxContextLimit, DCP calculates the threshold against the effective input context (model context window minus max output tokens) rather than the full model context window. This causes reminders to fire significantly earlier than users would reasonably expect.

For example, with a model that has a 200,000 token context window and a 131,072 token max output, DCP receives an effective context of 68,928 tokens. Setting minContextLimit: "50%" triggers at 34,464 tokens — only ~17% of the actual model context window.
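The arithmetic can be checked directly with the figures from this report (variable names below are illustrative only):

```javascript
// Figures from the report: 200K-context model with a 131,072-token max output.
const modelContext = 200_000;                       // full model context window
const maxOutput = 131_072;                          // model max output tokens
const effectiveContext = modelContext - maxOutput;  // what DCP actually receives

// A "50%" threshold resolved against each base:
const expected = Math.round(0.5 * modelContext);    // 100000
const actual = Math.round(0.5 * effectiveContext);  // 34464

console.log(effectiveContext);                                // 68928
console.log(`${(actual / modelContext * 100).toFixed(1)}%`);  // 17.2% of the real window
```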

Environment

  • DCP version: @tarquinen/opencode-dcp@3.1.11
  • OpenCode: 1.14.41
  • Model: zai-coding-plan/glm-5-turbo (context: 200,000, output: 131,072)
  • OS: macOS

Steps to Reproduce

  1. Configure a model with a large output limit relative to its context window (e.g., 200K context / 131K output).
  2. In dcp.jsonc, set minContextLimit to "50%" (or any percentage).
  3. Start a session and send a few messages.
  4. Observe that compression reminders begin firing when context usage is well below the expected percentage threshold.
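For step 2, a minimal dcp.jsonc might look like the following (nesting inferred from the `config.compress.minContextLimit` path in DCP's code; other settings omitted):

```jsonc
{
  "compress": {
    // Percentage thresholds are resolved by DCP at runtime.
    "minContextLimit": "50%"
  }
}
```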

Expected Behavior

minContextLimit: "50%" should resolve to 100,000 tokens (50% of the 200,000 model context window), and reminders should only begin firing after 100,000 tokens of context are used.

Actual Behavior

minContextLimit: "50%" resolves to 34,464 tokens (50% of the 68,928 effective input context), and reminders begin firing at approximately 17% of the real model context window.

Root Cause

The issue appears to stem from OpenCode passing the effective input context to DCP's system prompt hook, and DCP using that value as the base for percentage calculations.

1. OpenCode passes effective input context, not full model context.

In the OpenCode binary (.opencode), the function Cy calculates effective context:

function Cy(o) {
    let e = o.model.limit.context;
    if (e === 0) return 0;
    let s = o.cfg.compaction?.reserved ?? Math.min(Xp, je.maxOutputTokens(o.model));
    return o.model.limit.input
        ? Math.max(0, o.model.limit.input - s)
        : Math.max(0, e - je.maxOutputTokens(o.model));
}

For glm-5-turbo: 200,000 - 131,072 = 68,928. This effective value is what gets passed to plugins via the input.model.limit.context hook parameter.

2. DCP caches this effective value as modelContextLimit.

In dist/lib/hooks.js (lines 18-20):

if (input.model?.limit?.context) {
    state.modelContextLimit = input.model.limit.context;
    logger.debug("Cached model context limit", { limit: state.modelContextLimit });
}

Confirmed in debug logs:

2026-05-09T16:28:45.287Z DEBUG index: Cached model context limit | limit=68928

3. DCP uses the cached value as the base for percentage calculations.

In dist/lib/messages/inject/utils.js (lines 49-73), the resolveContextTokenLimit function:

function resolveContextTokenLimit(config, state, providerId, modelId, threshold) {
    const parseLimitValue = (limit) => {
        if (typeof limit === "number") {
            return limit;
        }
        if (!limit.endsWith("%") || state.modelContextLimit === undefined) {
            return undefined;
        }
        const parsedPercent = parseFloat(limit.slice(0, -1));
        const roundedPercent = Math.round(parsedPercent);
        const clampedPercent = Math.max(0, Math.min(100, roundedPercent));
        return Math.round((clampedPercent / 100) * state.modelContextLimit);
    };
    // ...
    const globalLimit = threshold === "max"
        ? config.compress.maxContextLimit
        : config.compress.minContextLimit;
    return parseLimitValue(globalLimit);
}

Line 66: return Math.round((clampedPercent / 100) * state.modelContextLimit); — uses state.modelContextLimit (68,928) as the base, not the full model context (200,000).
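Isolating the percentage branch of parseLimitValue (logic copied from the snippet above into a standalone function, purely for illustration) reproduces the numbers:

```javascript
// Same math as parseLimitValue's percentage branch, extracted for illustration.
function resolvePercent(limit, modelContextLimit) {
    if (!limit.endsWith("%") || modelContextLimit === undefined) return undefined;
    const clamped = Math.max(0, Math.min(100, Math.round(parseFloat(limit.slice(0, -1)))));
    return Math.round((clamped / 100) * modelContextLimit);
}

console.log(resolvePercent("50%", 68928));   // 34464  — threshold DCP actually uses
console.log(resolvePercent("50%", 200000));  // 100000 — threshold users expect
```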

Workaround

Use absolute token values instead of percentages. For example, to set a 50% threshold on a 200K context model:

"modelMinLimits": {
    "zai-coding-plan/glm-5-turbo": 100000
}

Since absolute values bypass parseLimitValue's percentage calculation (lines 52-54), they are compared directly against DCP's currentTokens count and work as expected regardless of the effective vs. full context discrepancy.

Suggested Fix

DCP could resolve the full model context window independently (e.g., from the model's reported limit.context rather than the hook's input.model.limit.context). Alternatively, OpenCode could pass the full model context window to plugins alongside the effective input context, so DCP can choose which base to use for percentage calculations.
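A minimal sketch of the second option, assuming OpenCode added a separate full-window field to the hook input (the fullContext field name is invented here for illustration; neither project exposes it today):

```javascript
// Hypothetical replacement for the caching step in dist/lib/hooks.js.
// "fullContext" is an invented field name for a full-window value the host
// could pass alongside the effective one; it does not exist in OpenCode today.
function cacheContextLimits(state, input, logger) {
    const limit = input.model?.limit;
    if (!limit?.context) return;
    // Prefer the full model window for percentage math when the host
    // provides it; otherwise fall back to today's effective value.
    state.modelContextLimit = limit.fullContext ?? limit.context;
    logger.debug("Cached model context limit", { limit: state.modelContextLimit });
}
```

With such a field, "50%" would resolve against 200,000 instead of 68,928 without any change to resolveContextTokenLimit itself.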

Thank you for the excellent plugin — happy to provide any additional debugging info if needed.
