feat: vLLM (in-process) backend never populates mot.usage #696

## Summary

The in-process vLLM backend (`mellea/backends/vllm.py`) never sets `mot.usage`, so callers always receive `None` for token counts regardless of whether the generation succeeded.

## Affected code

`VLLMBackend.post_processing` records tool calls, the generate log, and telemetry metadata, but contains no usage-population step.

The `processing` method accumulates only the decoded text from `vllm.RequestOutput.outputs[0].text`; the token ID arrays are discarded.

## How other backends handle this

Every other backend that can compute token counts does so unconditionally in its post-processing step:

| Backend | Source of counts |
|---|---|
| HuggingFace | `GenerateDecoderOnlyOutput.sequences` shape |
| OpenAI / LiteLLM | `usage` field in API response |
| Ollama | `prompt_eval_count` / `eval_count` in response |
| WatsonX | `usage` field in API response |

`vllm.RequestOutput` exposes both `prompt_token_ids` and `outputs[0].token_ids`, so counts can be derived without any extra API call.

## Expected behaviour

`mot.usage` should be set to `{"prompt_tokens": N, "completion_tokens": M, "total_tokens": N+M}` after every successful vLLM generation, consistent with other backends.
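As a rough illustration of the expected shape, the sketch below derives the three counts from a `RequestOutput`-like object. The helper name `usage_from_request_output` and the `Fake*` dataclasses are hypothetical stand-ins, not part of mellea or vLLM. They mirror only the `prompt_token_ids` and `outputs[0].token_ids` attributes mentioned above, so the snippet runs without vLLM installed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FakeCompletion:
    """Stand-in for one entry of RequestOutput.outputs (only fields used here)."""
    text: str
    token_ids: List[int] = field(default_factory=list)


@dataclass
class FakeRequestOutput:
    """Stand-in for vllm.RequestOutput (only fields used here)."""
    prompt_token_ids: Optional[List[int]]
    outputs: List[FakeCompletion]


def usage_from_request_output(request_output) -> Dict[str, int]:
    """Hypothetical helper: build the usage dict from token ID arrays."""
    # Guard against an unset prompt_token_ids by treating it as empty.
    prompt_tokens = len(request_output.prompt_token_ids or [])
    # Only the first completion is counted, matching the text accumulation.
    completion_tokens = len(request_output.outputs[0].token_ids)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


example = FakeRequestOutput(
    prompt_token_ids=[1, 2, 3, 4],
    outputs=[FakeCompletion(text="hi", token_ids=[10, 11])],
)
usage = usage_from_request_output(example)
print(usage)
```

In the real backend, the result would presumably be assigned to `mot.usage` inside `post_processing`, which requires `processing` to retain the final `RequestOutput` (or at least its token ID arrays) instead of only the decoded text.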

## Notes

- `generate_from_raw` (batch path, line ~462) also does not set usage; the same fix is needed there.
- Discovered while auditing usage consistency across backends following the fix for #694 (HuggingFace usage regression).
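For the batch path, one possible approach is to compute a usage dict per returned request output. The sketch below assumes the batch call yields one `RequestOutput`-like object per prompt. `usages_for_batch` is a hypothetical name, and `SimpleNamespace` objects stand in for vLLM's types so the snippet is runnable on its own.

```python
from types import SimpleNamespace


def usages_for_batch(request_outputs):
    """Hypothetical helper: one usage dict per RequestOutput-like object."""
    usages = []
    for request_output in request_outputs:
        # Treat an unset prompt_token_ids as an empty prompt.
        n_prompt = len(request_output.prompt_token_ids or [])
        n_completion = len(request_output.outputs[0].token_ids)
        usages.append(
            {
                "prompt_tokens": n_prompt,
                "completion_tokens": n_completion,
                "total_tokens": n_prompt + n_completion,
            }
        )
    return usages


# Stand-ins for two batch results with different prompt and output lengths.
batch = [
    SimpleNamespace(
        prompt_token_ids=[1, 2, 3],
        outputs=[SimpleNamespace(token_ids=[7, 8])],
    ),
    SimpleNamespace(
        prompt_token_ids=[1],
        outputs=[SimpleNamespace(token_ids=[7, 8, 9, 10])],
    ),
]
batch_usages = usages_for_batch(batch)
print(batch_usages)
```

Each dict would then be attached to the corresponding result object in `generate_from_raw`, in the same order as the inputs.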
