Model description
The Qwen3 models easily outperform nearly every other open-source embedding model; however, they do not work in infinity because the bundled transformers version is outdated.
My docker compose file:
version: '3.8'
services:
  infinity:
    image: michaelf34/infinity:latest
    environment:
      DO_NOT_TRACK: 1 # Disable telemetry
      INFINITY_BETTERTRANSFORMER: True
      HF_HOME: /app/data # Use /app/data
      INFINITY_MODEL_ID: Qwen/Qwen3-Embedding-4B;Qwen/Qwen3-Reranker-4B # Model(s), semicolon separated
      INFINITY_PORT: 7997 # Port
      INFINITY_API_KEY: foo # Optional API key
      INFINITY_DEVICE: cuda
      INFINITY_VECTOR_DISK_CACHE: True
    volumes:
      - ./infinity:/app/data:rw # Persist /app/data to ./infinity in the current directory
      - ./models:/data/models:ro # Mount local models to /data/models in read-only mode
      - ./infinity:/data/hf_cache:rw # Mount cache data to /data/hf_cache in read-write mode
    ports:
      - "7997:7997" # Flexible port mapping
    command: v2
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
volumes:
  infinity: # Named volume declaration
This results in the error:
infinity-1 | ValueError: The checkpoint you are trying to load has model type `qwen3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
To fix this, updating transformers to >=4.51.0 should be sufficient, as per the Qwen documentation.
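Until the image ships a newer transformers, one possible workaround (an untested sketch; it assumes pip is available in the base image and that the newer transformers is compatible with the pinned infinity version) is to build a derived image:

```dockerfile
# Hypothetical workaround image: extend the published infinity image
# and upgrade transformers to a version that recognizes the `qwen3` model type.
FROM michaelf34/infinity:latest
RUN pip install --no-cache-dir --upgrade "transformers>=4.51.0"
```

This derived image could then be referenced in the compose file in place of `michaelf34/infinity:latest`.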
Open source status & huggingface transformers.
pip install infinity_emb[all] --upgrade
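After upgrading, a quick way to confirm the environment meets the requirement is to compare the installed transformers version against 4.51.0, the release the Qwen documentation cites as adding the `qwen3` model type. A minimal sketch (the helper name is illustrative, not part of any library):

```python
def supports_qwen3(transformers_version: str) -> bool:
    """Illustrative check: the `qwen3` model type was added in transformers 4.51.0."""
    required = (4, 51, 0)
    # Compare only the numeric major.minor.patch components.
    parts = tuple(int(p) for p in transformers_version.split(".")[:3])
    return parts >= required

print(supports_qwen3("4.50.3"))  # → False (triggers the ValueError above)
print(supports_qwen3("4.51.0"))  # → True
```

In practice the installed version can be read via `transformers.__version__` (or `importlib.metadata.version("transformers")`) and passed to a check like this.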