feat(gemini): add GeminiDenseEmbedder text embedding provider #751
chethanuk wants to merge 1 commit into volcengine:main from
Conversation
@qin-ctx Please review and merge this
|
qin-ptr left a comment
Review Summary
This PR adds Google Gemini as a text embedding provider with excellent test coverage (107 tests) and solid error handling. However, there are three blocking issues that need to be addressed:

1. Design mismatch: Issue #566 explicitly requests multimodal retrieval support, but this PR only implements text-only embedding (`supports_multimodal = False`). The PR description should clarify this is a phased implementation with a roadmap for multimodal support, or change the issue reference to indicate partial completion.
2. Documentation inconsistency: a code comment claims "multimodal" but the implementation is text-only.
3. CI failure: the lint job failed due to formatting issues in `test_gemini_embedder.py`.

Once these are resolved, the PR will be ready to merge. The implementation quality is strong, with comprehensive testing, graceful error handling, and proper SDK retry integration.
🤖 I am a bot owned by @qin-ctx.
```
- OpenAI: Dense only
- Volcengine: Dense, Sparse, Hybrid
- Jina AI: Dense only
- Google Gemini: Dense only (multimodal)
```
[Bug] (blocking)
Comment says "Google Gemini: Dense only (multimodal)" but the actual implementation has GeminiDenseEmbedder.supports_multimodal = False. This is contradictory.
Should either:
- Change to "Google Gemini: Dense only (text-only; multimodal planned)"
- Or remove the "(multimodal)" annotation entirely until multimodal support is actually implemented
```
# text-embedding-004: 768 fixed-dim legacy model, does not support MRL truncation
# Future gemini-embedding-*: default 3072 via _default_dimension() fallback
# Future text-embedding-*: default 768 via _default_dimension() prefix rule
supports_multimodal: bool = False  # text-only; multimodal planned separately
```
[Design] (blocking)
This PR claims to close issue #566, which explicitly requests "first-class Gemini Embedding 2 support for multimodal retrieval". The issue emphasizes:
- "it should not reduce everything to plain text before embedding"
- "multimodal inputs handled as multimodal, not flattened by default"
However, this implementation sets supports_multimodal = False and only handles text. This is a fundamental mismatch between the issue requirement and the PR implementation.
Recommendation: Either:
- Update the PR description to clarify this is phase 1 (text-only) with a roadmap for phase 2 (multimodal), and change the issue reference to "Partial implementation of [Feature]: Add first-class Gemini Embedding 2 support for multimodal retrieval #566", keeping the issue open
- Or implement multimodal support in this PR as originally requested by the issue
The current "Closes: #566" statement is misleading and will cause users to expect multimodal functionality that doesn't exist.
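To make the reviewer's concern concrete, here is a hypothetical sketch (class and function names are illustrative, not the PR's actual API) of how a caller could gate on the `supports_multimodal` flag, so a text-only provider rejects non-text input instead of silently flattening it:

```python
# Hypothetical sketch: gate on the capability flag discussed above so
# text-only providers fail fast on multimodal input. Names are illustrative.
class DenseEmbedderBase:
    supports_multimodal: bool = False

class GeminiDenseEmbedder(DenseEmbedderBase):
    supports_multimodal = False  # text-only in this PR; multimodal is planned

def check_parts(embedder: DenseEmbedderBase, parts: list) -> None:
    # Reject non-string parts for providers that cannot embed them natively.
    if any(not isinstance(p, str) for p in parts) and not embedder.supports_multimodal:
        raise ValueError("embedder is text-only; refusing to flatten multimodal input")

check_parts(GeminiDenseEmbedder(), ["plain text"])  # accepted
```

With a check like this, a user who feeds image bytes to the text-only Gemini embedder gets an explicit error rather than the silent flattening the issue warns against.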
```
@@ -0,0 +1,479 @@
# Copyright (c) 2026 Beijing Volcano Engine Technology Co., Ltd.
```
[Bug] (blocking)
CI lint job failed because this file needs formatting:

```
Would reformat: tests/unit/test_gemini_embedder.py
```

Please run `black tests/unit/test_gemini_embedder.py` and commit the formatted version.
```
if backend is not None and provider is None:
    data["provider"] = backend
for key in ("query_param", "document_param"):
for key in ("input_type",):
```
[Suggestion] (non-blocking)
The `for key in ("input_type",):` loop normalizes `input_type` to lowercase, but the new Gemini provider uses `task_type` instead of `input_type`. Consider also normalizing `task_type` here, or document that `task_type` must be uppercase (which is currently enforced in the Gemini validator at line 176).
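A minimal sketch of this suggestion, assuming a `data` dict shaped like the diff above (the function body is illustrative, not the project's actual implementation): the existing hook lowercases `input_type`, and a second loop uppercases the Gemini task fields so lowercase config values pass the strict validator.

```python
# Hypothetical sketch of the suggested normalization. Field names follow the
# diff; the exact implementation in the project may differ.
def sync_provider_backend(data: dict) -> dict:
    backend = data.get("backend")
    provider = data.get("provider")
    if backend is not None and provider is None:
        data["provider"] = backend
    for key in ("input_type",):  # existing behaviour: lowercase
        if isinstance(data.get(key), str):
            data[key] = data[key].lower()
    # Suggested addition: uppercase Gemini task fields so config values like
    # "retrieval_document" pass the uppercase-only validator.
    for key in ("task_type", "query_param", "document_param"):
        if isinstance(data.get(key), str):
            data[key] = data[key].upper()
    return data

print(sync_provider_backend({"backend": "gemini", "task_type": "retrieval_document"}))
# {'backend': 'gemini', 'provider': 'gemini', 'task_type': 'RETRIEVAL_DOCUMENT'}
```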
```
task_type: Optional[str] = None,
title: Optional[str] = None,
) -> EmbedResult:
if not text or not text.strip():
```
[Suggestion] (non-blocking)
Returning a zero vector for empty text is a reasonable fallback, but it happens silently. Consider logging a warning so callers know their empty input was handled specially:

```python
if not text or not text.strip():
    logger.warning("Empty text passed to embed(), returning zero vector")
    return EmbedResult(dense_vector=[0.0] * self._dimension)
```

This helps with debugging when unexpected empty strings appear in production.
…fig normalization, empty-text warning

- Fix "(multimodal)" annotation to "(text-only; multimodal planned)" to match supports_multimodal=False (B1)
- Run black formatter on test_gemini_embedder.py (B3)
- Add auto-uppercase normalization for task_type/query_param/document_param in sync_provider_backend so lowercase config values pass validation (NB1)
- Add logger.warning on empty text in embed() for production debuggability (NB2)
- Add test_gemini_task_type_case_insensitive test (NB1)
- Fix ruff import sorting in __init__.py
Thanks for the thorough review! All issues addressed in commit bcfd612:

- @qin-ptr — blocking fixes
- @qin-ptr — non-blocking (adopted both)
- @ZaynJarvis — #702 pattern + #718

Verification
1586f65 to 893a1da
@qin-ctx Can we merge? Main keeps changing now that tests are working.
…ngine#702 pattern

- GeminiDenseEmbedder: accept query_param/document_param, use is_query in embed() and embed_batch() to select task_type at call time
- EmbeddingConfig: add Gemini provider, factory, validation, dimension
- No get_query_embedder/get_document_embedder/_get_contextual_embedder (removed in volcengine#702; embed(is_query=True/False) is the pattern)
- Tests use embed(text, is_query=True/False) pattern throughout
- Rebased onto current upstream/main
98f7ae4 to fba32cf
@ZaynJarvis Fixed per #702:

pls fix workflow
```
"QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY. "
"For non-symmetric mode set query_param/document_param instead."
),
)
```
I don't think `task_type` is needed; `query_param` and `document_param` are supposed to contain the task_type.
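A hypothetical sketch of this point (names are illustrative, not the PR's actual classes): if `query_param` and `document_param` each carry a Gemini task type, the embedder can resolve the effective task type from `is_query` at call time, making a separate symmetric `task_type` field redundant.

```python
# Illustrative only: resolve the Gemini task type from query_param /
# document_param at call time, per the reviewer's suggestion.
from dataclasses import dataclass
from typing import Optional

@dataclass
class GeminiTaskRouter:
    query_param: str = "RETRIEVAL_QUERY"
    document_param: str = "RETRIEVAL_DOCUMENT"

    def resolve(self, is_query: bool, override: Optional[str] = None) -> str:
        # A per-call override still wins; otherwise pick by role.
        if override is not None:
            return override
        return self.query_param if is_query else self.document_param

router = GeminiTaskRouter()
print(router.resolve(is_query=True))   # RETRIEVAL_QUERY
print(router.resolve(is_query=False))  # RETRIEVAL_DOCUMENT
```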

Description
Phase 1 of #566: Adds `GeminiDenseEmbedder`, a production-ready, opt-in, text-only dense embedding provider backed by Google's `google-genai` SDK. Volcengine remains the default; Gemini is enabled via `provider: "gemini"` in `ov.conf`. No existing behaviour changes.

Phased Implementation
Key capabilities
- `supports_multimodal = False`; multimodal is Phase 2
- `get_query_embedder()` / `get_document_embedder()` route `query_param` / `document_param` to Gemini task types via factory (follows "feat(embedding): combine document embedder and query embedder to avoi…" #702 pattern)
- Per-call `task_type` + `title` — override at call time; all 8 Gemini task types
- `task_type`, `query_param`, `document_param` auto-uppercased in config normalization
- `HttpRetryOptions(attempts=3, exp_base=2, max_delay=30s)` with import guard for SDK < 0.8
- `async_embed_batch()` dispatches 100-text chunks in parallel via `anyio`

```mermaid
flowchart LR
    subgraph Config["ov.conf / EmbeddingModelConfig"]
        CFG["provider: gemini\napi_key: sk-…\ntask_type: RETRIEVAL_DOCUMENT\ndimension: 1536"]
    end
    subgraph Embedder["GeminiDenseEmbedder(DenseEmbedderBase)"]
        direction TB
        BC["_build_config(task_type?, title?)"]
        E["embed(text, task_type?, title?)"]
        EB["embed_batch(texts, titles?)"]
        AEB["async_embed_batch(texts)"]
        E --> BC
        EB --> BC
        AEB --> BC
    end
    subgraph SDK["google-genai SDK"]
        direction TB
        HTTP["HttpOptions\n(retry: 3× exp backoff)"]
        SYNC["models.embed_content()"]
        ASYNC["aio.models.embed_content()\n(semaphore-bounded)"]
        HTTP --> SYNC
        HTTP --> ASYNC
    end
    Config --> Embedder
    BC --> SYNC
    AEB --> ASYNC
```
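The chunked batch dispatch described in the capabilities can be sketched as follows. This is a dependency-free illustration using `asyncio.gather` and a fake per-chunk embedder; the PR itself uses `anyio` and the `google-genai` SDK, so treat the names here as stand-ins.

```python
# Illustrative sketch: split texts into 100-item chunks and embed the
# chunks concurrently, then flatten results back into one vector list.
import asyncio

CHUNK_SIZE = 100

def chunked(texts, size=CHUNK_SIZE):
    return [texts[i:i + size] for i in range(0, len(texts), size)]

async def fake_embed_chunk(chunk):
    await asyncio.sleep(0)         # stand-in for the real SDK call
    return [[0.0] for _ in chunk]  # one vector per input text

async def async_embed_batch(texts):
    results = await asyncio.gather(*(fake_embed_chunk(c) for c in chunked(texts)))
    return [vec for chunk_vecs in results for vec in chunk_vecs]

vectors = asyncio.run(async_embed_batch(["t"] * 250))
print(len(chunked(["t"] * 250)), len(vectors))  # 3 250
```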
Related Issue

Partial implementation of #566 (Phase 1: text-only dense embedding; Phase 2: multimodal — issue stays open)
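For reference, a sketch of what the Gemini block in `ov.conf` might look like, built from the fields shown in the PR's config diagram. The top-level `embedding:` key and YAML-style syntax are assumptions; `examples/ov.conf.example` in the PR is the authoritative version.

```yaml
# Illustrative only: field names taken from the PR description; the exact
# ov.conf syntax may differ.
embedding:
  provider: "gemini"
  api_key: "sk-…"                  # Google API key
  task_type: "RETRIEVAL_DOCUMENT"  # auto-uppercased during normalization
  dimension: 1536
```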
Type of Change
Changes Made
- `openviking/models/embedder/gemini_embedders.py`: `GeminiDenseEmbedder(DenseEmbedderBase)` with `_build_config()`, per-call task_type/title, SDK retry, `__repr__`, empty-text guard with warning log
- `openviking_cli/utils/config/embedding_config.py`: registers `"gemini"` provider; validates and auto-uppercases `task_type`/`query_param`/`document_param`; factory routes context for non-symmetric mode
- `openviking/models/embedder/__init__.py`: exports `GeminiDenseEmbedder`; annotation: "text-only; multimodal planned"
- `examples/ov.conf.example`: Gemini config snippet
- `pyproject.toml`: `google-genai>=0.8` dep; optional `gemini-async` extra for `anyio`
- `tests/unit/test_gemini_embedder.py`: 41 unit tests (mock-only, no API key needed)
- `tests/unit/test_embedding_config_gemini.py`: 18 config validation tests (incl. case-insensitive task_type)
- `tests/integration/`: 49 integration tests (auto-skip without `GOOGLE_API_KEY`)

Testing
- `test_gemini_embedder.py`
- `test_embedding_config_gemini.py`
- `test_gemini_e2e.py`
- `test_gemini_embedding_it.py`
- `test_gemini_openviking_it.py`

Checklist