feat: add support for langchain embedding #157
minimAluminiumalism wants to merge 7 commits into alibaba:main
…essor span support Automatically instrument BaseDocumentCompressor subclasses to emit rerank spans with OpenTelemetry semantic conventions. Uses __init_subclass__ hook to cover classes defined after instrumentation.
ca63521 to 7a72725
Pull request overview
Adds LangChain Embeddings instrumentation to loongsuite-instrumentation-langchain, emitting GenAI embedding spans via opentelemetry-util-genai per the GenAI semantic conventions.
Changes:
- Introduces an Embeddings patch that wraps `embed_*`/`aembed_*` on all current and future `langchain_core.embeddings.Embeddings` subclasses.
- Adds a comprehensive embedding span test suite (sync/async, error cases, dedup/proxy behavior, late subclassing, uninstrumentation).
- Documents embedding spans in the README and records the feature in the changelog.
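
The "current and future subclasses" behavior described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `Embeddings` here is a stand-in class rather than the real `langchain_core.embeddings.Embeddings`, `SPANS` is a plain list standing in for a span exporter, and the helper names (`_wrap`, `_patch_class`, `instrument`) are hypothetical.

```python
import functools


class Embeddings:  # stand-in for langchain_core.embeddings.Embeddings (assumption)
    def embed_query(self, text):
        raise NotImplementedError


SPANS = []  # stand-in for a real span exporter (assumption)


def _wrap(func):
    """Wrap one embed method so each call records a span-like entry."""
    if getattr(func, "_is_wrapped", False):
        return func  # already wrapped, avoid double wrapping

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        SPANS.append(f"{type(self).__name__}.{func.__name__}")
        return func(self, *args, **kwargs)

    wrapper._is_wrapped = True
    return wrapper


def _patch_class(cls):
    """Wrap only the embed methods defined directly on this class."""
    for name in ("embed_query", "embed_documents"):
        func = cls.__dict__.get(name)
        if callable(func):
            setattr(cls, name, _wrap(func))


def _all_subclasses(cls):
    for sub in cls.__subclasses__():
        yield sub
        yield from _all_subclasses(sub)


def instrument(base):
    # 1. Patch every subclass that already exists.
    for sub in _all_subclasses(base):
        _patch_class(sub)

    # 2. Install a hook so subclasses defined later are patched too.
    def hook(cls, **kwargs):
        super(base, cls).__init_subclass__(**kwargs)
        _patch_class(cls)

    base.__init_subclass__ = classmethod(hook)


class Early(Embeddings):  # defined before instrumentation
    def embed_query(self, text):
        return [0.0]


instrument(Embeddings)


class Late(Embeddings):  # defined after instrumentation
    def embed_query(self, text):
        return [1.0]


Early().embed_query("a")
Late().embed_query("b")
```

Both calls end up recorded in `SPANS`, even though `Late` did not exist when `instrument` ran.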
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| instrumentation-loongsuite/loongsuite-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/internal/patch_embedding.py | New patch module that wraps Embeddings methods and emits embedding spans through ExtendedTelemetryHandler. |
| instrumentation-loongsuite/loongsuite-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/__init__.py | Wires embedding instrumentation into the instrumentor’s instrument/uninstrument lifecycle. |
| instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/test_embedding_spans.py | New tests validating span creation, attributes, errors, deduplication, and uninstrument behavior for embeddings. |
| instrumentation-loongsuite/loongsuite-instrumentation-langchain/README.md | Adds “Embedding” span kind and its key attributes to the supported spans table. |
| instrumentation-loongsuite/loongsuite-instrumentation-langchain/CHANGELOG.md | Notes embedding span support in the Unreleased “Added” section. |
    # 2. Install an __init_subclass__ hook so future subclasses are
    # patched automatically.
    _original_init_subclass = Embeddings.__dict__.get("__init_subclass__")

    @classmethod  # type: ignore[misc]
instrument_embeddings() is not idempotent: if the instrumentor’s instrument() is called multiple times (there is an existing test that does this), _original_init_subclass gets overwritten with the already-patched hook. A subsequent uninstrument_embeddings() will then restore Embeddings.__init_subclass__ to the patched hook instead of the true original, meaning future subclasses can still be auto-patched after uninstrumentation. Consider guarding against re-instrumentation (e.g., a module-level “already_instrumented” flag / storing the original only once) and skipping re-installing the hook when it’s already patched by this module.
        """Extract a model name from an Embeddings instance (if available)."""
        for attr in ("model", "model_name", "model_id", "deployment_name"):
            val = getattr(instance, attr, None)
            if val and isinstance(val, str):
                return val
        return ""
_extract_embedding_model() returns an empty string when no model attribute is found, but EmbeddingInvocation.request_model is used for span naming and (when non-empty) the required gen_ai.request.model attribute. An empty string yields a span name like "embeddings " and omits the model attribute entirely. Consider falling back to a more informative value (e.g., the embeddings class name) to avoid producing an effectively unnamed model in spans.
Suggested change:

    -    """Extract a model name from an Embeddings instance (if available)."""
    -    for attr in ("model", "model_name", "model_id", "deployment_name"):
    -        val = getattr(instance, attr, None)
    -        if val and isinstance(val, str):
    -            return val
    -    return ""
    +    """Extract a model name from an Embeddings instance (if available).
    +
    +    If no explicit model attribute is found, fall back to the class name
    +    so that spans and attributes never use an empty model identifier.
    +    """
    +    for attr in ("model", "model_name", "model_id", "deployment_name"):
    +        val = getattr(instance, attr, None)
    +        if val and isinstance(val, str):
    +            return val
    +    # Fallback: use the embeddings class name as a best-effort model identifier
    +    cls = type(instance)
    +    return cls.__name__ or repr(cls)
Cirilla-zmh left a comment
Thanks for this great PR! Some comments should be resolved before it is merged.
    ### Added

    - Langchain embedding span support([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))
Suggested change:

    - - Langchain embedding span support([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))
    + - LangChain embedding span support
    +   ([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))
        return wrapper


    def _make_aembed_query_wrapper(
Why not reuse _make_aembed_documents_wrapper?
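
For context, the kind of reuse being asked about might look like the sketch below: a single async wrapper factory applied to both `aembed_query` and `aembed_documents`. The PR's actual wrapper signatures are not shown in this thread, so `make_async_embed_wrapper`, `FakeEmbeddings`, and the `SPANS` list are all hypothetical.

```python
import asyncio
import functools

SPANS = []  # stand-in for real span recording (assumption)


def make_async_embed_wrapper(original, operation):
    """Build one async wrapper usable for any aembed_* method."""

    @functools.wraps(original)
    async def wrapper(self, *args, **kwargs):
        record = {"operation": operation, "error": None}
        SPANS.append(record)
        try:
            return await original(self, *args, **kwargs)
        except Exception as exc:
            # Record the failure and re-raise so callers see the original error.
            record["error"] = type(exc).__name__
            raise

    return wrapper


class FakeEmbeddings:  # hypothetical embeddings implementation
    async def aembed_query(self, text):
        return [0.1]

    async def aembed_documents(self, texts):
        return [[0.1] for _ in texts]


# The same factory covers both async methods.
for _name in ("aembed_query", "aembed_documents"):
    setattr(
        FakeEmbeddings,
        _name,
        make_async_embed_wrapper(getattr(FakeEmbeddings, _name), _name),
    )
```

Whether this fits the PR depends on how much the query and documents wrappers differ in the attributes they record, which is not visible from this excerpt.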
I'll check these comments tomorrow :)
@minimAluminiumalism Hey! Thank you for your outstanding contributions over time. We are very pleased to invite you to become an approver for If it’s convenient for you, you’re also welcome to join our DingTalk group and contact “希铭” in the group. We look forward to connecting with you more closely.

Description
Implements #142.
Semantic conventions reference: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#embeddings
How Has This Been Tested?
instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/test_embedding_spans.py