Skip to content

Search: Include synonym ID in synonym rule values#2999

Open
reakaleek wants to merge 2 commits intomainfrom
just-zydeco
Open

Search: Include synonym ID in synonym rule values#2999
reakaleek wants to merge 2 commits intomainfrom
just-zydeco

Conversation

@reakaleek
Copy link
Copy Markdown
Member

What

Fix synonym rules to include the ID term in the synonym set values sent to Elasticsearch.

Why

The synonym dictionary uses the first term as the key (and rule ID) with remaining terms as values. When publishing, only the values were joined into the synonym string — omitting the key term. For example, a rule with ID ilm would only have index lifecycle management as its synonym, meaning searching for "ilm" wouldn't trigger the synonym expansion.

How

Prepend the synonym ID to the values array when building the comma-separated synonym string in PublishSynonymsAsync.

Test plan

  • Verify synonym rules in Elasticsearch include the ID term (e.g. "ilm, index lifecycle management" instead of just "index lifecycle management")
  • Search for abbreviation terms (like "ilm") and confirm they match their expanded forms

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3d621791-0032-45d7-8205-31b7b0073a31

📥 Commits

Reviewing files that changed from the base of the PR and between e97ff8d and 5cea709.

📒 Files selected for processing (9)
  • .superset/config.json
  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • tests/Elastic.Changelog.Tests/Changelogs/ChangelogTestBase.cs
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
✅ Files skipped from review due to trivial changes (6)
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • .superset/config.json
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
  • tests/Elastic.Changelog.Tests/Changelogs/ChangelogTestBase.cs
  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs

📝 Walkthrough

Walkthrough

Synonyms representation changed from a dictionary to a list of string arrays across the codebase: SearchConfiguration.Synonyms and the _synonyms field in ElasticsearchMarkdownExporter are now IReadOnlyList<string[]>. Synonym processing in the exporter was updated to filter and join entire synonym arrays for index-time synonyms and to build SynonymRule entries using the first element as the rule Id and the joined array as Synonyms. Tests and helpers updated to initialize Synonyms as an empty list literal. A new .superset/config.json file was added.

Sequence Diagram(s)

sequenceDiagram
    participant Config as "SearchConfiguration\n(synonyms: IReadOnlyList<string[]])"
    participant Exporter as "ElasticsearchMarkdownExporter\n(_synonyms: IReadOnlyList<string[]])"
    participant ES as "Elasticsearch\n(Synonym API)"

    Config->>Exporter: Provide Synonyms (list of string[] )
    Exporter->>Exporter: Filter entries where any element in fixedSynonyms
    Exporter->>Exporter: For each matching string[] -> join elements -> build indexTimeSynonyms
    Exporter->>Exporter: In PublishSynonymsAsync -> for each string[] s: Id = s[0], Synonyms = join(s)
    Exporter->>ES: Publish SynonymRule(Id, Synonyms) via API
    ES-->>Exporter: Acknowledge
Loading

Suggested labels

breaking

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: including the synonym ID in synonym rule values sent to Elasticsearch.
Description check ✅ Passed The description clearly explains what was fixed, why it matters, and how the fix works, all directly related to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch just-zydeco

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs`:
- Around line 241-242: The publish path (PublishSynonymsAsync) includes the
synonym key when building the synonym string but the constructor logic that
builds indexTimeSynonyms omits it, causing inconsistent synonym sets and hash
inputs; update the constructor/initialization that builds indexTimeSynonyms to
construct synonyms the same way as PublishSynonymsAsync (include the key as the
first token when joining the synonyms) so SynonymRule entries (Id and Synonyms)
are generated identically for both published and index-time paths—refer to
SynonymRule, the id variable and synonym.Value usage and make the join logic
consistent with the existing PublishSynonymsAsync pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e3fcc140-4fe2-4787-ace3-583446a903c8

📥 Commits

Reviewing files that changed from the base of the PR and between ce8705a and 971921e.

📒 Files selected for processing (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs

Copy link
Copy Markdown
Contributor

@cotti cotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CodeRabbit's suggestion makes sense, otherwise LGTM

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs`:
- Around line 127-130: The current mapping of searchDto.Synonyms into synonyms
loses the prior uniqueness enforcement on the first term, allowing duplicate
rule IDs (used as s[0] in ElasticsearchMarkdownExporter.Export or similar) to be
produced; update the transformation in SearchConfiguration (where
searchDto.Synonyms is processed) to enforce uniqueness on the first term — e.g.,
group by s.First()/s[0] and either deduplicate (take first occurrence) or throw
a validation exception when duplicates exist — so that the resulting synonyms
array contains only one entry per unique first term and prevents duplicate rule
IDs during export.
🪄 Autofix (Beta)

✅ Autofix completed


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ca3cf4ff-97bd-4ae9-ad57-4c5ad825e155

📥 Commits

Reviewing files that changed from the base of the PR and between 971921e and 72f38ff.

📒 Files selected for processing (9)
  • .superset/config.json
  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • tests/Elastic.Changelog.Tests/Changelogs/ChangelogTestBase.cs
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
✅ Files skipped from review due to trivial changes (6)
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • .superset/config.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 30, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 1 file(s) based on 1 unresolved review comment.

Files modified:

  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs

Commit: e97ff8d92db8aabb92c96289ce0426fd2c3921be

The changes have been pushed to the just-zydeco branch.

Time taken: 2m 20s

@cla-checker-service
Copy link
Copy Markdown

cla-checker-service bot commented Mar 30, 2026

💚 CLA has been signed

reakaleek and others added 2 commits March 30, 2026 18:25
The synonym dictionary uses the first term as the key and remaining
terms as values. When publishing to Elasticsearch, only the values
were included in the synonym string, omitting the key term itself.
This meant e.g. "ilm" was the rule ID but not part of the synonym
set, so it wouldn't match "index lifecycle management".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The dictionary structure used the first term as the key and Skip(1) for
values, which artificially separated one synonym from its group. This
caused the key term to be omitted from synonym rules sent to
Elasticsearch.

Replace Dictionary<string, string[]> with IReadOnlyList<string[]> so all
terms in a synonym group are treated equally. The first term is still
used as the ES rule ID for readability but is no longer excluded from
the synonym string.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants