feat(html): add Go template parser with structural analysis #47

Open
doITmagic wants to merge 5 commits into dev from feat/go-template-parser
@doITmagic
Owner

Description

Add semantic indexing for Go template syntax ({{ }}) in HTML, .tmpl, and .gohtml files.

Previously, ragcode indexed HTML files containing Go templates as plain text — it understood the HTML DOM structure (via goquery) but completely ignored Go template directives like {{ define }}, {{ template }}, {{ range }}, {{ if }}, etc. Additionally, .tmpl and .gohtml files were not recognized at all.

What's included

  • GoTemplateAnalyzer — regex-based parser (similar to BladeAnalyzer) that extracts 10 directive types:
    {{ define }}, {{ block }}, {{ template }}, {{ range }}, {{ if/else }}, {{ with }}, {{ .Variable }}, {{ funcName }}, {{/* comments */}}
  • Adapter — converts GoTemplate structs to parser.Symbol with:
    • RelDependency relations ({{ template "nav" }} → dependency to "nav")
    • Rich metadata: variables, custom_funcs, ranges, blocks, defines, includes
    • Per-define symbols (each {{ define "name" }} gets its own symbol)
  • HTML Analyzer integration — automatic dual-mode analysis:
    • Detects {{ in file content regardless of extension (.html, .tmpl, .gohtml)
    • Runs Go template analysis first, then HTML DOM analysis
    • Pure HTML files (no {{ }}) continue to work as before
  • Stack-based EndLine tracking for nested blocks (define → range → if → end)
  • 3 testdata files + 17 unit/integration tests all passing
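
The stack-based EndLine tracking mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's analyzer: the `block` type, the `reAction` pattern, and `trackBlocks` are hypothetical names, and the real code keeps separate directive types per kind.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// block is a hypothetical record for one opened directive; the PR's own
// types (DefineDirective, ConditionalDirective, ...) are not reproduced here.
type block struct {
	Kind      string
	StartLine int
	EndLine   int
}

// reAction matches the leading keyword of an action and tolerates the
// whitespace-trim marker "{{-". This pattern is an assumption.
var reAction = regexp.MustCompile(`\{\{-?\s*(define|block|range|if|with|end)\b`)

// trackBlocks scans src line by line, pushes every opening directive onto a
// stack, and fills in EndLine when the matching {{ end }} is reached.
func trackBlocks(src string) ([]block, error) {
	var blocks []block
	var stack []int // indexes into blocks for directives that are still open

	sc := bufio.NewScanner(strings.NewReader(src))
	for line := 1; sc.Scan(); line++ {
		for _, m := range reAction.FindAllStringSubmatch(sc.Text(), -1) {
			if m[1] != "end" {
				blocks = append(blocks, block{Kind: m[1], StartLine: line})
				stack = append(stack, len(blocks)-1)
				continue
			}
			if len(stack) == 0 {
				continue // unbalanced {{ end }}: skip it rather than fail
			}
			top := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			blocks[top].EndLine = line
		}
	}
	// Surface scanner errors so an oversized line does not truncate silently.
	return blocks, sc.Err()
}

func main() {
	src := "{{ define \"list\" }}\n{{ range .Items }}\n{{ if .Visible }}\n" +
		"<li>{{ .Name }}</li>\n{{ end }}\n{{ end }}\n{{ end }}"
	blocks, err := trackBlocks(src)
	if err != nil {
		panic(err)
	}
	for _, b := range blocks {
		fmt.Printf("%s: lines %d-%d\n", b.Kind, b.StartLine, b.EndLine)
	}
}
```

For the define → range → if nesting in `main`, this prints `define: lines 1-7`, `range: lines 2-6`, and `if: lines 3-5`. Because `else` is not in the pattern, an `{{ else if }}` branch does not open a new block in this sketch.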

Architecture decision

Implemented as a sub-package pkg/parser/html/gotemplate/ because:

  1. Go templates are an extension of HTML — they co-exist with HTML structure
  2. .tmpl/.gohtml files already contain HTML — both analyses are valuable
  3. Detection is content-based ({{ presence), not extension-based
  4. Follows the same regex pattern as BladeAnalyzer in pkg/parser/php/laravel/
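
Point 3 (content-based detection) might look something like the sketch below. The function names `isTemplateCandidate` and `hasGoTemplateSyntax` are hypothetical and do not come from the PR; only the three extensions and the `{{` check are taken from the description.

```go
package main

import (
	"bytes"
	"fmt"
	"path/filepath"
	"strings"
)

// isTemplateCandidate reports whether a file name has one of the three
// extensions the description names (.html, .tmpl, .gohtml).
func isTemplateCandidate(name string) bool {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".html", ".tmpl", ".gohtml":
		return true
	}
	return false
}

// hasGoTemplateSyntax is the content-based check: it looks only for the
// opening delimiter, so the file extension plays no part in the decision.
func hasGoTemplateSyntax(content []byte) bool {
	return bytes.Contains(content, []byte("{{"))
}

func main() {
	samples := []struct {
		name    string
		content string
	}{
		{"index.html", "<h1>Static page</h1>"},
		{"layout.html", "<title>{{ .Title }}</title>"},
		{"page.tmpl", "{{ template \"nav\" . }}"},
	}
	for _, s := range samples {
		if !isTemplateCandidate(s.name) {
			continue
		}
		fmt.Printf("%s: go template syntax = %v\n", s.name, hasGoTemplateSyntax([]byte(s.content)))
	}
}
```

A bare `{{` check is cheap but also matches other mustache-style syntaxes that happen to live in `.html` files, which may or may not matter for a given workspace.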

Files Changed

| File | Change |
|------|--------|
| pkg/parser/html/gotemplate/types.go | NEW: 7 type definitions (GoTemplate, DefineDirective, etc.) |
| pkg/parser/html/gotemplate/analyzer.go | NEW: GoTemplateAnalyzer with 10 regex extractors |
| pkg/parser/html/gotemplate/adapter.go | NEW: GoTemplate → parser.Symbol conversion with relations |
| pkg/parser/html/gotemplate/analyzer_test.go | NEW: 8 analyzer + 3 adapter unit tests |
| pkg/parser/html/gotemplate/testdata/ | NEW: layout.html, page.tmpl, partial.gohtml |
| pkg/parser/html/analyzer.go | MODIFIED: Integrate gotemplate + support .tmpl/.gohtml |
| pkg/parser/html/analyzer_test.go | MODIFIED: 3 integration tests (CanHandle, HTML+GoTpl, .tmpl) |

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist:

  • I have performed a self-review of my own code
  • I have formatted my code with go fmt ./...
  • I have run tests go test ./... and they pass
  • I have verified integration with Ollama/Qdrant (if applicable)
  • I have updated the documentation accordingly

Add GoTemplateAnalyzer that parses Go template syntax ({{ }}) in HTML,
.tmpl, and .gohtml files. Extracts directives (define, block, template,
range, if/else, with), variables, custom functions, and comments.

Key features:
- Regex-based parser similar to BladeAnalyzer
- Converts to parser.Symbol with RelDependency relations
  ({{ template "x" }} creates dependency to template x)
- Dual-mode analysis: Go template + HTML DOM for all file types
- Detects {{ }} syntax automatically regardless of extension
- Stack-based EndLine tracking for nested blocks
- Rich metadata: variables, custom_funcs, ranges, blocks
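
As an illustration of the dependency feature above, the sketch below pulls `{{ template "x" }}` calls out of a source string and turns each distinct name into a relation. The `relation` struct, the `"dependency"` kind string, and the regex are stand-ins chosen for this example; the project's real parser.Symbol and RelDependency definitions are not shown in this PR text.

```go
package main

import (
	"fmt"
	"regexp"
)

// relation is a hypothetical stand-in for the project's relation type.
type relation struct {
	Kind   string
	Target string
}

// reTemplateCall captures the quoted name in {{ template "name" ... }}.
// The pattern is an assumption, not the one used by the PR.
var reTemplateCall = regexp.MustCompile(`\{\{-?\s*template\s+"([^"]+)"`)

// templateDependencies returns one relation per distinct template name,
// in order of first appearance.
func templateDependencies(src string) []relation {
	seen := make(map[string]bool)
	var rels []relation
	for _, m := range reTemplateCall.FindAllStringSubmatch(src, -1) {
		name := m[1]
		if seen[name] {
			continue
		}
		seen[name] = true
		rels = append(rels, relation{Kind: "dependency", Target: name})
	}
	return rels
}

func main() {
	src := `<body>
{{ template "nav" . }}
{{ template "footer" }}
{{- template "nav" . -}}
</body>`
	for _, r := range templateDependencies(src) {
		fmt.Printf("%s -> %s\n", r.Kind, r.Target)
	}
}
```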

Files added:
- pkg/parser/html/gotemplate/types.go     - Type definitions
- pkg/parser/html/gotemplate/analyzer.go  - GoTemplateAnalyzer (regex)
- pkg/parser/html/gotemplate/adapter.go   - GoTemplate → Symbol conversion
- pkg/parser/html/gotemplate/testdata/    - Test fixtures

Files modified:
- pkg/parser/html/analyzer.go     - Integrate gotemplate + .tmpl/.gohtml
- pkg/parser/html/analyzer_test.go - Integration tests

17/17 tests passing
Copilot AI review requested due to automatic review settings March 18, 2026 21:02
@doITmagic doITmagic self-assigned this Mar 18, 2026
Copilot AI left a comment

Pull request overview

Adds semantic indexing for Go html/template directives embedded in HTML-like files so ragcode can extract template structure (defines/includes/blocks/etc.) instead of treating {{ ... }} as plain text.

Changes:

  • Introduces pkg/parser/html/gotemplate sub-package (types, regex analyzer, symbol adapter) + unit tests/testdata.
  • Extends the HTML analyzer to recognize .tmpl/.gohtml and to run Go-template analysis when {{ is present.
  • Adds integration tests covering Go-template-in-HTML and new extensions.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
|------|-------------|
| pkg/parser/html/gotemplate/types.go | Defines parsed directive/container types for Go template extraction. |
| pkg/parser/html/gotemplate/analyzer.go | Regex-based Go template directive extraction + block end-line tracking. |
| pkg/parser/html/gotemplate/adapter.go | Converts parsed templates into parser.Symbols + dependency relations/metadata. |
| pkg/parser/html/gotemplate/analyzer_test.go | Unit tests for directive extraction across sample templates. |
| pkg/parser/html/gotemplate/adapter_test.go | Unit tests for symbol conversion, metadata, and relations. |
| pkg/parser/html/gotemplate/testdata/layout.html | Sample template with define/include/block/range/if/with/comment. |
| pkg/parser/html/gotemplate/testdata/page.tmpl | Sample template with template include + define/range/custom funcs. |
| pkg/parser/html/gotemplate/testdata/partial.gohtml | Sample template with nested if/range and variables. |
| pkg/parser/html/analyzer.go | Adds .tmpl/.gohtml handling and dual Go-template + HTML DOM analysis. |
| pkg/parser/html/analyzer_test.go | Integration tests for Go template detection and new extensions. |

razvan added 3 commits March 19, 2026 17:56
…ce .ragcode dirs

The cleanWorkspaceData function was treating registry.json as a flat
map[string]interface{} and iterating over top-level keys (version,
entries, candidates) instead of extracting workspace root paths from
entries[].root.

Added extractWorkspaceRoots() that properly handles V2 (struct with
entries array), V1 (plain array), and legacy (flat map) registry formats.

Added comprehensive unit tests covering all registry formats, edge cases,
and an integration test for cleanWorkspaceData.
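
A rough sketch of how one function can accept all three registry shapes is shown below. Only the V2 layout (an object whose `entries[].root` holds the path) is described in the commit message; the V1 element shape and the legacy "keys are root paths" layout are assumptions made for this example, and `extractRoots` is a hypothetical name rather than the committed `extractWorkspaceRoots`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// entry mirrors the only field the commit message mentions: entries[].root.
type entry struct {
	Root string `json:"root"`
}

// roots collects the non-empty Root values from a list of entries.
func roots(entries []entry) []string {
	out := make([]string, 0, len(entries))
	for _, e := range entries {
		if e.Root != "" {
			out = append(out, e.Root)
		}
	}
	return out
}

// extractRoots tries each known registry shape in turn.
func extractRoots(data []byte) ([]string, error) {
	// V1 (assumed shape): a plain JSON array of entries.
	var v1 []entry
	if err := json.Unmarshal(data, &v1); err == nil {
		return roots(v1), nil
	}

	// Everything else must be a JSON object.
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, fmt.Errorf("unrecognized registry format: %w", err)
	}

	// V2: an object with an "entries" array next to keys such as "version".
	if rawEntries, ok := raw["entries"]; ok {
		var entries []entry
		if err := json.Unmarshal(rawEntries, &entries); err != nil {
			return nil, fmt.Errorf("invalid entries array: %w", err)
		}
		return roots(entries), nil
	}

	// Legacy (assumed shape): a flat map whose keys are workspace roots.
	out := make([]string, 0, len(raw))
	for key := range raw {
		out = append(out, key)
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out, nil
}

func main() {
	v2 := []byte(`{"version": 2, "entries": [{"root": "/work/a"}, {"root": "/work/b"}], "candidates": []}`)
	got, err := extractRoots(v2)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Checking for the `entries` key before falling back to the legacy branch is what keeps top-level keys such as `version` and `candidates` from being mistaken for workspace paths.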
- Add scanner.Err() check to prevent silent data truncation
- Tighten regex patterns (reBlock/reTemplate/reRange
Copilot AI review requested due to automatic review settings March 19, 2026 16:28
Copilot AI left a comment

Pull request overview

Adds semantic indexing support for Go template directives embedded in HTML-like files so ragcode can extract structural symbols/relations from {{ ... }} blocks (including .tmpl/.gohtml) in addition to existing HTML DOM chunking.

Changes:

  • Introduces pkg/parser/html/gotemplate (types + regex analyzer + symbol adapter) and testdata/tests.
  • Updates HTML analyzer to support .tmpl/.gohtml and to run Go-template analysis (content-based {{ detection) alongside DOM analysis.
  • Extends Go analyzer to record template file dependencies from template.ParseFiles / ParseGlob into relations + metadata.
  • Updates uninstall registry parsing to support multiple registry JSON formats + adds tests.
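
The Go analyzer change in the third bullet (collecting template files from template.ParseFiles / ParseGlob) could be approximated with go/ast as below. This is a sketch under stated assumptions: `templateFiles` is a hypothetical helper, it matches on the method name alone, and it only resolves string-literal arguments, so it is looser than whatever selector checks the PR performs.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// templateFiles parses Go source and returns the string-literal arguments
// passed to any call whose selector is ParseFiles or ParseGlob.
func templateFiles(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}

	var files []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || (sel.Sel.Name != "ParseFiles" && sel.Sel.Name != "ParseGlob") {
			return true
		}
		for _, arg := range call.Args {
			lit, ok := arg.(*ast.BasicLit)
			if !ok || lit.Kind != token.STRING {
				continue // non-literal arguments cannot be resolved statically
			}
			if path, err := strconv.Unquote(lit.Value); err == nil {
				files = append(files, path)
			}
		}
		return true
	})
	return files, nil
}

func main() {
	src := `package web

import "html/template"

var page = template.Must(template.ParseFiles("layout.html", "page.tmpl"))
var all = template.Must(template.ParseGlob("views/*.gohtml"))
`
	files, err := templateFiles(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```

Each returned path would then become a dependency relation and a metadata entry on the enclosing function, which is what the FunctionInfo.TemplateFiles field in the file summary appears to be for.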

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
|------|-------------|
| pkg/parser/html/gotemplate/types.go | Defines directive/result structs for Go template analysis. |
| pkg/parser/html/gotemplate/analyzer.go | Regex-based extractor with stack-based block tracking. |
| pkg/parser/html/gotemplate/adapter.go | Converts extracted directives into parser.Symbol + relations/metadata. |
| pkg/parser/html/gotemplate/analyzer_test.go | Unit tests for directive extraction (incl. else-if fixture). |
| pkg/parser/html/gotemplate/adapter_test.go | Unit tests for symbol conversion/relations/metadata. |
| pkg/parser/html/gotemplate/testdata/layout.html | Test fixture: define/block/range/if/with/include/etc. |
| pkg/parser/html/gotemplate/testdata/page.tmpl | Test fixture: include + define + range + funcs. |
| pkg/parser/html/gotemplate/testdata/partial.gohtml | Test fixture: if + range nesting. |
| pkg/parser/html/gotemplate/testdata/elseif.tmpl | Test fixture: else-if chain. |
| pkg/parser/html/analyzer.go | Adds .tmpl/.gohtml handling + dual GoTpl/HTML analysis. |
| pkg/parser/html/analyzer_test.go | Integration tests for .tmpl/.gohtml + mixed HTML/GoTpl. |
| pkg/parser/go/types.go | Adds FunctionInfo.TemplateFiles. |
| pkg/parser/go/analyzer.go | Extracts template file args from ParseFiles/ParseGlob; emits deps + metadata. |
| pkg/parser/go/relations_test.go | Tests template file dependency relations + metadata. |
| internal/uninstall/uninstall.go | Adds extractWorkspaceRoots supporting V2/V1/legacy registry formats. |
| internal/uninstall/uninstall_test.go | Tests for new registry parsing + cleanup behavior. |

Comment on lines +151 to +156

    // Push a new conditional for the else-if branch
    tpl.Conditionals = append(tpl.Conditionals, ConditionalDirective{
        Condition: strings.TrimSpace(m[1]),
        Line:      lineNum,
    })
    stack = append(stack, openBlock{kind: "if", idx: len(tpl.Conditionals) - 1})

Comment on lines +56 to 79

    // Detect Go template syntax and run GoTemplate analysis
    info, err := os.Stat(path)
    if err != nil {
        return nil, err
    }

    if !info.IsDir() {
        // Single file: check for Go template syntax
        symbols = append(symbols, a.analyzeGoTemplates(path)...)
    } else {
        // Directory: walk and check each HTML file for Go template syntax
        _ = filepath.WalkDir(path, func(fp string, d fs.DirEntry, err error) error {
            if err != nil || d.IsDir() {
                return nil
            }
            if a.ca.isHTMLFile(d.Name()) {
                symbols = append(symbols, a.analyzeGoTemplates(fp)...)
            }
            return nil
        })
    }

    // Always run HTML DOM analysis too (Go templates contain HTML)
    chunks, err := a.ca.AnalyzePaths([]string{path})

Comment on lines +66 to +76

        // Directory: walk and check each HTML file for Go template syntax
        _ = filepath.WalkDir(path, func(fp string, d fs.DirEntry, err error) error {
            if err != nil || d.IsDir() {
                return nil
            }
            if a.ca.isHTMLFile(d.Name()) {
                symbols = append(symbols, a.analyzeGoTemplates(fp)...)
            }
            return nil
        })
    }

    ast.Inspect(body, func(n ast.Node) bool {
        if call, ok := n.(*ast.CallExpr); ok {
            var name string
            var sel string // selector part (e.g. "template" in template.ParseFiles)


2 participants