Fix source-map emit panic for unclosed blocks #3923
Clamp the position in getLineAndCharacter to the text length to prevent a slice bounds out of range panic when the parser produces positions beyond the source text (e.g., for unclosed blocks).

Fixes #1281

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/4965050f-5ceb-40df-bcd2-6c97877ebf70
Co-authored-by: jakebailey <5341706+jakebailey@users.noreply.github.com>
func (c *lineCharacterCache) getLineAndCharacter(pos int) (line int, character core.UTF16Offset) {
	// Clamp pos to the text length to avoid panics for positions beyond
	// the source text (e.g. synthesized end positions for unclosed blocks).
	if pos > len(c.text) {
		pos = len(c.text)
	}
@copilot in general we don't want to add clamping like this everywhere; are you sure this isn't a bug somewhere else?
You're right — clamping in getLineAndCharacter would mask bugs. The root issue is that for error-recovery nodes (like unclosed blocks), the parser produces end positions beyond the source text. TypeScript's computeLineAndCharacterOfPosition handles this implicitly because it uses pure arithmetic (position - lineStarts[lineNumber]), but our Go version uses string slicing for UTF-16 computation, which panics.
Moved the guard to emitPos (2517294) — positions beyond the source text can't meaningfully be mapped, so we skip them. This is consistent with emitPos's existing guards for synthesized positions, disabled source maps, etc.
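For illustration, the guard-in-emitPos idea described above might look roughly like the following. This is only a sketch: sourceMapper, its fields, and the pos < 0 check standing in for "synthesized positions" are hypothetical simplifications, not the actual printer code in typescript-go.

```go
package main

import "fmt"

// sourceMapper is a hypothetical, heavily simplified stand-in for the
// printer state that emitPos works with.
type sourceMapper struct {
	text     string // source text being mapped
	enabled  bool   // whether source maps are enabled
	recorded []int  // positions for which a mapping was emitted
}

// emitPos records a mapping for pos unless one of the guards rejects it.
func (m *sourceMapper) emitPos(pos int) {
	// Assumed existing guards: disabled source maps, synthesized positions.
	if !m.enabled || pos < 0 {
		return
	}
	// The guard under discussion: positions past the source text are skipped.
	if pos > len(m.text) {
		return
	}
	m.recorded = append(m.recorded, pos)
}

func main() {
	m := &sourceMapper{text: "{\n", enabled: true}
	m.emitPos(0)               // in range: recorded
	m.emitPos(len(m.text) + 1) // past EOF: skipped
	fmt.Println(m.recorded)
}
```

Note the trade-off this shape implies: the past-EOF mapping is dropped entirely rather than emitted.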
Instead of clamping in getLineAndCharacter, skip emitting source mappings in emitPos when the position exceeds the source text length. This is consistent with emitPos's existing guards (synthesized positions, disabled source maps, etc.) and avoids masking bugs elsewhere.

For error-recovery nodes like unclosed blocks, the parser produces end positions beyond the source text. TypeScript's equivalent code uses arithmetic (position - lineStarts[lineNumber]) which handles this implicitly, but Go's string-slicing approach for UTF-16 computation would panic.

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/cfe89aba-474b-4ce6-a914-fd1fddc59c71
Co-authored-by: jakebailey <5341706+jakebailey@users.noreply.github.com>
		return
	}

	if pos > len(p.sourceMapSource.Text()) {
Traced the full flow: emitBlock calls emitTokenEx(CloseBraceToken, node.Statements.End()=EOF, ...). writeTokenText writes } and returns pos + len("}") = EOF + 1. This past-EOF position flows through exitToken → emitSourceMapsAfterToken → emitPos.
TypeScript has the identical flow — emitTokenWithSourceMap also passes pos + len("}") to emitSourcePos. It doesn't crash because computeLineAndCharacterOfPosition uses arithmetic (position - lineStarts[lineNumber]).
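A small standalone sketch of the difference, assuming the file contents produced by echo '{' > bug.ts (that is, "{\n"). The helper names sliceLen and arithmeticLen are made up for this illustration and do not exist in either codebase.

```go
package main

import "fmt"

// sliceLen mimics the slicing approach: it evaluates text[lineStart:pos]
// and reports whether the slice expression panicked.
func sliceLen(text string, lineStart, pos int) (n int, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return len(text[lineStart:pos]), false
}

// arithmeticLen mimics position - lineStarts[lineNumber]: it never indexes
// into the text, so any pos is accepted.
func arithmeticLen(lineStart, pos int) int {
	return pos - lineStart
}

func main() {
	text := "{\n"
	eof := len(text)      // 2
	pos := eof + len("}") // 3, one past the end of the text
	lineStart := 2        // start of the (empty) last line

	_, panicked := sliceLen(text, lineStart, pos)
	fmt.Println("slicing panicked:", panicked)                       // true
	fmt.Println("arithmetic result:", arithmeticLen(lineStart, pos)) // 1
}
```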
Moved the fix to getLineAndCharacter (8add68b) — the Go equivalent of where TypeScript implicitly handles it. When pos > len(text), it computes UTF-16 length up to EOF and adds the remaining offset arithmetically, matching TypeScript's behavior.
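A minimal sketch of that approach, not the code from 8add68b: characterOffset and utf16Len are hypothetical helpers, and the sketch assumes lineStart never exceeds len(text).

```go
package main

import "fmt"

// utf16Len returns the number of UTF-16 code units needed to encode s.
func utf16Len(s string) int {
	n := 0
	for _, r := range s {
		if r >= 0x10000 {
			n += 2 // encoded as a surrogate pair
		} else {
			n++
		}
	}
	return n
}

// characterOffset is a hypothetical stand-in for the character half of
// getLineAndCharacter: the UTF-16 offset of pos relative to lineStart.
// Assumes 0 <= lineStart <= len(text) and lineStart <= pos.
func characterOffset(text string, lineStart, pos int) int {
	if pos <= len(text) {
		return utf16Len(text[lineStart:pos])
	}
	// Past EOF: measure the real text up to EOF, then add the overshoot
	// arithmetically instead of slicing out of bounds.
	return utf16Len(text[lineStart:]) + (pos - len(text))
}

func main() {
	fmt.Println(characterOffset("{\n", 2, 3))          // 1
	fmt.Println(characterOffset("a\U0001F600", 0, 5)) // 3
	fmt.Println(characterOffset("a\U0001F600", 0, 6)) // 4
}
```

For in-range positions the result is unchanged; only the past-EOF branch is new, so the mapping is still emitted with a character position past the line end.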
…emitPos

The root cause: writeTokenText returns pos + len(tokenString), which goes past EOF for error-recovery tokens (e.g., missing closing braces). This is expected behavior: TypeScript has the identical flow.

TypeScript handles it implicitly because computeLineAndCharacterOfPosition uses arithmetic (position - lineStarts[lineNumber]). The Go port's getLineAndCharacter uses string slicing for UTF-16 computation, which panics on out-of-bounds.

Fix getLineAndCharacter to compute UTF-16 character offset up to EOF, then add the remaining byte offset arithmetically. This matches TypeScript's behavior exactly: the source map entry is emitted (not skipped) with a character position past the line end.

Agent-Logs-Url: https://github.com/microsoft/typescript-go/sessions/671ae780-244e-4f81-819a-9b54bb2981a0
Co-authored-by: jakebailey <5341706+jakebailey@users.noreply.github.com>
tsgo --sourceMap panics with a slice bounds error on files with unclosed blocks (e.g. echo '{' > bug.ts). The parser produces an end position beyond the source text length, and getLineAndCharacter slices into the text without bounds checking.

TypeScript's equivalent computeLineAndCharacterOfPosition uses pure arithmetic (position - lineStarts[lineNumber]) so it handles out-of-bounds positions implicitly. The Go port's getLineAndCharacter uses string slicing for UTF-16 computation, which panics on out-of-bounds access.

Fix getLineAndCharacter by computing UTF-16 length up to EOF and adding the remaining byte offset arithmetically, matching TypeScript's implicit arithmetic behavior
Test case: sourceMapUnclosedBlock.ts

Analysis

For error-recovery tokens like missing closing braces, emitBlock calls emitTokenEx(CloseBraceToken, node.Statements.End(), ...) where pos equals the EOF position. writeTokenText writes } and returns pos + len("}") = EOF + 1, which is past the source text. This past-EOF position flows through exitToken → emitSourceMapsAfterToken → emitSourcePos → emitPos → getLineAndCharacter, where the string slice text[lineStart:pos] panics.

TypeScript has the identical flow: emitTokenWithSourceMap also passes pos + len("}") to emitSourcePos. It doesn't crash because computeLineAndCharacterOfPosition uses arithmetic (position - lineStarts[lineNumber]) which implicitly handles out-of-bounds positions.

Fix

Handle past-EOF positions in getLineAndCharacter, the Go equivalent of where TypeScript implicitly handles them. When pos > len(text), compute the UTF-16 character offset up to EOF using string slicing, then add the remaining byte offset arithmetically. This matches TypeScript's behavior: the source map entry is emitted (not skipped) with a character position past the line end.

Copilot Checklist
I successfully ran these commands at the end of my session, and they completed without error: