[RFC] Split markdown rendering to a separate package. #647
ditman wants to merge 21 commits into google:main
Conversation
jacobsimionato left a comment
Hey thanks for iterating - this seems a lot lighter! It looks like angular support is gone, but ideally it will be possible to use this same shared interface/package for Angular?
/gemini review
Code Review
This pull request effectively refactors the markdown rendering logic into a new shared package, @a2ui/markdown-it-shared. This is a positive step towards better code organization and reusability. The dependency injection approach for providing the markdown renderer in both Angular and Lit frameworks is well-executed.
However, I have identified a critical security vulnerability due to an incomplete sanitizer implementation in the new shared package. Additionally, there are a couple of bugs in the new package: one in the HTML highlighting logic and another in a build script, which could break the publishing process. My review includes specific comments and suggestions to address these issues.
case 'html': {
  const iframe = document.createElement('iframe');
  iframe.classList.add('html-view');
  iframe.srcdoc = str;
  iframe.sandbox.add('');
  return iframe.innerHTML;
}
The highlight function for HTML code blocks is using iframe.innerHTML, which will return an empty string. This is likely not the intended behavior. If the goal is to display a sandboxed preview of the HTML, you should probably be returning the iframe.outerHTML.
Additionally, using document.createElement will cause this code to fail in non-browser environments (e.g., during server-side rendering or in tests running in Node.js). Consider adding a check for the environment.
- case 'html': {
-   const iframe = document.createElement('iframe');
-   iframe.classList.add('html-view');
-   iframe.srcdoc = str;
-   iframe.sandbox.add('');
-   return iframe.innerHTML;
- }
+ case 'html': {
+   if (typeof document === 'undefined') {
+     return str; // Or escape it
+   }
+   const iframe = document.createElement('iframe');
+   iframe.classList.add('html-view');
+   iframe.srcdoc = str;
+   iframe.sandbox.add('');
+   return iframe.outerHTML;
+ }
This is in fact logic that probably should disappear. I don't think that an html tagged highlight is a good signal to attempt to render an iframe :/
…markdown renderer function.
…rom the Text widget.
8df05c1 to
b70a5eb
/gemini review
Code Review
This pull request is a great step towards improving the maintainability and consistency of markdown rendering by centralizing the logic into a new @a2ui/markdown-it-shared package. The use of dependency injection for the renderer in both Angular and Lit frameworks is well-executed. I've identified a couple of critical issues in the new shared markdown renderer that could lead to incorrect rendering and potential security vulnerabilities. Additionally, there are a few medium-severity suggestions to improve code clarity and adhere to best practices for library development. Overall, this is a valuable refactoring, and addressing these points will make it even more robust.
private markdownIt = markdownit({
  highlight: (str, lang) => {
    switch (lang) {
      case 'html': {
        const iframe = document.createElement('iframe');
        iframe.classList.add('html-view');
        iframe.srcdoc = str;
        iframe.sandbox.add('');
        return iframe.innerHTML;
      }

      default:
        return sanitize(str);
    }
  },
});
The custom highlight function presents two significant issues:
- Buggy HTML block rendering: The case 'html' is flawed. iframe.innerHTML will return an empty string because the iframe is not part of the DOM, and innerHTML on an iframe refers to its fallback content. This will result in empty code blocks for any markdown code block fenced with html.
- Incorrect sanitization: The default case uses sanitize(str). For code blocks, the content should be HTML-escaped to be displayed as literal code, not sanitized. Sanitizing may strip valid code that resembles HTML tags (e.g., <script>), which is not the desired behavior for a code block.

The safest and most correct approach is to remove the custom highlight function entirely. This will restore markdown-it's default behavior, which correctly escapes content within code blocks.
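The escape-versus-sanitize distinction in the second point can be illustrated with a small sketch. Both helpers are hypothetical and are not part of the PR: escapeHtml approximates what an escaping code-block renderer does, and naiveStripTags is a deliberately crude stand-in for a sanitizer (a real one such as DOMPurify is far more careful).

```typescript
// Hypothetical helper: escape HTML-significant characters so that code is
// displayed as literal text instead of being interpreted as markup.
function escapeHtml(source: string): string {
  const replacements: Record<string, string> = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return source.replace(/[&<>"']/g, (ch) => replacements[ch] ?? ch);
}

// Hypothetical stand-in for a sanitizer: removes anything shaped like a tag.
// It only models the "drops markup" behavior discussed in the review.
function naiveStripTags(source: string): string {
  return source.replace(/<[^>]*>/g, "");
}

const snippet = '<script>alert("hi")</script>';

// Escaping keeps every character of the snippet visible to the reader.
const escapedSnippet = escapeHtml(snippet);

// Stripping loses the tags, so the rendered code block no longer matches
// what the author wrote.
const strippedSnippet = naiveStripTags(snippet);
```

Under these assumptions, the escaped output still contains the script tags as text, while the stripped output keeps only the call between them, which is the behavior the review argues against for code blocks.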
private markdownIt = markdownit();

private registerTagClassMapRules() {
  const rulesToProxy = [
    'paragraph_open',
    'heading_open',
    'bullet_list_open',
    'ordered_list_open',
    'list_item_open',
    'link_open',
    'strong_open',
    'em_open',
  ];

  for (const ruleName of rulesToProxy) {
    this.markdownIt.renderer.rules[ruleName] = (tokens, idx, options, env, self) => {
      const token = tokens[idx];
      const tagClassMap = env?.tagClassMap as Types.MarkdownRendererTagClassMap | undefined;

      if (tagClassMap) {
        const tokenClasses = tagClassMap[token.tag] ?? [];
        for (const clazz of tokenClasses) {
          token.attrJoin('class', clazz);
        }
      }

      return self.renderToken(tokens, idx, options);
    };
  }
}
The current implementation of registerTagClassMapRules replaces markdown-it's default rendering rules (like link_open) instead of wrapping them. This is a significant issue as it leads to the loss of important default functionality and introduces a security vulnerability.
For example, the default link_open rule normalizes URLs (a security measure to prevent javascript: URLs). By replacing the rule entirely with a call to self.renderToken, this crucial sanitization logic is lost.
The correct approach is to store a reference to the original rule, add the custom class attributes to the token, and then invoke the original rule to preserve its behavior.
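The wrap-rather-than-replace pattern described above can be exercised in isolation. The sketch below uses simplified, hypothetical types (TokenLike, Rule, RuleTable) instead of markdown-it's real ones, and renderDefault plays the role of self.renderToken; it makes no claim about which rules markdown-it registers by default.

```typescript
// Hypothetical, simplified shapes; markdown-it's real Token and rule types
// carry more fields and arguments.
interface TokenLike {
  tag: string;
  classes: string[];
}
type Rule = (tokens: TokenLike[], idx: number) => string;
type RuleTable = Record<string, Rule | undefined>;

// Fallback renderer used when no rule was registered for a name.
const renderDefault: Rule = (tokens, idx) => {
  const token = tokens[idx];
  if (!token) return "";
  const cls = token.classes.length > 0 ? ` class="${token.classes.join(" ")}"` : "";
  return `<${token.tag}${cls}>`;
};

// Capture any existing rule, add classes, then delegate to it so its
// behavior is preserved. Fall back to renderDefault only if none existed.
function wrapRule(rules: RuleTable, name: string, tagClassMap: Record<string, string[]>): void {
  const original = rules[name];
  rules[name] = (tokens, idx) => {
    const token = tokens[idx];
    if (token) token.classes.push(...(tagClassMap[token.tag] ?? []));
    return (original ?? renderDefault)(tokens, idx);
  };
}

// Usage: a pre-existing link rule that adds rel="noopener" keeps working
// after wrapping, and a name with no rule falls back to renderDefault.
const demoRules: RuleTable = {
  link_open: (tokens, idx) => renderDefault(tokens, idx).replace(">", ' rel="noopener">'),
};
wrapRule(demoRules, "link_open", { a: ["link"] });
wrapRule(demoRules, "paragraph_open", { p: ["para"] });

const demoLink = demoRules["link_open"]?.([{ tag: "a", classes: [] }], 0);
const demoParagraph = demoRules["paragraph_open"]?.([{ tag: "p", classes: [] }], 0);
```

The fallback branch matters because a rule name may have no registered renderer at all, in which case wrapping and replacing behave identically.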
private registerTagClassMapRules() {
  const rulesToProxy = [
    'paragraph_open',
    'heading_open',
    'bullet_list_open',
    'ordered_list_open',
    'list_item_open',
    'link_open',
    'strong_open',
    'em_open',
  ];
  for (const ruleName of rulesToProxy) {
    const originalRule = this.markdownIt.renderer.rules[ruleName];
    this.markdownIt.renderer.rules[ruleName] = (tokens, idx, options, env, self) => {
      const token = tokens[idx];
      const tagClassMap = env?.tagClassMap as Types.MarkdownRendererTagClassMap | undefined;
      if (tagClassMap) {
        const tokenClasses = tagClassMap[token.tag] ?? [];
        for (const clazz of tokenClasses) {
          token.attrJoin('class', clazz);
        }
      }
      if (originalRule) {
        return originalRule(tokens, idx, options, env, self);
      }
      return self.renderToken(tokens, idx, options);
    };
  }
}

/**
 * The markdown renderer context.
 */
export const markdown = createContext<Types.MarkdownRenderer | undefined>("A2UIMarkdown");
For library code, it's a best practice to use a Symbol for the context key instead of a string. This prevents potential naming collisions with other contexts from other libraries that might be used in the same application. This same feedback applies to the theme context in theme.ts.
- export const markdown = createContext<Types.MarkdownRenderer | undefined>("A2UIMarkdown");
+ export const markdown = createContext<Types.MarkdownRenderer | undefined>(Symbol.for("A2UIMarkdown"));
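How the three kinds of key compare can be checked with a short sketch. It assumes that the context library matches keys by strict equality; sameKey is a hypothetical helper that models that comparison and is not an API of any library.

```typescript
// Hypothetical model of key matching, assuming strict equality is used.
function sameKey(a: unknown, b: unknown): boolean {
  return a === b;
}

// Two libraries that independently choose the same string get the same key.
const stringKeysMatch = sameKey("A2UIMarkdown", "A2UIMarkdown");

// Symbol.for reads from the global symbol registry, so the same name
// yields the same symbol everywhere in the realm.
const registrySymbolsMatch = sameKey(Symbol.for("A2UIMarkdown"), Symbol.for("A2UIMarkdown"));

// Symbol() creates a fresh symbol on every call, even with the same
// description, so it has to be shared by exporting the constant itself.
const uniqueSymbolsMatch = sameKey(Symbol("A2UIMarkdown"), Symbol("A2UIMarkdown"));

// A registry symbol never equals a plain string with the same text.
const symbolMatchesString = sameKey(Symbol.for("A2UIMarkdown"), "A2UIMarkdown");
```

Under that assumption, Symbol.for avoids clashes with string keys but can still match another Symbol.for call that uses the same name, whereas a plain Symbol() cannot be matched by accident.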
Description
This PR separates the markdown-it dependency from the lit and angular renderers, by allowing users to inject their own markdown renderer.

Warning
BREAKING CHANGE: By default, incoming markdown is now rendered as a pre element, and users must inject a markdown renderer of their choosing for markdown to be rendered (see below).

In order to keep the "batteries" somewhat included in the SDK, a new package is introduced:
- @a2ui/markdown-it-shared: the pre-configured markdown-it instance for all web renderers. This allows us to have a single configured markdown renderer with markdown-it and dompurify that can be reused across all web packages. This is just a markdown string -> html string converter, but if we want to add plugins to markdown-it later, all packages get the new output at once.

The lit and angular restaurant samples are updated to inject the new markdown renderer.
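The injection behavior described here can be sketched roughly as follows. Every name in the block is hypothetical: the MarkdownRenderer signature is only inferred from the "markdown string -> html string" wording, and the PR's actual Types.MarkdownRenderer and widget code may differ.

```typescript
// Hypothetical signature, inferred from the description above.
type MarkdownRenderer = (markdown: string) => string;

// What a text widget would hand to its template: rendered HTML when a
// renderer was injected, otherwise the raw text to place in a pre element.
type RenderedText =
  | { kind: "html"; html: string }
  | { kind: "pre"; text: string };

function renderText(markdown: string, renderer?: MarkdownRenderer): RenderedText {
  if (!renderer) {
    // No renderer injected: keep the markdown source untouched.
    return { kind: "pre", text: markdown };
  }
  return { kind: "html", html: renderer(markdown) };
}

// Usage with a toy renderer that stands in for a real markdown library.
const toyRenderer: MarkdownRenderer = (md) => `<p>${md}</p>`;

const withoutRenderer = renderText("**hello**");
const withRenderer = renderText("hello", toyRenderer);
```

Returning the raw text for the pre case, instead of building an HTML string, leaves escaping to the framework's template binding.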