Fix double-escaping in parseTypesMarkdown#1462
Conversation
Reorder `decodeHtmlEntities` so `&` is replaced last, preventing inputs like `&lt;` from being decoded twice into `<` instead of `<`.
Deploy previewhttps://deploy-preview-1462--mui-internal.netlify.app/ Bundle sizeTotal Size Change: 0B(0.00%) - Total Gzip Change: 0B(0.00%) Show details for 29 more bundles@mui/internal-docs-infra/abstractCreateDemo parsed: 0B(0.00%) gzip: 0B(0.00%) PerformanceTotal duration: 16.93 ms -0.59 ms(-3.4%) | Renders: 4 (+0) | Paint: 72.40 ms -3.04 ms(-4.0%) No significant changes — details Check out the code infra dashboard for more information about this PR. |
There was a problem hiding this comment.
Pull request overview
This PR adjusts HTML entity decoding in the docs-infra parseTypesMarkdown pipeline to prevent double-unescaping (e.g. avoiding &lt; becoming <), addressing the CodeQL js/double-escaping alert.
Changes:
- Reordered
decodeHtmlEntities()replacements so&is decoded after other entities, preventing double-unescaping. - Added a regression test ensuring inline-code types like
`&lt;T&gt;`decode to<T>(not<T>).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/docs-infra/src/pipeline/loadServerTypesText/parseTypesMarkdown.ts | Reorders HTML entity decoding to avoid double-unescaping. |
| packages/docs-infra/src/pipeline/loadServerTypesText/parseTypesMarkdown.test.ts | Adds a regression test for inline-code entity decoding. |
| it('should not double-unescape HTML entities in inline code', async () => { | ||
| // `&lt;` must decode to `<` (single unescape), not `<`. | ||
| const markdown = `### Button | ||
|
|
||
| **Button Props:** | ||
|
|
||
| | Prop | Type | Default | Description | | ||
| | :--- | :--- | :--- | :--- | | ||
| | name | \`&lt;T&gt;\` | - | - | | ||
| `; | ||
|
|
||
| const result = await parseTypesMarkdown(markdown); | ||
| const componentMeta = (result.exports.Button.type as { data: ComponentTypeMeta }).data; | ||
| expect(componentMeta.props.name.typeText).toBe('<T>'); | ||
| }); |
There was a problem hiding this comment.
@copilot apply changes based on this feedback
There was a problem hiding this comment.
Added a second test case in commit 7e25cb0 that asserts &amp;lt;T&amp;gt; decodes to &lt;T&gt; (only one level of & unescaped), covering the double-encoded pattern mentioned in the feedback.
…rkdown Agent-Logs-Url: https://github.com/mui/mui-public/sessions/6a32e19a-b96a-4d2a-b0b9-14456aa287c1 Co-authored-by: Janpot <2109932+Janpot@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Summary
Reorder
decodeHtmlEntitiesinparseTypesMarkdownso&is unescaped last. The previous order meant inputs like&lt;would be unescaped to<and then to<, instead of the correct<. Fixes CodeQL alert #65 (js/double-escaping).Ideally markdown isn't used as a data interchange format