Potential fix for code scanning alert no. 42: DOM text reinterpreted as HTML#4026
Potential fix for code scanning alert no. 42: DOM text reinterpreted as HTML#4026
Conversation
…as HTML Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughA utility function in the Flesch reading ease calculator is refactored to replace DOM-based HTML stripping with regex pattern matching and manual HTML entity decoding, eliminating DOM dependency while normalizing whitespace for text processing. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@phpmyfaq/admin/assets/src/utils/flesch-reading-ease.ts`:
- Around line 58-64: The replace order causes double-unescaping (e.g.,
"&lt;" → "<"); in the flesch-reading-ease code update the HTML-entity
decoding so the "&" replacement runs last: locate the block that mutates the
variable text (the series of text = text.replace(... ) lines) and reorder them
so , <, >, ", ' (and any other specific entity replaces)
happen before the & replacement, leaving the text = text.replace(/&/gi,
'&'); as the final replacement.
…caping Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Potential fix for https://github.com/thorsten/phpMyFAQ/security/code-scanning/42
In general, to fix this kind of issue you should avoid interpreting untrusted text as HTML. Instead of using
DOMParser(..., 'text/html')or injecting intoinnerHTMLto remove tags, use a text-based approach (e.g., regular expressions or a well-known sanitizer library) that treats the string as plain text and strips or escapes HTML markup without parsing it as DOM.For this codebase, the simplest and safest fix is to change
stripHtmlinphpmyfaq/admin/assets/src/utils/flesch-reading-ease.tsso it no longer usesDOMParser. Instead, we can remove angle-bracket tag structures via a regular expression and decode a small set of HTML entities needed for readability calculations. This preserves the function’s purpose—deriving plain text from possibly-marked-up content—while eliminating the step where attacker-controlled input is reinterpreted as HTML.Concretely:
stripHtml, replace:const doc = new DOMParser().parseFromString(html, 'text/html');return doc.body.textContent || '';html.replace(/<[^>]*>/g, ' ') ,&,<,>,",') so the readability calculation operates on human-readable text.No changes are needed in
faqs.editor.tsor in any of the readability functions; they will continue to receive a string and work as before. We don’t need additional imports if we implement this with built-in string operations. This single change will address all alert variants, since the tainted flow only becomes problematic at theDOMParsersink.Suggested fixes powered by Copilot Autofix. Review carefully before merging.
Summary by CodeRabbit