Investigate what to do about HTML element nesting depth #227

Open
mozfreddyb opened this issue May 8, 2024 · 2 comments
@mozfreddyb
Collaborator

It looks like most browsers have aligned on a maximum nesting depth for HTML elements.
E.g., <b><b><b>... repeated ad infinitum results in an HTML document that goes exactly 512 elements deep (counted from where? doctype? html? body?), with all subsequent elements becoming siblings of the deepest allowed element.
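
A rough way to probe this, as a sketch only: generate deeply nested markup and measure how deep the parsed tree actually goes. `nestedMarkup` and `maxDepth` are hypothetical helper names, the 1000 is an arbitrary value above the observed limit, and the depth is counted from `<html>` purely as an assumption (where counting starts is exactly the open question). The `DOMParser` part only runs in a browser.

```javascript
// Hypothetical helper: `depth` nested open tags followed by matching close tags.
function nestedMarkup(tag, depth) {
  return `<${tag}>`.repeat(depth) + `</${tag}>`.repeat(depth);
}

// Hypothetical helper: deepest element level below `root` (root itself is 0).
// Iterative on purpose, so very deep trees cannot overflow the call stack.
function maxDepth(root) {
  let deepest = 0;
  const stack = [[root, 0]];
  while (stack.length > 0) {
    const [node, depth] = stack.pop();
    if (depth > deepest) deepest = depth;
    for (const child of node.children) {
      stack.push([child, depth + 1]);
    }
  }
  return deepest;
}

// Browser-only part: parse more levels than the observed limit and report
// how deep the resulting tree really is.
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString(nestedMarkup("b", 1000), "text/html");
  console.log("depth below <html>:", maxDepth(doc.documentElement));
}
```

Comparing the reported number across browsers, and across different roots such as `doc.body`, would show whether they really agree and where the count starts.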

This might cause some funky confusion where a harmful <img src=x onerror=alert()> could be perceived as a child of, e.g., <template> (and therefore harmless) but actually ends up as a sibling, thus leading to XSS.

It also seems that this is not directly specified anywhere (please correct me if I am wrong).

Fun!

@mozfreddyb mozfreddyb added this to the v1 milestone May 8, 2024
@benbucksch

benbucksch commented May 8, 2024

There are going to be lots of edge cases in parsing, e.g. <a href//foo>, unclosed tags, where the content ends up, etc. If the sanitizer uses the browser's native HTML parser and then sanitizes the parsed tree, this would be prevented, no?

(There could also be multiple steps in the sanitization, where the HTML is first sanitized on a string level, then parsed, then sanitized on an HTML level.)

@mozfreddyb
Collaborator Author

That's my point, right.
setHTML roughly does: 1) parse the input into a fragment, 2) sanitize the resulting fragment, 3) append the fragment to the context.

If the 512 limit is enforced during (3), we cannot sanitize properly, because at sanitization time we don't know what's going to end up as a sibling or a child, and we're going to have a bad time. If it is enforced in (1), then we're fine.

We should find out and create a test case.
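
One possible shape for such a test case, as a hedged sketch rather than a finished test: put a marker inside a <template> at various depths around the suspected limit, pass the string through setHTML, and check whether the marker shows up in the live tree. `templateAtDepth` and `markerEscaped` are hypothetical names, the depth values are guesses around 512, and the sketch assumes the sanitizer configuration in use keeps <template>, <span> and the `id` attribute, which has not been verified here.

```javascript
// Hypothetical payload builder: a marker wrapped in <template>,
// placed below `depth` nested <div> elements.
function templateAtDepth(depth) {
  return (
    "<div>".repeat(depth) +
    '<template><span id="marker"></span></template>' +
    "</div>".repeat(depth)
  );
}

// Hypothetical check: querySelector does not descend into template contents,
// so a hit means the marker is in the live tree instead of inside <template>.
function markerEscaped(host) {
  return host.querySelector("#marker") !== null;
}

// Browser-only part, guarded because setHTML is not available everywhere.
if (
  typeof document !== "undefined" &&
  typeof Element !== "undefined" &&
  "setHTML" in Element.prototype
) {
  for (const depth of [10, 510, 511, 512, 600]) {
    const host = document.createElement("div");
    host.setHTML(templateAtDepth(depth));
    console.log(depth, markerEscaped(host) ? "marker in live tree" : "marker not in live tree");
  }
}
```

A "not in live tree" result is ambiguous on its own: the marker may still be inside the template, or the sanitizer may have dropped it, so a real test would also need to inspect the template's content.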
