[css-syntax] Missing emoji in non-ascii identifier codepoints #11005

Open
Conaclos opened this issue Oct 4, 2024 · 5 comments

@Conaclos

Conaclos commented Oct 4, 2024

Type of proposal: enhancement

The non-ASCII ident code point specification doesn't include the Miscellaneous Symbols Unicode block or the Dingbats Unicode block. However, browsers accept the emoji in these blocks.

Note: should we also accept some of the emoji of the Miscellaneous Technical Unicode block? Or should we even accept all non-ASCII characters?

These code points (U+2600 to U+27BF inclusive) could be added to the non-ASCII ident code point specification.
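
To make the proposal concrete, here is a minimal sketch of the check. The range table is my transcription of the "non-ASCII ident code point" definition in the CSS Syntax draft, so treat it as an assumption and verify it against the spec. The names `CURRENT_RANGES`, `PROPOSED_EXTRA`, and `is_non_ascii_ident_code_point` are hypothetical and only for illustration.

```python
# Assumption: ranges transcribed from the CSS Syntax draft's definition of
# "non-ASCII ident code point"; double-check against the spec before use.
CURRENT_RANGES = [
    (0x00B7, 0x00B7),
    (0x00C0, 0x00D6),
    (0x00D8, 0x00F6),
    (0x00F8, 0x037D),
    (0x037F, 0x1FFF),
    (0x200C, 0x200D),
    (0x203F, 0x2040),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD),
    (0x10000, 0x10FFFF),  # "any code point >= U+10000"
]

# Proposed addition from this issue: Miscellaneous Symbols (U+2600..U+26FF)
# plus Dingbats (U+2700..U+27BF), i.e. one contiguous range.
PROPOSED_EXTRA = (0x2600, 0x27BF)


def is_non_ascii_ident_code_point(ch: str, include_proposal: bool = False) -> bool:
    """Return True if the single character `ch` falls in the allowed ranges."""
    ranges = list(CURRENT_RANGES)
    if include_proposal:
        ranges.append(PROPOSED_EXTRA)
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


# U+2728 SPARKLES sits in the Dingbats block, outside the current ranges.
print(is_non_ascii_ident_code_point("\u2728"))
print(is_non_ascii_ident_code_point("\u2728", include_proposal=True))
```

Under these assumed ranges, U+2728 is rejected today and accepted once the extra range is included, which is the gap this issue is about.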

@tabatkins
Member

As stated in the spec's note, I just matched HTML's set of valid custom element name characters. We want CSS's idents to at least cover that set, so authors don't have to use escapes when writing selectors to target their custom elements, but we could be a superset.

I'm checking with the HTML editors to see if they remember why these specific ranges were chosen, and if there's a good reason to avoid allowing those emojis. (cc @annevk @domenic )

It does indeed seem a little silly that --🥔 is a valid custom property name, but --✨ isn't.

@tabatkins
Member

> Or should we even accept all non-ASCII characters?

Not all; again, as stated in the note, there are good reasons to exclude some characters from idents, and Unicode itself even recommends disallowing some characters that are allowed by the current spec. But the emojis seem probably safe.

@annevk
Member

annevk commented Oct 4, 2024

We actually want to turn it into a blocklist of sorts: whatwg/dom#1079. The current restrictions follow from XML (which the DOM APIs build on and with which we wanted to be compatible): https://www.w3.org/TR/REC-xml/#NT-NameStartChar

I think it would be okay for CSS to essentially have one or more ASCII alpha or U+0080 through U+10FFFF (and maybe some other ASCII code points?), as long as it starts with two hyphens. No need for HTML parity.
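
A literal reading of that suggestion can be sketched as a regular expression. This is only an illustration of the comment above, not spec text: the pattern and the name `is_loose_custom_ident` are hypothetical, and the "maybe some other ASCII code points" part is deliberately left out, so digits, `-`, and `_` after the leading hyphens are rejected here.

```python
import re

# Two hyphens, then one or more code points that are either ASCII letters
# or anything from U+0080 through U+10FFFF.
# Assumption: no other ASCII code points are allowed after the "--" prefix.
LOOSE_CUSTOM_IDENT = re.compile(r"--[A-Za-z\u0080-\U0010FFFF]+")


def is_loose_custom_ident(name: str) -> bool:
    """Return True if `name` matches the loose '--' + non-ASCII/alpha rule."""
    return LOOSE_CUSTOM_IDENT.fullmatch(name) is not None


for candidate in ["--\u2728", "--\U0001F954", "--potato", "--", "-\u2728", "--my-prop"]:
    print(repr(candidate), is_loose_custom_ident(candidate))
```

The last candidate, `--my-prop`, fails under this sketch because the inner hyphen is not covered, which shows why the parenthetical about additional ASCII code points would need to be settled before such a rule could be adopted.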

@Conaclos
Author

Conaclos commented Oct 7, 2024

Do you have a timeframe for a decision? What safe assumptions can I make from an implementer pov?

@tabatkins
Member

No timeframe yet; in particular, I'm missing this week's telcon.

Until you actually see movement in implementations, don't make any assumptions that things will change; continue to treat the valid syntax space as what's in the spec.
