Unexpected, unreadable characters in generated HTML #1667

Open
UNOwen opened this issue Sep 9, 2024 · 5 comments

Comments

@UNOwen

UNOwen commented Sep 9, 2024

Describe the Bug

When rendering HTML with react-email 3.0.1, I intermittently see unprintable characters rendered between, or in place of, the non-Latin letters of the email text, typically no more than 1-2 occurrences per email. The issue seems semi-random: it depends on the amount of React markup and text before the affected character, and on the execution environment (the locally rendered preview and the server-side Cloudflare Worker insert these "broken" characters at different places in the rendered HTML).

Example:

<Text className="text-zinc-600 dark:text-zinc-300 leading-6">
        Поздравляю! Вы завершили офлайн-курс по улучшению речи и литературному мастерству Litcondit. 
        Если вам понравились занятия и вы заинтересованы в том, чтобы научиться виртуозно говорить и писать, 
        а еще разбираться в литературе, приходите на занятия:
</Text>

Result:

Поздравляю! Вы завершили офлайн-курс по улучшению речи и литерату�рному мастерству Litcondit. 
Если вам понравились занятия и вы заинтересованы в том, чтобы научиться виртуозно говорить и писать, 
а еще разбираться в литературе, приходите на занятия:

(note the � character in "литерату�рному")

        "@react-email/components": "^0.0.24",
        "@react-email/preview": "0.0.11",
        "react-email": "3.0.1",

Which package is affected (leave empty if unsure)

react-email

Link to the code that reproduces this issue

https://github.com/UNOwen/litcondit-worker

To Reproduce

Have the below snippet repeated multiple times:

<Text className="text-zinc-600 dark:text-zinc-300 leading-6">
        Поздравляю! Вы завершили офлайн-курс по улучшению речи и литературному мастерству Litcondit. 
        Если вам понравились занятия и вы заинтересованы в том, чтобы научиться виртуозно говорить и писать, 
        а еще разбираться в литературе, приходите на занятия:
</Text>

Preview the dev version of the emails or render in production. Observe broken characters in the email.

Expected Behavior

No unprintable characters in the email.

What's your node version? (if relevant)

No response

@UNOwen added the Type: Bug (Confirmed bug) label Sep 9, 2024
@blnvdanil

We have also encountered this bug. It occurs randomly: any change to the JSX layout or styles may make these characters appear or disappear. It seems to be a \0 character.
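One way to check which character is actually involved is to dump the code points of anything suspicious in the rendered string. The sketch below is a hypothetical helper, not part of react-email: `findSuspectChars` and `SuspectChar` are names made up for illustration, and the choice of "suspicious" (C0 control characters other than tab, LF and CR, plus U+FFFD) is an assumption.

```typescript
// Hypothetical helper (not a react-email API): lists suspicious characters
// found in an already-rendered HTML string.
interface SuspectChar {
  index: number; // UTF-16 index within the string
  codePoint: string; // formatted as "U+XXXX"
}

function findSuspectChars(html: string): SuspectChar[] {
  const hits: SuspectChar[] = [];
  for (let i = 0; i < html.length; i++) {
    const code = html.charCodeAt(i);
    // Assumption: treat C0 controls (except tab, LF, CR) and U+FFFD as suspicious.
    const isControl =
      code < 0x20 && code !== 0x09 && code !== 0x0a && code !== 0x0d;
    if (isControl || code === 0xfffd) {
      const hex = ("0000" + code.toString(16).toUpperCase()).slice(-4);
      hits.push({ index: i, codePoint: "U+" + hex });
    }
  }
  return hits;
}

// A NUL would be reported as U+0000, a replacement character as U+FFFD.
console.log(findSuspectChars("ab\uFFFDc\u0000"));
```

Running this over the output of both the local preview and the production render would show whether the two environments produce the same character at different positions.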

@gabrielmfern
Collaborator

The code to the reproduction seems to be down, can you make it public?

@UNOwen
Author

UNOwen commented Sep 13, 2024

The code to the reproduction seems to be down, can you make it public?

Gabriel, I’d like to keep the repository private, but I granted you access.

@UNOwen
Author

UNOwen commented Sep 14, 2024

After some basic investigation: the character involved has the hex bytes EF BF BD, which is the UTF-8 encoding of U+FFFD "REPLACEMENT CHARACTER" (https://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%EF%BF%BD&mode=char). It often appears twice in a row. It overwrites the contents of the email rather than inserting itself between letters. For example, the hex bytes D0 BF for the letter "п" were replaced with EF BF BD EF BF BD. Which bytes get replaced seems random. As far as I could reproduce, the replacement is always this same character.
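The byte pattern above (D0 BF turning into EF BF BD EF BF BD) is what a UTF-8 decoder produces when the two bytes of "п" are decoded separately. The sketch below only illustrates that general decoder behaviour with the standard `TextEncoder`/`TextDecoder` APIs; that react-email decodes its output chunk by chunk is an assumption, not something confirmed in this thread, and `toHex` is a helper written just for this example.

```typescript
// Helper for this example only: formats bytes as "D0 BF".
function toHex(data: Uint8Array): string {
  const parts: string[] = [];
  for (let i = 0; i < data.length; i++) {
    parts.push(("0" + data[i].toString(16).toUpperCase()).slice(-2));
  }
  return parts.join(" ");
}

const encoder = new TextEncoder();
const bytes = encoder.encode("\u043f"); // "п"
console.log(toHex(bytes)); // D0 BF

// Assumption: a chunk boundary falls between the two bytes of the character.
const firstChunk = bytes.slice(0, 1); // D0
const secondChunk = bytes.slice(1); // BF

// Decoding each chunk on its own: each half is invalid UTF-8 by itself,
// so each one becomes U+FFFD.
const naive =
  new TextDecoder().decode(firstChunk) + new TextDecoder().decode(secondChunk);
console.log(toHex(encoder.encode(naive))); // EF BF BD EF BF BD

// Decoding with one decoder in streaming mode keeps the incomplete
// sequence buffered until the next chunk arrives.
const decoder = new TextDecoder();
const careful =
  decoder.decode(firstChunk, { stream: true }) +
  decoder.decode(secondChunk, { stream: true }) +
  decoder.decode(); // flush
console.log(careful === "\u043f"); // true
```

If this is the mechanism, it would also be consistent with the position of the broken character shifting whenever the amount of markup before it changes, since that moves the chunk boundaries.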

@KevinGregull

We see the same issue occurring. For us it happens with the "€" character. I am quite certain that this is not an input encoding problem but rather something that occurs during the rendering of the HTML, because we display the € sign in order line items that are rendered in a loop. Sometimes a single one of the looped € characters is replaced by the aforementioned �� characters, while the others render fine. It also sometimes happens with umlauts ("äöü"). So far I have never seen it occur with a standard ASCII character.
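What the affected characters reported so far have in common is that they take more than one byte in UTF-8, while ASCII characters take exactly one. A minimal sketch to confirm the byte lengths (`utf8Length` is a name made up for this example):

```typescript
// Example-only helper: number of bytes a string occupies in UTF-8.
function utf8Length(text: string): number {
  return new TextEncoder().encode(text).length;
}

console.log(utf8Length("A")); // 1 (ASCII)
console.log(utf8Length("\u00e4")); // 2 ("ä")
console.log(utf8Length("\u043f")); // 2 ("п")
console.log(utf8Length("\u20ac")); // 3 ("€")
```

A single-byte character cannot be cut in half, which may be why ASCII text has not been seen to break; this is an inference from the reports above rather than a confirmed cause.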
