msgconvert: keep HTML variants of the email (skips multipart/mixed properties) #4

Open
pabs3 opened this issue Apr 24, 2016 · 10 comments

Comments

@pabs3

pabs3 commented Apr 24, 2016

Forwarding https://bugs.debian.org/801189

Version: 0.918-1
File: /usr/bin/msgconvert

I attempted to convert a mail containing plain text and HTML variants
but msgconvert only kept the plain text variant, discarding the HTML
variant. It would be nice if it could keep both of them.

pabs@chianamo ~ $ msgconvert --verbose path/to/outlook.msg 
Skipping DIR entry __nameid_version1 0 (Introductory stuff)
...
Skipping property 001F:8004 (UNKNOWN): multipart/mixed; boundary="_009_3C5F9D52E ...
...
Using    property 001F:1000 (BODY_PLAIN): ...
...
@mvz
Owner

mvz commented Apr 24, 2016

@pabs3 thanks for your bug report. To implement this, it would be very helpful to have an example file available. Do you have one that you can share with me?

@pabs3
Author

pabs3 commented Apr 24, 2016

Unfortunately the .msg I have cannot be shared publicly and I do not
have access to Outlook in order to generate such a message. In case you
have access to Outlook and can convert an mbox to .msg format, I have
attached a sample mbox that should match the .msg I found.

bye,
pabs

http://bonedaddy.net/pabs3/

@pabs3
Author

pabs3 commented Apr 24, 2016

GitHub doesn't seem to support attaching files by email; hopefully it does without JavaScript.

@pabs3
Author

pabs3 commented Apr 24, 2016

Sigh, seems to need JavaScript and doesn't support mbox files. Uploaded:

test.mbox.zip

@mvz
Owner

mvz commented Apr 24, 2016

@pabs3 thanks, I'll see what I can do.

@jpadilla

I was also looking for this. Emails can have text/rtf, text/plain, and text/html versions.

@thctlo

thctlo commented May 23, 2017

Ping, any update on this one?

@mvz
Owner

mvz commented Aug 30, 2020

According to the log, the property that stores the multipart/mixed part has ID '8004', which is in the range reserved for user-defined named properties. It's surprising that there isn't also a property containing just the text/html part (ID '1013').
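
The ID check described above can be sketched in a few lines of Python. This is only an illustration, assuming the verbose log prints each property as TYPE:ID in hex, as in the lines quoted in this thread; `parse_property_line` and the regular expression are hypothetical names, not part of msgconvert or Email::Outlook::Message.

```python
import re

# Assumed log format, taken from the --verbose lines quoted in this thread:
#   "Skipping property 001F:8004 (UNKNOWN): ..."
#   "Using    property 001F:1000 (BODY_PLAIN): ..."
LOG_RE = re.compile(
    r"^(Skipping|Using)\s+property\s+"
    r"([0-9A-Fa-f]{4}):([0-9A-Fa-f]{4})\s+\(([^)]*)\)"
)

# MAPI property IDs from 0x8000 upward are named properties.
NAMED_RANGE_START = 0x8000


def parse_property_line(line):
    """Return a dict describing one property log line, or None if no match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    action, type_hex, id_hex, name = m.groups()
    return {
        "action": action,
        "type": type_hex.upper(),
        "id": id_hex.upper(),
        "name": name,
        "is_named": int(id_hex, 16) >= NAMED_RANGE_START,
    }
```

Run against the log lines from this issue, ID 8004 would be flagged as a named property while 1000 (BODY_PLAIN) would not.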

To be able to handle this different property, Email::Outlook::Message needs to support named properties.

I'm afraid I will also need to have some sample .msg file, since the logging doesn't currently include enough information to find the full name for the user-defined named property. Alternatively, the output of oledump when run on the msg file may be enough.

@ojwb
Contributor

ojwb commented Aug 30, 2020

I found a test file in another GitHub repo which hopefully is suitable:

https://github.com/hrbrmstr/msgxtractr/blob/master/inst/extdata/unicode.msg

For this one, perl -Ilib script/msgconvert --verbose on current git master says:

Skipping property 001F:8003 (UNKNOWN): multipart/mixed; boundary="001a113392ecbd ...

@mvz
Owner

mvz commented Sep 1, 2020

I've looked at the example that @ojwb found and property 001F:8003 is just the content-type and does not contain the full message. That message contains bodies in plain text and RTF format, and the RTF part is RTF-encapsulated HTML. There's already issue #6 about that.
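
For reference, RTF-encapsulated HTML can usually be recognised by the `\fromhtml` control word in the RTF header. A minimal Python heuristic, assuming the RTF body has already been decompressed into bytes; the function name and the `header_limit` parameter are hypothetical, and this is not how Email::Outlook::Message does it:

```python
def looks_like_encapsulated_html(rtf: bytes, header_limit: int = 4096) -> bool:
    """Heuristic: True if the RTF header carries the \\fromhtml control word."""
    # Only inspect the start of the document; header_limit is an
    # arbitrary illustrative cut-off, not a value from any specification.
    head = rtf[:header_limit]
    return head.lstrip().startswith(b"{\\rtf") and b"\\fromhtml" in head
```

A check like this only decides whether conversion is worth attempting; extracting the HTML itself is a separate step.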

Additionally, I noticed that having RTF as one part of a multipart/alternative message makes it completely invisible, at least to my email reader (Thunderbird).

So, two things need to happen:

  • Render RTF parts as real attachments
  • Convert RTF-encapsulated HTML to HTML and use that instead
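
The target layout from the two points above could look roughly like this, sketched with Python's standard `email` package purely for illustration (the project itself is Perl, and this is not its implementation). It assumes the HTML has already been recovered from the RTF; `build_message`, the subject and the `body.rtf` filename are hypothetical.

```python
from email.message import EmailMessage


def build_message(plain, html=None, rtf=None):
    """Build plain/HTML alternatives, with any RTF kept as a real attachment."""
    msg = EmailMessage()
    msg["Subject"] = "example"  # placeholder subject for the sketch
    msg.set_content(plain)  # text/plain body
    if html is not None:
        # Turns the message into multipart/alternative (plain + html).
        msg.add_alternative(html, subtype="html")
    if rtf is not None:
        # Wraps everything in multipart/mixed and attaches the RTF bytes,
        # so the RTF is not hidden among the alternatives.
        msg.add_attachment(
            rtf, maintype="application", subtype="rtf", filename="body.rtf"
        )
    return msg
```

With all three inputs supplied, the result is a multipart/mixed message whose first part is multipart/alternative (text/plain, then text/html) and whose second part is the RTF attachment.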
