I wasn’t sure if I could mention my app, so I didn’t use the name. Mail Archiver X Easy can print to PDF, but it can’t create PDFs in batch. That can only be done with the main version, Mail Archiver X.
Creating PDFs is not that hard once the HTML exists. Building the HTML out of the email parts is the real problem.
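To illustrate why the email parts are the hard bit, here is a minimal sketch in Python using only the standard library `email` package. It is not how Mail Archiver X works; the function name `extract_html` is made up for this example. It assumes you already have the raw message source as bytes, and it only picks the HTML part and undoes the transfer encoding. Inline images, attachments, and plain-text-only messages would still need their own handling.

```python
from email import message_from_bytes, policy


def extract_html(raw_message: bytes):
    """Return the decoded HTML part of a raw message, or None if there is none.

    Hypothetical helper for illustration only.
    """
    # policy.default gives the modern EmailMessage API (get_body, get_content).
    msg = message_from_bytes(raw_message, policy=policy.default)

    # get_body() walks multipart containers and returns the first part
    # whose subtype matches the preference list.
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return None

    # get_content() undoes base64 / quoted-printable and decodes the charset.
    return part.get_content()
```

For a plain-text-only message this returns None, so a real converter would have to wrap the text in HTML itself.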
I get data from Mail with AppleScript only where I have to, which means the accounts and the mailboxes. Because Gmail keeps all emails in a single mailbox, “All Mail”, I need to get the header data for all emails.
The rest of my app is done in Xojo. The PDF is created from the HTML with macOS functions after loading the HTML into an HTML viewer.
WKHtmlToPDF works fine, but it’s only available for Intel and an ARM version is unlikely. A while ago I found a Python library, but it creates only very simple PDFs. Other tools that create PDFs from HTML, like Prince, are super pricey.
One solution is simply to install a ready-made application like the one mentioned above.
For those who would rather work it out for themselves, I can confirm Shane Stanley’s remark that the HTML of the message body can be obtained with ordinary AppleScript text parsing. This works because the message source consists of headers, followed by a null line (i.e., two contiguous CRLFs), and then the HTML body.
That is, to get the HTML, your AppleScript must discard the headers and the null line that follows them.
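The same idea can be sketched outside AppleScript. The Python below is only an illustration of the "discard everything up to the first null line" step; `split_message` is a made-up name, and normalising the line endings is my own assumption, in case the source arrives with LF or CR instead of CRLF. Note that the result is directly usable HTML only when the message is a single, unencoded text/html part; a multipart or base64 / quoted-printable body needs further decoding.

```python
def split_message(source: str):
    """Split raw message source into (header_block, body) at the first null line.

    Hypothetical helper for illustration only.
    """
    # Assumption: the source may use CRLF, LF, or CR, so normalise to LF first.
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")

    # The first empty line (two consecutive line breaks) ends the headers.
    header_block, separator, body = normalized.partition("\n\n")
    if not separator:
        # No null line found: everything is headers and the body is empty.
        return normalized, ""
    return header_block, body
```

Only the first null line counts, because the body itself may contain empty lines.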
The RFC 822 specification should be read carefully, especially the following passage, which explains how to distinguish the headers from the body (so that they can then be removed from the message source):
B.2. SEMANTICS
Headers occur before the message body and are terminated by a null line (i.e., two contiguous CRLFs).
A line which continues a header field begins with a SPACE or HTAB character, while a line beginning a field starts with a printable character which is not a colon.
A field-name consists of one or more printable characters (excluding colon, space, and control-characters). A field-name MUST be contained on one line. Upper and lower cases are not distinguished when comparing field-names.
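The rules quoted above can be turned into a small header parser. What follows is a simplified Python sketch, not a complete RFC 822 implementation; `parse_headers` is a made-up name. It treats lines starting with a space or tab as continuations, lower-cases field names so they compare case-insensitively, and, as a simplification of my own, joins a continuation to the previous line with a single space.

```python
def parse_headers(header_block: str):
    """Parse a header block into a list of (lowercased field-name, value) pairs.

    Hypothetical helper for illustration only.
    """
    fields = []
    for line in header_block.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: it belongs to the previous field.
            if fields:
                name, value = fields[-1]
                # Simplification: join with one space instead of keeping
                # the original whitespace.
                fields[-1] = (name, value + " " + line.strip())
            continue

        # A field line is "field-name: value"; split at the first colon only.
        name, colon, value = line.partition(":")
        if not colon or not name:
            continue  # Not a valid field line, skip it.

        # Field names are case-insensitive, so store them lower-cased.
        fields.append((name.strip().lower(), value.strip()))
    return fields
```

A list of pairs is used rather than a dictionary because the same field (for example Received) can appear more than once.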