Edgewall Software

Opened 9 years ago

Closed 6 years ago

#1754 closed defect (fixed)

unnecessary base64 encoding in trac (ticket) emails

Reported by: xris <xris*siliconmechanics*com> Owned by: eblot
Priority: normal Milestone: 0.10
Component: general Version: devel
Severity: normal Keywords: notification email
Cc: Axel.Thimm@…, petr.hroudny@…
Release Notes:
API Changes:

Description

unnecessary base64 encoding is a sign of many spam/virus messages, and many people's mail servers block these messages outright (including mine). Is there any specific reason why trac base64-encodes its outbound emails? If not, please don't do so — I don't like having to choose between getting no trac emails, and getting more spam.

Attachments (0)

Change History (16)

comment:1 Changed 9 years ago by eblot

As Trac supports the UTF8 'charset', it should be able to send UTF8 encoded messages.

Base64 is the encoding choice for UTF8 (quoted printable would not be effective, btw)

comment:2 Changed 9 years ago by xris <xris*siliconmechanics*com>

but it should only send them IF there is utf-8 content, no?

However, I think the main issue is for emails where the sole content of the message is base64-encoded, with no raw-text format given. But honestly, I'm not entirely sure what it takes to trigger my mailserver's "no unnecessary base64" filter.

comment:3 Changed 9 years ago by eblot

That would be an optimisation, maybe.

Everything in Trac is stored as UTF-8, I believe. If the whole message is made of pure ASCII characters, Base64 encoding could be disabled, but that means the message would have to be parsed and checked so that the UTF-8 stream only contains byte values < 128, and only then could Base64 be disabled. That is a lot of work for little gain: Base64 works gracefully in all cases.

Encoding pure ASCII in Base64 is valid, even if not optimal.

I'm not sure I understand what you meant by "no raw-text format"?
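
The ASCII check discussed in this comment can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical helper named `pick_transfer_encoding` (it is not part of Trac) that falls back to Base64 only when the body contains non-ASCII characters:

```python
def pick_transfer_encoding(body, fallback="base64"):
    """Return "7bit" for pure-ASCII text, otherwise the fallback encoding.

    Hypothetical helper for illustration only; not Trac's actual code.
    """
    try:
        # Encoding to ASCII fails as soon as one character is >= 128.
        body.encode("ascii")
    except UnicodeEncodeError:
        return fallback
    return "7bit"


print(pick_transfer_encoding("Ticket #1754 was closed"))  # 7bit
print(pick_transfer_encoding("Réponse au ticket"))        # base64
```

A real implementation would probably also have to look at line lengths, since 7bit bodies are limited to 998 octets per line, so this sketch covers only the character check.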

comment:4 Changed 9 years ago by xris <xris*siliconmechanics*com>

many anti-spam measures flag messages with no plaintext portion — particularly base64-only messages because that technique is too often used by spammers to obscure their messages from things that would parse message content.

Anyway, what's wrong with quoted-printable and 'Content-Type: text/plain; charset="utf-8"' ?

comment:5 Changed 9 years ago by eblot

Quoted Printable is a poor encoding system: 3 characters for any non ASCII byte: for any char code ≥ 128, QP generates a triplet "=XX", where XX is the hexadecimal value of the character code. This means up to a 200% increase of the message size.

Conversely, Base64 maps every byte (ASCII or not) from an 8-bit range to a 6-bit range, producing 4 output characters for every 3 input bytes. This means roughly a 33% increase of the message size.

QP is an acceptable encoding for (mostly western) European languages, where most characters are ASCII, with some percentage of special characters in the range [128..255].

With other languages (not based on the roman/latin alphabet), every character code is > 255 in Unicode, which means that every UTF8 character would be translated into many ASCII characters (from 6 to 9?).

There's nothing wrong with UTF8/Base64. This is not an exotic combination, it is quite regular, I believe.
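
The size trade-off described in this comment can be checked with the standard library. The sketch below is an illustration only: `encoded_sizes` is a hypothetical helper, and the sample strings are made up. It compares the UTF-8 byte count of a string with its quoted-printable and Base64 sizes:

```python
import base64
import quopri


def encoded_sizes(text):
    """Return the UTF-8 size of text and its size under QP and Base64."""
    raw = text.encode("utf-8")
    return {
        "utf-8": len(raw),
        # Each non-ASCII byte becomes "=XX" (3 characters) in QP.
        "qp": len(quopri.encodestring(raw)),
        # Base64 emits 4 characters for every 3 input bytes (padded).
        "base64": len(base64.b64encode(raw)),
    }


for sample in ("plain ticket", "défaut corrigé", "チケット"):
    print(sample, encoded_sizes(sample))
```

For the ASCII sample QP adds nothing while Base64 grows it from 12 to 16 characters; for the Japanese sample (12 UTF-8 bytes) QP needs 36 characters and Base64 still needs 16. `base64.b64encode` does not insert the line breaks a mail body would carry, so real messages are slightly larger.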

comment:6 Changed 9 years ago by xris <xris*siliconmechanics*com>

You're worried about size on a 2k email? What about for the rest of us who would have a 0% increase with QP because there's no UTF-8 content? Nonetheless, it's your prerogative to choose, but I know that there are more people than myself out there who have had to turn off spam filtering (resulting in a 20-30% increase in spam in my case) in order to receive trac ticket messages, and it would be nice if there was at least an option to use QP instead of base64 (I mean, it's ONE option to pass into the python mail function). Trac is the only legit sender of pure-base64 content that I know of, and QP *is* the standard way (at least in my experience) to handle utf-8 content, even if it may not be the most efficient.

comment:7 Changed 9 years ago by eblot

I never said I did not want an option for QP: I really don't mind, I only tried to explain the choice of Base64, from my understanding. It was about technical details, not personal preferences.

I get a lot of spam too, fortunately destroyed by Thunderbird or Mail, but from my perspective, I would prefer to fix the issue (the spam filter that makes some assumptions) rather than change a specific product to match the spam filter criteria.

I'm not worried about a 2k email. I'm worried about trading Base64 for QP, as from my perspective, it is nonsense for non-'latin' languages. As long as the QP encoding is an option, I have no problem with it.

comment:8 Changed 9 years ago by xris <xris*siliconmechanics*com>

Ok, just sounded from the conversation like you were saying "it's base64, and going to stay that way".. If the resolution of this ticket is "added an option for QP encoding instead of base64" I'm totally cool with that (not a python coder, or I'd have already submitted a patch), and can be patient, too.

(as for handling it in tbird, I run my own mail server, and would rather not be paying for the bandwidth to receive spam if I can block 20% or so of it before it gets to me, and tarpit the sender in the process)

comment:9 Changed 9 years ago by eblot

No problem. BTW, I did not change the bug status, it is still reported as 'new'

comment:10 Changed 9 years ago by Axel.Thimm@…

  • Cc Axel.Thimm@… added

utf-8/base64 seems to be harmful for sourceforge.net mail services, too.

While it is sourceforge's mail processing (adding a signature) that is broken, it would be nice to have a way to work around it.

I just activated ticket notification onto such a list and very weird trailing garbage popped up at the end of the notifying mails.

http://sourceforge.net/mailarchive/forum.php?thread_id=8115721&forum_id=27233

comment:11 Changed 9 years ago by eblot

  • Keywords notification email added

comment:12 Changed 9 years ago by eblot

  • Owner changed from jonas to eblot
  • Status changed from new to assigned

A new notification implementation is available here [2357]:

To set up the encoding scheme, add a new line in the [notification] section of your trac.ini and select one of the available options:

[notification]
mime_encoding = base64 | qp | none
  • none: for plain English only (notifications won't be sent if tickets contain non-ASCII characters)
  • qp: quoted-printable, best for western-european languages
  • base64: works with any language (default value)

Feedback welcomed.
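
To illustrate how the three `mime_encoding` values could translate into mail headers, here is a minimal sketch using the current Python 3 `email` package. It is not the code from the changeset: `build_notification` and `CTE_BY_OPTION` are hypothetical names, and the mapping of `none` to `7bit` is an assumption.

```python
from email.message import EmailMessage

# Assumed mapping from the trac.ini values to MIME transfer encodings.
CTE_BY_OPTION = {
    "base64": "base64",
    "qp": "quoted-printable",
    "none": "7bit",
}


def build_notification(body, mime_encoding="base64"):
    """Build a text/plain message with the selected encoding (illustrative)."""
    if mime_encoding not in CTE_BY_OPTION:
        raise ValueError("unknown mime_encoding: %r" % mime_encoding)
    if mime_encoding == "none" and not body.isascii():
        # Mirrors the documented behaviour: non-ASCII text is not sent.
        raise ValueError("non-ASCII text cannot be sent with mime_encoding = none")
    msg = EmailMessage()
    msg.set_content(body, charset="utf-8", cte=CTE_BY_OPTION[mime_encoding])
    return msg


msg = build_notification("Ticket résolu", "qp")
print(msg["Content-Transfer-Encoding"])  # quoted-printable
```

Raising an error for non-ASCII text under `none` is one way to model "notifications won't be sent"; the actual implementation may handle that case differently.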

comment:13 Changed 9 years ago by eblot

  • Milestone set to 0.10
  • Resolution set to fixed
  • Status changed from assigned to closed

[2799] provides an option to select the encoding type

comment:14 Changed 7 years ago by dharana@…

Thank you for fixing it, useful for me too.

comment:15 Changed 6 years ago by petr.hroudny@…

  • Cc petr.hroudny@… added
  • Resolution fixed deleted
  • Status changed from closed to reopened

Just found this - and I believe an important option is missing.

It's perfectly valid to generate 8-bit UTF-8 emails today, when the following MIME headers are used:

Content-Type: text/plain; charset=utf-8 
Content-Transfer-Encoding: 8bit

Examples of MUAs doing that are Thunderbird, Mutt, …

Thus another option for 8bit notifications is needed and I'd even say it should be the default.
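
For reference, a message carrying those two headers can be produced with Python's standard `email` package. This is a sketch of the requested behaviour, not an existing Trac option; the subject and body text are made up for the example:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Ticket update"
# cte="8bit" keeps the UTF-8 bytes as they are instead of Base64 or QP.
msg.set_content("Příliš žluťoučký kůň", charset="utf-8", cte="8bit")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])

# Serialised form as it would be handed to the mail transport.
wire = msg.as_bytes()
```

Whether such a message arrives unmodified depends on the SMTP servers along the path supporting the 8BITMIME extension, which is presumably why an encoded form was used as the default.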

comment:16 Changed 6 years ago by rblank

  • Resolution set to fixed
  • Status changed from reopened to closed

Would you mind opening a new ticket for this enhancement request? We usually close tickets against a given milestone to keep track of the version where a feature has been added.
