I have some text which has lots of '= ' and '=E9' strings in it which should be just spaces and é.
What would convert this text to properly readable text? Mutt seems to manage it OK (it's a mail message), but I want to do it myself to display it elsewhere. It doesn't seem to be just a simple iconv() conversion, because what is in the message are literal '=E9' strings, not ISO-8859-1 characters with more than 7 bits.
Presumably it's encoded to fit in strict 7-bit ASCII but how do I unencode it?
On 20-Feb-10 23:51:42, Chris G wrote:
[...]
-- Chris Green
Chris, You have encountered the dreaded "Quoted-Printable". This is an ancient "Content-Transfer-Encoding" which dates back to the old (1990-ish) days of Internet email, when all content had to be transmitted in bytes with values in 00-7F, and "Quoted-Printable" was introduced to allow transmission of characters outside the ASCII range 00-7F. Thus a character in the byte range 80-FF (say 9A hex) was represented in the email as "=9A". (And, for now, let us avoid what has to be done about the character "=" itself ...)
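For anyone who wants to see the rule spelled out, here is a minimal sketch of a decoder in Python (chosen only for illustration; the function name decode_quoted_printable is made up). It assumes the body is ISO-8859-1, as in the original question. It also covers two rules from the MIME specification that are not described above: a literal "=" is itself sent as "=3D", and a "=" at the very end of a line is a "soft line break" that should vanish together with the newline, which is the likely source of the stray '= ' strings.

```python
import re


def decode_quoted_printable(text: str, charset: str = "iso-8859-1") -> str:
    """Hand-rolled Quoted-Printable decoder (illustrative sketch only)."""
    # A "=" at the very end of a line is a soft line break:
    # drop it together with the newline that follows.
    text = re.sub(r"=\r?\n", "", text)
    # Replace each "=XX" (two hex digits) with the single byte it stands for.
    raw = re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("ascii"),
    )
    # The result is raw bytes in the message's charset (assumed here).
    return raw.decode(charset)


sample = "caf=E9 au lait, 1+1=3D2, a long line that was =\nfolded"
print(decode_quoted_printable(sample))
```

A real decoder should also cope with malformed input (for example a "=" followed by something that is not two hex digits), which this sketch simply leaves untouched.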
You may or may not have a utility program "mmencode"/"mimencode"; recent Linux distributions seem not to install this by default, and Debian seems not to know about it.
On an older (SuSE) distribution I can find
NAME
    mimencode - Translate to and from mail-oriented encoding formats
    (Same program also installed as "mmencode".)

SYNOPSIS
    mimencode [-u] [-b] [-q] [-p] [file name] [-o outputfile]

DESCRIPTION
    The mimencode program simply converts a byte stream into (or out of) one of the standard mail encoding formats defined by MIME, the proposed standard for internet multimedia mail formats. Such an encoding is necessary because binary data cannot be sent through the mail.
    [...]
    The "-q" option tells mimencode to use the "quoted-printable" encoding instead of base64.
    The "-u" option tells mimencode to decode the standard input rather than encode it.
Hence (if you have the programme) you could use something like
mmencode -u -q <infile> -o <outfile>
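If mmencode is not installed, the standard library of most scripting languages can do roughly the same job. As a sketch, assuming Python is available, its quopri module decodes from one binary file to another; the helper name qp_decode_file and the file names are hypothetical.

```python
import quopri


def qp_decode_file(in_path: str, out_path: str) -> None:
    """Decode a Quoted-Printable file into raw bytes (sketch)."""
    # quopri.decode() expects binary file objects for input and output.
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        quopri.decode(src, dst)


# Example call (hypothetical file names):
# qp_decode_file("message_body.qp", "message_body.txt")
```

The output is still in the message's original character set (ISO-8859-1 in this thread), so a separate iconv-style conversion may be needed afterwards if the display side expects UTF-8.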
Hoping this helps! (it is becoming unusual to have to deal with "the dreaded QP").
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 21-Feb-10                                       Time: 00:25:51
------------------------------ XFMail ------------------------------
On Sun, Feb 21, 2010 at 12:25:56AM -0000, Ted Harding wrote:
[...]
That took me to the right place, thanks. There is a PHP function quoted_printable_decode() which does what I need; this is for some PHP coding I'm doing to display E-Mail messages in a Wiki.
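For readers not working in PHP, here is a hedged sketch of the same idea applied to a whole message rather than just the body, using Python's standard email package. The sample message, its addresses and its headers are invented for illustration. The point is that the parser reads the Content-Transfer-Encoding and charset headers itself, so the Quoted-Printable step and the character-set step happen together.

```python
from email import message_from_bytes, policy

# A made-up message for illustration; real input would come from a mailbox.
raw_message = (
    b"From: sender@example.org\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"caf=E9 cr=E8me\r\n"
)

# policy.default selects the modern API, where get_content() returns text.
msg = message_from_bytes(raw_message, policy=policy.default)

# Undoes the transfer encoding and decodes using the declared charset.
body = msg.get_content()
print(body)
```

This only handles a single-part text message; a multipart message would need its parts walked individually.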