[bug #20142] strip backslash in rfc822 From: field


[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


URL:
  <http://savannah.nongnu.org/bugs/?20142>

                 Summary: strip backslash in rfc822 From: field
                 Project: MHonArc
            Submitted by: jab
            Submitted on: Sunday 06/10/2007 at 19:19
                Category: Resource Variables
                Severity: 3 - Normal
              Item Group: Incorrect Behavior
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
        Operating System: Linux
            Perl Version:  v5.8.4
       Component Version: 2.6.16
           Fixed Release:

    _______________________________________________________

Details:

It is not uncommon to have an escaped character in the From: field of an
email header. I've noticed this mainly with parentheses and quotation marks,
e.g. \( \) \". Out of the last 75000 mails, approximately one in 500 has
this characteristic. MHonArc leaves the backslash in and produces messages
that look a little funny, especially when one person posts a lot to a list
and each time his or her name is littered with extra backslashes in the index
pages.

I looked over RFC822 and had a tough time deciding whether this practice is
legal or not, but I can say it is fairly common and I think it makes sense to
strip out backslashes in the FROMNAME resource variable. Or maybe only do it
for our friends \" \( and \).

I also looked at Subject: lines, and in those cases backslashes are used all
over the place in a meaningful way, so they probably should not be stripped.
I'm happy to supply additional data.
 
http://www.mail-archive.com/ubuntu-bugs@.../msg121886.html

gen7:/var/mail# cat archive2 | grep ^From: | wc -l
73667
gen7:/var/mail# cat archive2 | grep ^From: | grep '\\' | wc -l
164
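
The two counts above work out to roughly one From: line in 449
(73667 / 164), in line with the "approximately one in 500" estimate. A
minimal single-pass sketch of the same measurement follows. The function
name from_backslash_stats and the output format are illustrative and not
part of the original report, and it assumes an mbox-style file in which each
From: header starts at the beginning of a line.

```shell
# Count From: header lines and how many of them contain a backslash.
# Prints: total=<n> escaped=<n> [one_in=<n>]
from_backslash_stats() {
    awk '
        /^From:/ { total++; if (index($0, "\\")) escaped++ }
        END {
            printf "total=%d escaped=%d", total, escaped
            # Report the rounded ratio only when there is at least one hit.
            if (escaped > 0) printf " one_in=%d", int(total / escaped + 0.5)
            printf "\n"
        }
    ' "$1"
}
```

Usage would look like `from_backslash_stats /var/mail/archive2`, where the
path is simply the file name from the transcript above.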





    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?20142>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/

---------------------------------------------------------------------
To sign-off this list, send email to majordomo@... with the
message text UNSUBSCRIBE MHONARC-DEV


[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #1, bug #20142 (project mhonarc):

RFC 822 does support the use of '\' to escape characters.

I'll need to do some testing to see if the RFC-822 parser
has a bug or if it is mhonarc.  Mhonarc historically has
tried to take some shortcuts in some cases to avoid full
rfc-822 parsing, but later versions started leveraging
the 822 parser to be more robust (albeit with a hit on
performance).




[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #2, bug #20142 (project mhonarc):

This is super useful, and really comes into play for us on $FROMNAME$.

Everything else can essentially stay the same. In particular, $SUBJECT$ will
quite often have unescaped backslashes. For example, a message talking about
Windows software might contain "C:\Program Files\whatever".

So I admit a hack to fix ( ) " on $FROMNAME$ isn't pretty, but it would be
super useful and if I had a pointer on where to insert it, I'd be happy to
attempt this patch myself.
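
The narrower variant proposed here, touching only \" \( and \) and only in
the From: field, can be illustrated on raw header lines. This is a rough
shell sketch using sed, not MHonArc code. The function name
strip_from_escapes is hypothetical, and it edits the whole From: line rather
than the $FROMNAME$ resource variable.

```shell
# Remove the backslash from \" \( and \) on From: lines only.
# Subject: and all other lines pass through unchanged.
# Reads header lines on stdin and writes the result to stdout.
strip_from_escapes() {
    sed '/^From:/ s/\\\(["()]\)/\1/g'
}
```

Because the sed address restricts the substitution to lines starting with
From:, a backslash in a Subject: line is left alone.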



[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #3, bug #20142 (project mhonarc):

Do you have any original messages that illustrate
the problem?

I want to make sure that any work I do will address
the cases you are encountering.

Thanks.




[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #4, bug #20142 (project mhonarc):

I've placed a sample of raw messages at the following location. It is
encrypted to the mhonarc signing key and is representative of production
traffic. The size may be overkill for this particular problem, but the
dataset might be useful for other problem cases as well. Once you have
received the dataset, yell and I'll delete it. To find relevant messages in
it, use the grep command given in the original report.

  http://www.mail-archive.com/mail-sample.bz2.gpg



[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #5, bug #20142 (project mhonarc):

--- /var/tmp/mhutil.pl  2007-10-09 20:30:36.000000000 -0700
+++ /usr/share/mhonarc/mhutil.pl        2007-10-09 21:32:05.000000000 -0700
@@ -176,7 +176,8 @@
     foreach $tok (@tokens) {
        next  if $skip;
        if ($tok =~ /^"/) {   # Quoted string
           $tok =~ s/^"//;  $tok =~ s/"$//;
+           $tok =~ s/\(.)/$1/g; $tok =~ s/\"//g;
            return $tok;
        }
        if ($tok =~ /^(/) {  # Comment




Re: [bug #20142] strip backslash in rfc822 From: field

by Jeff Breidenbach

Here's an unmangled copy of the patch. I think this works,
but the \" part acts a little weird during testing.  (E.g. if I
run -editidx I can fix an index page, but I can't seem to break it
again if I change the code back)

-Jeff

# diff -u /var/tmp/mhutil.pl /usr/share/mhonarc/mhutil.pl
--- /var/tmp/mhutil.pl  2007-10-09 20:30:36.000000000 -0700
+++ /usr/share/mhonarc/mhutil.pl        2007-10-09 22:05:59.000000000 -0700
@@ -177,6 +177,7 @@
        next  if $skip;
        if ($tok =~ /^"/) {   # Quoted string
            $tok =~ s/^"//;  $tok =~ s/"$//;
+            $tok =~ s/\\(.)/$1/g;  $tok =~ s/\\\"/\"/g;
            return $tok;
        }
        if ($tok =~ /^\(/) {  # Comment



Re: [bug #20142] strip backslash in rfc822 From: field

by Jeff Breidenbach

Ah, now I understand. This is the right patch.

# diff -u /var/tmp/mhutil.pl /usr/share/mhonarc/mhutil.pl
--- /var/tmp/mhutil.pl  2007-10-09 20:30:36.000000000 -0700
+++ /usr/share/mhonarc/mhutil.pl        2007-10-09 22:05:59.000000000 -0700
@@ -177,6 +177,7 @@
       next  if $skip;
       if ($tok =~ /^"/) {   # Quoted string
           $tok =~ s/^"//;  $tok =~ s/"$//;
+            $tok =~ s/\\(.)/$1/g;
           return $tok;
       }
       if ($tok =~ /^\(/) {  # Comment
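
The added line replaces every backslash-plus-character pair inside the
quoted string with the character alone, which is how an RFC 822 quoted-pair
is meant to be read. The same substitution can be tried outside MHonArc.
Below is a minimal sketch with sed standing in for the Perl s/\\(.)/$1/g;
the function name unescape_quoted_pairs is made up for illustration, and the
sample name is invented.

```shell
# Replace each backslash-escaped character with the character itself,
# line by line on stdin (sed counterpart of Perl's s/\\(.)/$1/g).
unescape_quoted_pairs() {
    sed 's/\\\(.\)/\1/g'
}

# Example: prints  John "Jack" Doe (home)
printf '%s\n' 'John \"Jack\" Doe \(home\)' | unescape_quoted_pairs
```

Since any escaped character is handled, \" needs no separate substitution,
and an escaped backslash (\\) collapses to a single backslash.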



[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #6, bug #20142 (project mhonarc):

Note to self: check if this patch is a possible cause of the $SUBJECT$
$SUBJECTNA$ weirdness with Tamil.


http://www.mail-archive.com/mhonarc-users@.../msg01387.html



[bug #20142] strip backslash in rfc822 From: field

by Laurent Destailleur-2


Follow-up Comment #7, bug #20142 (project mhonarc):

I posted a response to your Tamil problem a while back
to the users list:
http://www.mhonarc.org/archive/cgi-bin/mesg.cgi?a=mhonarc-users&i=200903230354.n2N3sQD3008778%40gator.earlhood.com

With regard to the Tamil problem, I could not reproduce it.
Please review my message and follow up to the list to
either clarify what the problem is or confirm that there
is no problem.
