Remove header lines matching a specific pattern?

Hello,

I'd like to use something like:

    transport:
        driver = ....
        headers_remove = NX-Spam.*

Is it too simple, so I can't see it? Or is it not possible at all? I was thinking about some expansion giving me a list of all message headers, but I'm too stupid to find it ...

Best regards from Dresden/Germany
Heiko Schlittermann
--
SCHLITTERMANN.de ---------------------------- internet & unix support -
Heiko Schlittermann HS12-RIPE -----------------------------------------
gnupg encrypted messages are welcome - key ID: 48D0359B ---------------
gnupg fingerprint: 3061 CFBF 2D88 F034 E8D2 7E92 EE4E AC98 48D0 359B -
--
## List details at http://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
|
|
Re: Remove header lines matching a specific pattern?

Heiko Schlittermann wrote:
> I'd like to use something like:
>
>     transport:
>         driver = ....
>         headers_remove = NX-Spam.*
>
> Is it too simple, so I can't see it. Or is it not possible
> at all? I was thinking about some expansion, giving me a list of all
> message headers, but I'm too stupid to find it ...
>
> Best regards from Dresden/Germany
> Heiko Schlittermann

See $message_headers on
http://www.exim.org/exim-html-current/doc/html/spec_html/ch11.html

You could probably use ${filter} and ${map} to obtain the list of headers from $message_headers. ${map} and ${filter} are described on the same page.

--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226)
http://cardwellit.com/
|
|
Re: Remove header lines matching a specific pattern?

Heiko Schlittermann <hs@...> (Mon 13 Jul 2009 14:00:26 CEST):
> Hello,
>
> I'd like to use something like:
>
>     transport:
>         driver = ....
>         headers_remove = NX-Spam.*
>

Hm. Something like

    transport:
        ...
        headers_remove = <\n ${filter \
            {<\n ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}}} \
            {match{$item}{(?i:^X-)}}}

should do the trick. But for now it doesn't. Will continue testing.

Thanks for reading :-)

Best regards from Dresden/Germany
Heiko Schlittermann
|
|
Re: Remove header lines matching a specific pattern?

On 2009-07-13 at 15:03 +0200, Heiko Schlittermann wrote:
> Heiko Schlittermann <hs@...> (Mon 13 Jul 2009 14:00:26 CEST):
> > Hello,
> >
> > I'd like to use something like:
> >
> >     transport:
> >         driver = ....
> >         headers_remove = NX-Spam.*
> >
>
> Hm. Something like
>
>     transport:
>         ...
>         headers_remove = <\n ${filter \
>             {<\n ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}}} \
>             {match{$item}{(?i:^X-)}}}

The

    ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}}

takes no account of continuation lines. This is somewhat more accurate:

    ${map{<\n${filter{<\n$message_headers_raw}{match{$item}{^(?i:[a-z_-]+\s*:)}}}} {${extract{1}{:}{$item}}}}

but still prone to false-positives; the problem is more visible if you do:

    ${map{<\n$message_headers_raw} {[$item]}}

and see that the leading tabs on the follow-on lines have been lost as part of whitespace folding.

This shouldn't have false positives but leaves you with some extra whitespace in the form of blank lines:

    ${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}

So this gets you all the actual headers:

    ${filter{<\n${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}}{!eq{$item}{}}}

Or this for just the X-* headers:

    ${filter{<\n${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}}{match{$item}{\N^(?i)X-\N}}}

Regards,
-Phil
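To see the continuation-line problem and the substitution-based fix outside of Exim, here is a minimal Python sketch. It assumes Python's `re` module behaves like PCRE for these particular patterns; the sample header block and the function names are made up for illustration and are not part of Exim.

```python
import re

# Hypothetical sample header block; the line after Received is a folded continuation.
RAW_HEADERS = "X-Spam-Level: 7\nReceived: here\n\tand now\nSubject: Test\n"


def naive_names(raw: str) -> list[str]:
    """Take everything before the first colon of every physical line."""
    return [line.split(":", 1)[0] for line in raw.split("\n") if line]


def header_names(raw: str) -> list[str]:
    """Blank out continuation lines and everything from the colon onwards."""
    stripped = re.sub(r"(?:^\s|\s*:).*$", "", raw, flags=re.MULTILINE)
    # The substitution leaves empty lines behind, so drop them.
    return [name for name in stripped.split("\n") if name]


def x_header_names(raw: str) -> list[str]:
    """Keep only the names that start with X-, case-insensitively."""
    return [n for n in header_names(raw) if re.match(r"x-", n, re.IGNORECASE)]
```

The naive split reports the continuation line as if it were a header name, while the substitution drops it; the list comprehension in `header_names` plays the role of the `!eq{$item}{}` filter.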
|
|
Re: Remove header lines matching a specific pattern?

Hello Phil,

Phil Pennock <exim-users@...> (Mon 13 Jul 2009 15:51:45 CEST):
> On 2009-07-13 at 15:03 +0200, Heiko Schlittermann wrote:
> > Heiko Schlittermann <hs@...> (Mon 13 Jul 2009 14:00:26 CEST):
> > > Hello,
> > >
> > > I'd like to use something like:
> > >
> > >     transport:
> > >         driver = ....
> > >         headers_remove = NX-Spam.*
> > >
> >
> > Hm. Something like
> >
> >     transport:
> >         ...
> >         headers_remove = <\n ${filter \
> >             {<\n ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}}} \
> >             {match{$item}{(?i:^X-)}}}
>
> The ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}} takes no
> account of continuation lines.

Thank you for your response. I missed the fact that $message_headers* just contains the RFC2822 header part of the message, _unaltered_.

> This is somewhat more accurate:
>     ${map{<\n${filter{<\n$message_headers_raw}{match{$item}{^(?i:[a-z_-]+\s*:)}}}} {${extract{1}{:}{$item}}}}
> but still prone to false-positives; the problem is more visible if you
> do:
>     ${map{<\n$message_headers_raw} {[$item]}}
> and see that the leading tabs on the follow-on lines have been lost as
> part of whitespace folding.
>
> This shouldn't have false positives but leaves you with some extra
> whitespace in the form of blank lines:
>     ${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}

Now I implemented it like this:

    MESSAGE_HEADERS = \
        # replace "\n" with the default list separator ":"
        # and lowercase everything
        ${lc:${sg\
            # fix continuation lines and extract the header names
            # replace every "\n" followed by a number of whitespaces
            # with a single whitespace
            {${map\
                { <\n ${sg {$message_headers_raw}{\N\n\s+\N}{ }} }\
                {${extract{1}{:}{$item}}}\
            }\
            }\
            {\N\n\N}\
            {:}\
        }}

Later in the ACL (I did it in the ACL because we have several processing policies and it seems more straightforward to me to do it there):

    acl_check_data:
        warn set acl_m_headers_remove \
                = ${filter {MESSAGE_HEADERS}{match{$item}{\N^x-(?:ius|hh)\N}}}
             log_message = DEBUG-X: $acl_m_headers_remove
        ...

And all transports look similar to this one:

    lmtp:
        driver = smtp
        ...
        headers_remove = $acl_m_headers_remove

(Would be great to have the counterpart of add_header in the ACL ...)

> So this gets you all the actual headers:
>     ${filter{<\n${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}}{!eq{$item}{}}}
  ~~
Thank you for this. Didn't know it. But now I found the reference in perlre(1).

Thanks and best regards from Dresden/Germany
Heiko Schlittermann
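The macro-plus-ACL pipeline above can be approximated in Python as a minimal sketch: unfold continuation lines, keep the text before the first colon, lowercase it, then filter by prefix. The sample headers and the function names are hypothetical; only the two regular expressions are taken from the post.

```python
import re

# Hypothetical sample; the X-IUS-/X-HH- names mirror the prefixes used in the ACL.
RAW_HEADERS = (
    "X-IUS-Score: 5\n"
    "Received: from me\n"
    "\tby you\n"
    "X-HH-Flag: yes\n"
    "Subject: nix\n"
)


def message_headers(raw: str) -> list[str]:
    """Unfold continuation lines, then keep the lowercased text before the first colon."""
    unfolded = re.sub(r"\n\s+", " ", raw)
    return [line.split(":", 1)[0].lower() for line in unfolded.split("\n") if line]


def headers_to_remove(raw: str) -> str:
    """Colon-separated list of the matching names, the shape headers_remove takes."""
    wanted = [n for n in message_headers(raw) if re.match(r"x-(?:ius|hh)", n)]
    return ":".join(wanted)
```

Lowercasing before filtering is what lets the ACL pattern stay a plain `^x-(?:ius|hh)` without a case-insensitive flag.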
|
|
Re: Remove header lines matching a specific pattern?

Heiko Schlittermann wrote:
> Hello Phil,
>
> Phil Pennock <exim-users@...> (Mon 13 Jul 2009 15:51:45 CEST):
>> On 2009-07-13 at 15:03 +0200, Heiko Schlittermann wrote:
>>> Heiko Schlittermann <hs@...> (Mon 13 Jul 2009 14:00:26 CEST):
>>>> Hello,
>>>>
>>>> I'd like to use something like:
>>>>
>>>>     transport:
>>>>         driver = ....
>>>>         headers_remove = NX-Spam.*
>>>>
>>> Hm. Something like
>>>
>>>     transport:
>>>         ...
>>>         headers_remove = <\n ${filter \
>>>             {<\n ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}}} \
>>>             {match{$item}{(?i:^X-)}}}
>> The ${map{<\n$message_headers_raw} {${extract{1}{:}{$item}}}} takes no
>> account of continuation lines.
>
> Thank you for your response. I missed the fact, that $message_headers*
> just contains the RFC2822 header part of the message, _unaltered_.
>
>> This is somewhat more accurate:
>>     ${map{<\n${filter{<\n$message_headers_raw}{match{$item}{^(?i:[a-z_-]+\s*:)}}}} {${extract{1}{:}{$item}}}}
>> but still prone to false-positives; the problem is more visible if you
>> do:
>>     ${map{<\n$message_headers_raw} {[$item]}}
>> and see that the leading tabs on the follow-on lines have been lost as
>> part of whitespace folding.
>>
>> This shouldn't have false positives but leaves you with some extra
>> whitespace in the form of blank lines:
>>     ${sg{$message_headers_raw}{\N(?m)(?:^\s|\s*:).*$\N}{}}
>
> Now I implemented it like this:
>
> MESSAGE_HEADERS = \
>     # replace "\n" with the default list separator ":"
>     # and lowercase everything
>     ${lc:${sg\
>         # fix continuation lines and extract the header names
>         # replace every "\n" followed by a number of whitespaces
>         # with a single whitespace
>         {${map\
>             { <\n ${sg {$message_headers_raw}{\N\n\s+\N}{ }} }\
>             {${extract{1}{:}{$item}}}\
>         }\
>         }\
>         {\N\n\N}\
>         {:}\
>     }}

Heiko,

I followed this thread with interest and I'm still a little puzzled by the specific exim syntax, but in terms of regex and just extracting the header names, this perl regex should be more efficient:

    s/:.*?\n(\s+.*?\n)*/:/g

This saves looping through map/extract by getting rid of the unwanted part first.

In exim syntax I'd assume this to be (not tested yet):

    MESSAGE_HEADERS = ${lc:${sg {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}

my .02.

- Karl
|
|
Re: Remove header lines matching a specific pattern?

Hello Karl,

Karl Fischer <exim-users@...> (Mon 13 Jul 2009 22:54:24 CEST):
[ ... ]
> Heiko,
>
> I followed this thread with interest and I'm still a little puzzled with the
> specific exim syntax, but in terms of regex and just extracting the header
> names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
>
> This saves looping through map/extract by getting rid of the unwanted 1st.
>
> In exim syntax I'd assume this to be (not tested yet):
>
> MESSAGE_HEADERS = ${lc:${sg {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}

Just tested it using "exim4 -be" (sorry for the extra long line ...):

    ${lc:${sg {X-Spam-Level: 7\nReceived: here\n and now\nSubject: Test\n}{\N:.*?\n(?:\s+.*?\n)*\N}{:}}}

gives:

    x-spam-level:received:subject:

Another approach does the same:

    ${lc:${sg {X-Spam-Level: 7\nReceived: here\n and now\nSubject: Test\n}{\N(?m)(^\S+:)?.*?\n\N}{\$1}}}

Cosmetic issue for both: the trailing ":".

Thank you for your idea. I was too focused on extract/map and forgot about the art of regexp :)

--
Heiko
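Both substitutions can be replayed in Python on the same test string, as a rough cross-check rather than a statement about Exim itself. This is a minimal sketch assuming Python's `re` matches PCRE for these patterns; the variable names are made up, and the `rstrip` at the end is just one possible way to deal with the trailing colon.

```python
import re

# Same test string as in the "exim4 -be" run.
RAW_HEADERS = "X-Spam-Level: 7\nReceived: here\n and now\nSubject: Test\n"

# Cut away the tail: the value plus any continuation lines collapse into ":".
cut_tail = re.sub(r":.*?\n(?:\s+.*?\n)*", ":", RAW_HEADERS).lower()

# Select the head: keep only "name:" from lines that start a header.
# On continuation lines group 1 does not take part and is replaced by nothing.
keep_head = re.sub(r"(^\S+:)?.*?\n", r"\1", RAW_HEADERS, flags=re.MULTILINE).lower()

# One possible way to get rid of the cosmetic trailing colon.
names = cut_tail.rstrip(":").split(":")
```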
|
|
Re: Remove header lines matching a specific pattern?

On 2009-07-13 at 22:54 +0200, Karl Fischer wrote:
> I followed this thread with interest and I'm still a little puzzled with the
> specific exim syntax, but in terms of regex and just extracting the header
> names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
>
> This saves looping through map/extract by getting rid of the unwanted 1st.

Good point.

However, you're also not stripping out space between the header name and the following colon, which is valid. This email could validly be constructed with:

----------------------------8< cut here >8------------------------------
From: Phil ....
To : Karl ...
Cc : exim-users ....
----------------------------8< cut here >8------------------------------

With a little further optimisation, we get:

    s/(?>\s*:.*?\n)(?>\s+.*?\n)*/:/g

although actually I'm not sure there would be any backtracking needed for your s///g and it's probably only the \s*: that benefits from the protection. (I can't be bothered to benchmark it).

> In exim syntax I'd assume this to be (not tested yet):
>
> MESSAGE_HEADERS = ${lc:${sg {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}

    ${lc:${sg{$message_headers_raw}{\N(?>\s*:.*?\n)(?>\s+.*?\n)*\N}{:}}}
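The effect of the extra `\s*` can be illustrated with a small Python sketch. It deliberately uses plain non-capturing groups instead of the atomic `(?>...)` groups, because Python's `re` accepts those only from version 3.11; that changes backtracking behaviour but not the result on this input. The sample values are shortened and the variable names are hypothetical.

```python
import re

# The example headers, with spaces before two of the colons.
RAW_HEADERS = "From: Phil\nTo : Karl\nCc : exim-users\n"

# Original substitution: whitespace before the colon stays attached to the name.
without_ws = re.sub(r":.*?\n(?:\s+.*?\n)*", ":", RAW_HEADERS).lower()

# With a leading \s* the optional whitespace is swallowed as well.
with_ws = re.sub(r"\s*:.*?\n(?:\s+.*?\n)*", ":", RAW_HEADERS).lower()
```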
|
|
Re: Remove header lines matching a specific pattern?

On 2009-07-13 at 23:20 +0200, Heiko Schlittermann wrote:
> Just tested it using "exim4 -be" (sorry for the extra long line ...):

Put an email into a file and use "exim4 -bem filename" and you'll have $message_headers_raw available to you, using that email to provide the test values.

-Phil
|
|
Re: Remove header lines matching a specific pattern?

Phil Pennock <exim-users@...> (Mon 13 Jul 2009 23:44:15 CEST):
> On 2009-07-13 at 22:54 +0200, Karl Fischer wrote:
> > I followed this thread with interest and I'm still a little puzzled with the
> > specific exim syntax, but in terms of regex and just extracting the header
> > names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
> >
> > This saves looping through map/extract by getting rid of the unwanted 1st.
>
> Good point.
>
> However, you're also not stripping out space between the header name and
> the following colon, which is valid. This email could validly be
> constructed with:
> ----------------------------8< cut here >8------------------------------
> From: Phil ....
> To : Karl ...
> Cc : exim-users ....
> ----------------------------8< cut here >8------------------------------
> With a little further optimisation, we get:
>
>     s/(?>\s*:.*?\n)(?>\s+.*?\n)*/:/g
>
> although actually I'm not sure there would be any backtracking needed
> for your s///g and it's probably only the \s*: that benefits from the
> protection. (I can't be bothered to benchmark it).
>
> > In exim syntax I'd assume this to be (not tested yet):
> >
> > MESSAGE_HEADERS = ${lc:${sg {$message_headers_raw}{\N:.*?\n(\s+.*?\n)*\N}{:}}}
>
>     ${lc:${sg{$message_headers_raw}{\N(?>\s*:.*?\n)(?>\s+.*?\n)*\N}{:}}}

I'm still at my version - instead of cutting away the tail, I'm selecting the head of the logical header line:

    ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}

But I'm not sure about efficiency or readability.

--
Heiko
|
|
Re: Remove header lines matching a specific pattern?

Phil Pennock <exim-users@...> (Mon 13 Jul 2009 23:45:26 CEST):
> On 2009-07-13 at 23:20 +0200, Heiko Schlittermann wrote:
> > Just tested it using "exim4 -be" (sorry for the extra long line ...):
>
> Put an email into a file and use "exim4 -bem filename" and you'll have
> $message_headers_raw available to you, using that email to provide the
> test values.

Good point. Thank you.

--
Heiko
|
|
Re: Remove header lines matching a specific pattern?

On 2009-07-14 at 00:35 +0200, Heiko Schlittermann wrote:
> I'm still at my version - instead of cutting away the tail, I'm
> selecting the head of the logical header line:
>
>     ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}

The \S+(?=\s*): part doesn't do what I think you think it does.

(?=foo) is a zero-width positive lookahead assertion. It matches if and only if followed by foo, but does *not* advance the "current position" past foo.

So X(?=\s*): will match if, after matching X, it can match zero or more spaces and then, immediately after X, match a colon. In the degenerate case of zero spaces, this works. But it won't match when there is space.

    ${lc:${sg {$message_headers_raw}{\N(?m)(?:(^\S+)\s*(:))?.*?\n\N}{\$1\$2}}}

I suggest taking a mail header for your -bem test file and inserting some whitespace for testing purposes.
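The lookahead behaviour described above is easy to reproduce in Python, whose `re` module treats `(?=...)` as a zero-width assertion in the same way. This is a minimal sketch with made-up sample headers and variable names; it only mirrors the two patterns from the post.

```python
import re

# Hypothetical sample with whitespace before one colon, as suggested for the -bem test file.
RAW_HEADERS = "From: hans\nTo : peter\nSubject: nix\n"

# The lookahead checks for optional whitespace but consumes nothing,
# so the literal ":" still has to follow the name directly.
lookahead = re.compile(r"\S+(?=\s*):")
tight = lookahead.match("To: peter")
spaced = lookahead.match("To : peter")

# Lookahead version: the "To :" line loses its name.
with_lookahead = re.sub(
    r"(^\S+(?=\s*):)?.*?\n", r"\1", RAW_HEADERS, flags=re.MULTILINE
).lower()

# Consuming version: name and colon are captured separately, whitespace is skipped.
with_groups = re.sub(
    r"(?:(^\S+)\s*(:))?.*?\n", r"\1\2", RAW_HEADERS, flags=re.MULTILINE
).lower()
```

Because the whole first group is optional, a failed lookahead does not make the substitution fail; it silently drops the line, which is why the missing header is easy to overlook.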
|
|
Re: Remove header lines matching a specific pattern?

Phil Pennock <exim-users@...> (Tue 14 Jul 2009 04:55:19 CEST):
> On 2009-07-14 at 00:35 +0200, Heiko Schlittermann wrote:
> > I'm still at my version - instead of cutting away the tail, I'm
> > selecting the head of the logical header line:
> >
> >     ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}
>
> The \S+(?=\s*): part doesn't do what I think you think it does.
>
> (?=foo) is a zero-width positive lookahead assertion. It matches if and
> only if followed by foo, but does *not* advance the "current position"
> past foo.
>
> So X(?=\s*): will match if, after matching X, it can match zero or more
> spaces and then, immediately after X, match a colon. In the degenerate
> case of zero spaces, this works. But it won't match when there is
> space.
>
>     ${lc:${sg {$message_headers_raw}{\N(?m)(?:(^\S+)\s*(:))?.*?\n\N}{\$1\$2}}}
>
> I suggest taking a mail header for your -bem test file and inserting
> some whitespace for testing purposes.

ohohoho, I just wanted to show you that it does what I think it should do, using my mail file (according to your suggestion yesterday) - and - voila - it doesn't do what I think it should do. I'm missing the "To : fred" header :-( and I don't know why I didn't see it yesterday.

So, exploring regex goes on ;)

Best regards from Dresden/Germany
Heiko Schlittermann
|
|
Re: Remove header lines matching a specific pattern?

Phil Pennock <exim-users@...> (Mon 13 Jul 2009 23:44:15 CEST):
> On 2009-07-13 at 22:54 +0200, Karl Fischer wrote:
> > I followed this thread with interest and I'm still a little puzzled with the
> > specific exim syntax, but in terms of regex and just extracting the header
> > names, this perl regex should be more efficient: s/:.*?\n(\s+.*?\n)*/:/g
> >
> > This saves looping through map/extract by getting rid of the unwanted 1st.
>
> Good point.
>
> However, you're also not stripping out space between the header name and
> the following colon, which is valid. This email could validly be
> constructed with:
> ----------------------------8< cut here >8------------------------------
> From: Phil ....
> To : Karl ...
> Cc : exim-users ....
> ----------------------------8< cut here >8------------------------------

[...] is the list separator and exim strips the whitespace around the ":" anyway, doesn't it?

    |From: hans
    |To : peter
    |Received: from me
    | by you
    | for him
    |Subject: nix

    '${lc:${sg {$message_headers_raw}{\N(?m)(^\S+\s*:)?.*?\n\N}{\$1}}}'

    from:to :received:subject:message-id:date:

So I'd say it's OK for our purpose. (Or the other alternatives, just stripping away the part following the first ":".)

We're hunting the 100% solution just for fun ;)

Best regards from Dresden/Germany
Heiko Schlittermann
|
|
Re: Remove header lines matching a specific pattern?

Heiko Schlittermann <hs@...> (Tue 14 Jul 2009 09:30:24 CEST):
> Phil Pennock <exim-users@...> (Tue 14 Jul 2009 04:55:19 CEST):
> > On 2009-07-14 at 00:35 +0200, Heiko Schlittermann wrote:
> > > I'm still at my version - instead of cutting away the tail, I'm
> > > selecting the head of the logical header line:
> > >
> > >     ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+(?=\s*):)?.*?\n\N}{\$1}}}
> >
> > The \S+(?=\s*): part doesn't do what I think you think it does.
> >
> > (?=foo) is a zero-width positive lookahead assertion. It matches if and
> > only if followed by foo, but does *not* advance the "current position"
> > past foo.
> >
> > So X(?=\s*): will match if, after matching X, it can match zero or more
> > spaces and then, immediately after X, match a colon. In the degenerate
> > case of zero spaces, this works. But it won't match when there is
> > space.
> >
> >     ${lc:${sg {$message_headers_raw}{\N(?m)(?:(^\S+)\s*(:))?.*?\n\N}{\$1\$2}}}
> >
> > I suggest taking a mail header for your -bem test file and inserting
> > some whitespace for testing purposes.
>
> ohohoho, I just wanted to show you that it does what I think it should
> do, using my mail file (according to your suggestion yesterday) - and -
> voila - it doesn't do what I think it should do. I'm missing the
> "To : fred" header :-( and I don't know why I didn't see it yesterday.

The input for exim -bem:

    |From: hans
    |To : peter
    |Received: from me
    | by you
    | for him
    |Subject: nix

The pattern:

    ${lc:${sg {$message_headers_raw}{\N(?m)(^\S+)?\s*(:)?.*?\n\N}{\$1\$2}}}

The result:

    from:to:received:subject:message-id:date:

Best regards from Dresden/Germany
Heiko Schlittermann
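For comparison, the last two patterns from the thread can be run side by side in Python on the same input. This is a minimal sketch assuming Python's `re` behaves like PCRE here; the variable names are hypothetical. It only sees the four headers typed in below, so its output lacks the message-id and date entries that appear in the Exim result.

```python
import re

# The -bem input, without the leading "|" markers.
RAW_HEADERS = (
    "From: hans\n"
    "To : peter\n"
    "Received: from me\n"
    " by you\n"
    " for him\n"
    "Subject: nix\n"
)

# Earlier pattern: whitespace before the colon is part of the captured name.
earlier = re.sub(
    r"(^\S+\s*:)?.*?\n", r"\1", RAW_HEADERS, flags=re.MULTILINE
).lower()

# Final pattern: name and colon are captured separately; on continuation
# lines neither group takes part, so the whole line is replaced by nothing.
final = re.sub(
    r"(^\S+)?\s*(:)?.*?\n", r"\1\2", RAW_HEADERS, flags=re.MULTILINE
).lower()
```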