13 Messages
why not doing a test that checks "name"-<email address> pairs

Hi,

I'm pretty new to SpamAssassin, so maybe what I am saying is nonsense, or somebody else has already suggested it, or the test already exists and I just don't know how to configure it. Anyway, here is my question.

I've noticed that some spam messages are not marked as spam by SpamAssassin: their score (about 4.6) is just below the limit I've set (5.0), even though they usually have some hints suggesting they are probably spam. These messages are addressed to many people in my domain, but the names before the email addresses are random. To explain it more clearly, the recipient in the To field is something like "John" <user1@mydomain.com>, and very often the Cc field includes other recipients like "Peter" <user2@mydomain.com>; "Mike" <user3@mydomain.com>; etc. The thing is that the email recipients (user1, user2, user3, ...) are real, they exist in my domain, but the names "Peter", "John", "Mike" have nothing to do with user1, user2, user3; they are picked randomly.

Wouldn't it be interesting to have a test that checks the "user name"-email address pairs according to some settings?

Regards,
Alberto.
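As a rough illustration of the input such a test would work on, here is a minimal Python sketch (not SpamAssassin code) that pulls the "name"/address pairs out of the To and Cc headers with the standard library. The sample message, its sender, and the function name `recipient_pairs` are made up for illustration; real headers separate recipients with commas rather than semicolons.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample message modelled on the description above.
RAW = (
    'From: sender@example.net\n'
    'To: "John" <user1@mydomain.com>\n'
    'Cc: "Peter" <user2@mydomain.com>, "Mike" <user3@mydomain.com>\n'
    'Subject: test\n'
    '\n'
    'body\n'
)

def recipient_pairs(raw_message):
    """Return (display name, lower-cased address) for every To/Cc recipient."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    return [(name, addr.lower()) for name, addr in getaddresses(headers)]

pairs = recipient_pairs(RAW)
for name, addr in pairs:
    print(f"{name!r} -> {addr}")
```

Whatever check is applied afterwards (exact list, fuzzy match, and so on) would start from pairs like these.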
|
|
Re: why not doing a test that checks "name"-<email address> pairs

On Fri, 17 Aug 2007, aag_uk wrote:

> These messages are addressed to many people in my domain but the
> names before the email address are random. To explain it more
> clearly, for example, the recipient in the TO field is something
> like this: "John" <user1@...>. Very often the CC field
> includes other recipients like: "Peter" <user2@...>;
> "Mike" <user3@...>; etc... The thing is that the email
> recipients (user1, user2, user3,...) are real, they exist in my
> domain, but the names "Peter, John, Mike" have nothing to do with
> "user1, user2, user3", they are picked randomly.

(1) Check your MTA options. Some allow you to configure rejection of a message after X invalid recipients are given.

(2) Consider a rule that adds a point if more than X names appear in the To: and/or Cc: headers. Here are mine (20 is the limit):

  describe TO_TOO_MANY To: too many recipients
  header   TO_TOO_MANY To =~ /(?:,[^,]{1,80}){20}/
  score    TO_TOO_MANY 1.50

  describe CC_TOO_MANY Cc: too many recipients
  header   CC_TOO_MANY Cc =~ /(?:,[^,]{1,80}){20}/

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 jhardin@...    FALaholic #11174    pgpk -a jhardin@...
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 A sword is never a killer, it is but a tool in the killer's hands.
                       -- Lucius Annaeus Seneca (Martial) 4BC-65AD
-----------------------------------------------------------------------
 8 days until The 1928th anniversary of the destruction of Pompeii
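To see what the pattern in those two rules actually counts, here is a small Python sketch that applies the same regular expression outside SpamAssassin. The helper names `header_with` and `too_many` and the generated addresses are made up for illustration; the pattern itself is copied from the rules above.

```python
import re

# Same pattern as in the rules: a comma followed by 1-80 non-comma
# characters, repeated 20 times in a row.
TOO_MANY = re.compile(r"(?:,[^,]{1,80}){20}")

def header_with(n):
    """Build a hypothetical header value listing n plain addresses."""
    return ", ".join(f"user{i}@example.com" for i in range(n))

def too_many(header_value):
    """True if the header value would trigger the pattern."""
    return TOO_MANY.search(header_value) is not None

print(too_many(header_with(20)))  # 19 commas
print(too_many(header_with(21)))  # 20 commas
```

With plain addresses the pattern first matches at 21 recipients, since it counts the 20 commas between them. Because it counts commas rather than parsed addresses, a display name containing a comma adds to the count, and a single entry longer than 80 characters interrupts the run of consecutive matches.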
|
|
Re: why not doing a test that checks "name"-<email address> pairs

On Fri, 17 Aug 2007, aag_uk wrote:

> I've noticed that some spam messages not marked as spam by spamassassin (the
> score is lower than the limit I've set: 5.0. Those emails usually have some
> hints that suggest they are probably spam: score about 4.6). These messages
> are addressed to many people in my domain but the names before the email
> address are random. To explain it more clearly, for example, the recipient
> in the TO field is something like this: "John" <user1@...>. Very
> often the CC field includes other recipients like: "Peter"
> <user2@...>; "Mike" <user3@...>; etc... The thing is that
> the email recipients (user1, user2, user3,...) are real, they exist in my
> domain, but the names "Peter, John, Mike" have nothing to do with "user1,
> user2, user3", they are picked randomly. Wouldn't it be interesting to have a
> test that checks the "user name-email address" pairs according to some
> settings?

That's an interesting idea, but it

a) is probably going to be quite resource-intensive;

b) requires LDAP, NIS, etc., so that SpamAssassin can have a clue about your accounts;

c) requires competent fuzzy matching so that, when a user sends mail to "Chris St. Pierre <stpierre@...>", it doesn't flag it as spam because my "real name" is Christopher;

d) is prone to FPs, since it's the clients who add that name, and it could be literally _anything_ ("chris", "some guy", "", etc.) without being spam; and

e) is fairly site-specific and would require a fair amount of configuration.

It might be an interesting plugin, but I think that the kind of scoring I'd be comfortable doing for a plugin like that -- very low -- wouldn't be worth the tradeoff in CPU time, network traffic, etc.

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
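Points c) and d) can be made concrete with a small Python sketch of one possible fuzzy comparison, using the standard library's `difflib`. The function names, the prefix rule, and the 0.8 threshold are all assumptions chosen for illustration, not anything SpamAssassin provides.

```python
import re
from difflib import SequenceMatcher

def tokens(name):
    """Split a name into lower-cased runs of letters."""
    return re.findall(r"[^\W\d_]+", name.lower())

def token_similar(a, b, threshold=0.8):
    """Assumed rule: one token is a prefix (3+ letters) of the other,
    or the two are close according to SequenceMatcher."""
    shorter, longer = sorted((a, b), key=len)
    if len(shorter) >= 3 and longer.startswith(shorter):
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold

def plausible(display, real_name):
    """True if any token of the display name resembles a token of the real name.
    An empty display name carries no evidence, so it is accepted."""
    shown = tokens(display)
    if not shown:
        return True
    return any(token_similar(s, r) for s in shown for r in tokens(real_name))

REAL = "Christopher St. Pierre"
for display in ("Chris St. Pierre", "Cristopher", "John", "", "some guy"):
    print(repr(display), plausible(display, REAL))
```

In this sketch "Chris" passes through the prefix rule and the misspelling "Cristopher" through the similarity ratio, while "John" is rejected. A legitimate "some guy" is rejected as well, which is exactly the false-positive risk described in d).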
|
|
Re: why not doing a test that checks "name"-<email address> pairs

At 13:58 17-08-2007, Chris St. Pierre wrote:
>That's an interesting idea, but it
>
>a) is probably going to be quite resource-intensive;

Not really.

>c) requires competent fuzzy matching so that, when a user sends mail
>to "Chris St. Pierre <stpierre@...>", it doesn't flag it
>as spam because my "real name" is Christopher;

That's the main problem. There are also misspellings, which are difficult to catch.

>d) is prone to FPs, since its the clients who add that name, and it
>could be literally _anything_ ("chris", "some guy", "", etc.) without
>being spam; and

It could be used for negative scoring when the client hits reply to answer your message. That would also let some spam through, though, as some use the real name.

Regards,
-sm
|
|
Re: why not doing a test that checks "name"-<email address> pairs

>> Hi,
>>
>> I'm pretty new to SpamAssassin and maybe what I am saying is nonsense or
>> somebody else has suggested this, or the test already exists but I don't
>> know how to configure it, anyway here is my question.
>>
>> I've noticed that some spam messages not marked as spam by spamassassin (the
>> score is lower than the limit I've set: 5.0. Those emails usually have some
>> hints that suggest they are probably spam: score about 4.6). These messages
>> are addressed to many people in my domain but the names before the email
>> address are random. To explain it more clearly, for example, the recipient
>> in the TO field is something like this: "John" <user1@...>. Very
>> often the CC field includes other recipients like: "Peter"
>> <user2@...>; "Mike" <user3@...>; etc... The thing is that
>> the email recipients (user1, user2, user3,...) are real, they exist in my
>> domain, but the names "Peter, John, Mike" have nothing to do with "user1,
>> user2, user3", they are picked randomly. Wouldn't it be interesting to have a
>> test that checks the "user name-email address" pairs according to some
>> settings?
>>
>> Regards,
>>
>> Alberto.

Hi,

you can do quite a few things to trap mail that is probably rubbish ... but it may be extra work. I use a prefilter, in line with forbidden-attachment and virus scanning, but it could probably be written as a _personal_ plugin.

- I like mail sent to just the plain email address, or in "user" <email> format with the name written exactly as I spell it. I collect mail from some other mailboxes, so of course the rule must know about those other addresses as well.
- For mail sent to my primary address (at a big ISP), I don't like to see another address in the To or Cc.
- The one that really caused work: I don't like mails where my address does not appear in either To or Cc, unless the sender appears in a whitelist. You need to add mailing lists, monthly password reminders from mailing lists, sourceforge addresses, whatnot...

Wolfgang Hamann
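The three personal checks described above could be sketched in Python roughly as follows. This is only one reading of the description, not the actual prefilter: the name, addresses, whitelist entries, and rule labels are all hypothetical placeholders.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical personal settings.
MY_NAME = "Wolfgang Hamann"
PRIMARY = "me@bigisp.example"
MY_ADDRESSES = {PRIMARY, "me@other.example"}
WHITELIST = {"announce@lists.example"}

def personal_checks(raw_message):
    """Return the labels of the personal rules a message trips."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    rcpts = [(name, addr.lower()) for name, addr in getaddresses(headers)]
    addrs = {addr for _, addr in rcpts}
    hits = []
    # 1: display name on one of my addresses is neither empty nor my exact spelling
    if any(a in MY_ADDRESSES and n not in ("", MY_NAME) for n, a in rcpts):
        hits.append("odd_display_name")
    # 2: mail to the primary address also lists some other address
    if PRIMARY in addrs and len(addrs) > 1:
        hits.append("primary_not_alone")
    # 3: none of my addresses in To/Cc, and the sender is not whitelisted
    if not (addrs & MY_ADDRESSES) and sender not in WHITELIST:
        hits.append("not_addressed_to_me")
    return hits

spam = ('From: x@example.net\nTo: "John" <me@bigisp.example>\n'
        'Cc: other@bigisp.example\n\nbody\n')
listmail = ('From: announce@lists.example\nTo: announce@lists.example\n\nbody\n')
stranger = ('From: x@example.net\nTo: someone@example.org\n\nbody\n')

print(personal_checks(spam))
print(personal_checks(listmail))
print(personal_checks(stranger))
```

The whitelist is where the maintenance work mentioned above ends up: every mailing list or automated sender that legitimately omits the recipient from To and Cc has to be added by hand.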
|
|
Re: why not doing a test that checks "name"-<email address> pairs

Thanks for your answer, but the spam I'm trying to identify is not about too many recipients; usually there are only 5 or 6, and they all have correct email addresses. The thing is that some spammers make up the name that goes before the email address (e.g. "John Smith" <peter@mydomain.com>).
|
|
Re: why not doing a test that checks "name"-<email address> pairs

>a) is probably going to be quite resource-intensive;

I don't really know; according to http://www.nabble.com/forum/ViewPost.jtp?post=12207486&framed=y sm-7 says that it shouldn't be.

>b) requires LDAP, NIS, etc., so that SpamAssassin can have a clue
>about your accounts;
>c) requires competent fuzzy matching so that, when a user sends mail
>to "Chris St. Pierre <stpierre@nebrwesleyan.edu>", it doesn't flag it
>as spam because my "real name" is Christopher;
>d) is prone to FPs, since its the clients who add that name, and it
>could be literally _anything_ ("chris", "some guy", "", etc.) without
>being spam; and

My idea was that you could have a list that links each recipient to the possible names that could be used (basically first name, surname and possibly a short name), not necessarily NIS or LDAP. As for fuzzy matching, I think it shouldn't be difficult to do. It's something like what Google does when you misspell something or enter something that is not "usual": it suggests another search and, in my opinion, its guess is usually very good.

>e) is fairly site-specific and would require a fair amount of
>configuration.

Well, maybe if you have thousands of users in your domain and you want to enter the name-recipient links (as I explained in the previous paragraph) for the first time, it will require a lot of work. In my case I have about 100 recipients and from time to time I have to add new ones, so that wouldn't be a problem.

>It might be an interesting plugin, but I think that the kind of
>scoring I'd be comfortable doing for a plugin like that -- very low --
>wouldn't be worth the tradeoff in CPU time, network traffic, etc.

I think it could add a low partial score, but the effect could still be good, because most of the emails I'm talking about are already quite suspicious; they usually match other tests (e.g. BAYES_99, which already adds a pretty high score).
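The list-based idea from this post could look something like the following Python sketch. The names in the table and the 0.5 score are invented for illustration; only the 4.6 and 5.0 figures come from the thread, and none of this is an existing SpamAssassin feature.

```python
# Hypothetical table linking each local recipient to the display names
# considered plausible (first name, surname, full name).
KNOWN_NAMES = {
    "user1@mydomain.com": {"ana", "ruiz", "ana ruiz"},
    "user2@mydomain.com": {"luca", "bianchi", "luca bianchi"},
}
MISMATCH_SCORE = 0.5   # assumed, deliberately low
REQUIRED_SCORE = 5.0   # the limit mentioned in the thread

def name_mismatch_score(pairs):
    """Return the partial score for a list of (display name, address) pairs."""
    for name, addr in pairs:
        allowed = KNOWN_NAMES.get(addr.lower())
        if allowed is None or not name.strip():
            continue  # unknown recipient or no display name: no opinion
        if name.strip().lower() not in allowed:
            return MISMATCH_SCORE
    return 0.0

spam_pairs = [("John", "user1@mydomain.com"), ("Peter", "user2@mydomain.com")]
ham_pairs = [("Ana Ruiz", "user1@mydomain.com"), ("", "user2@mydomain.com")]

other_tests = 4.6  # score from the other rules, as in the original post
total = other_tests + name_mismatch_score(spam_pairs)
print(total >= REQUIRED_SCORE)
```

A low score is enough in this scenario because it only has to push an already suspicious message over the limit. The exact-match lookup is cheap, but it inherits the false-positive problem raised elsewhere in the thread for any name not in the table.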
|
|
Re: why not doing a test that checks "name"-<email address> pairs

Aag_uk wrote on Fri, 17 Aug 2007 23:58:05 -0700 (PDT):

> >b) requires LDAP, NIS, etc., so that SpamAssassin can have a clue
> >about your accounts;
> >c) requires competent fuzzy matching so that, when a user sends mail
> >to "Chris St. Pierre <stpierre@...>", it doesn't flag it
> >as spam because my "real name" is Christopher;
> >d) is prone to FPs, since its the clients who add that name, and it
> >could be literally _anything_ ("chris", "some guy", "", etc.) without
> >being spam; and
>
> My idea was that you could have a list that links each recipient to possible
> names that could be used (basically first name, surname and possibly a short
> name), not necessarily NIS or LDAP. About fuzzy matching I think it shouldn't
> be difficult to do. It's something like what Google does when you misspell
> something or enter something that is not "usual", it suggests you another
> search and, in my opinion, its guess is usually very good.

You don't understand at all. What gets put in the comment is up to the sender. They can put *everything* there and it's legit. You do not control it at all, and you do not send them a reply saying "please change my name in your addressbook to xyz". It can be the name, a part of the name, several parts of the name, reversed parts of the name, a company name in all its variations, an acronym, misspelled, something like "Tony's brother", the email address, quoted or bracketed in several ways, or nothing at all, to name a few. Such a rule would be prone to a huge number of FPs. It may work for you after a lot of work, but not for others. It's not worth it.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
Re: why not doing a test that checks "name"-<email address> pairs

>What gets put in the comment is up to the sender.
>They can put *everything* there and it's legit. You do not control it at all

I know it depends on the sender and everything is legit, but it is also legit if I send an email to somebody talking about the stock market or a certain medicine, and it could score high even though the message is perfectly normal. It's true that you can put whatever you want there, but in practice there are also some restrictions. Let's say, for example, that all my users are Spanish, Italian, Russian...; then it's quite unlikely that somebody tags any of my users with names like Jack, Peter, John (which is the case in 99% of the spam).
|
|
Re: why not doing a test that checks "name"-<email address> pairs

At 23:58 17-08-2007, aag_uk wrote:
> >a) is probably going to be quite resource-intensive;
>
>I don't really know, according to

Compared to all the checks performed on a message, it isn't.

>My idea was that you could have a list that links each recipient to possible
>names that could be used (basically first name, surname and possibly a short
>name), not necessarily NIS or LDAP. About fuzzy matching I think it shouldn't
>be difficult to do. It's something like what Google does when you misspell
>something or enter something that is not "usual", it suggests you another
>search and, in my opinion, its guess is usually very good.

That's not how "names" work in practice. It may take more than a lookup in your system database. It's not difficult, but it requires some work to understand the naming conventions. That may not be possible in a heterogeneous environment. The fuzzy matching is not that easy. Once you get into that, you turn the process into a resource-intensive one.

>well, maybe if you have thousands of users in your domain and you want to
>enter the names-recipient links (as I explained in the previous paragraph)
>for the first time, it will require a lot of work. In my case I have about
>100 recipients and from time to time I have to add new ones; so, that
>wouldn't be a problem.

It's only a name/recipient link if we make an assumption about the "display name". Once this becomes a general rule, it will be circumvented. I already have one case where this rule would have the opposite of the intended effect.

Regards,
-sm
|
|
Re: why not doing a test that checks "name"-<email address> pairs

Aag_uk wrote on Sat, 18 Aug 2007 03:33:49 -0700 (PDT):

> it's quite unlikely that somebody tags any of
> my users

As I said, it may work for you, but it will not work for the majority of SA users. The whole effort and the FPs would not be worth it. If you don't believe that, start coding.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
|
|
Re: why not doing a test that checks "name"-<email address> pairs

Kai Schätzl wrote:
>>
>> You don't understand at all. What gets put in the comment is up to the sender.
>> They can put *everything* there and it's legit. You do not control it at all
>> and you do not send them a reply "please change my name in your addressbook to
>> xyz". It can be the name, a part of the name, several parts of the name,
>> reversed parts of the name, a company name in all its variations, an acronym,
>> misspelled, something like "Tony's brother", the email address, quoted or
>> bracketed in several ways, could be nothing - to name a few. Such a rule
>> would be prone to a huge number of FPs. It may work for you after a lot of
>> work, but not for others. It's not worth it.
>>

While it is up to senders to make up display names, in worthwhile mails I usually see either
- no display name at all
- the name exactly as I spell it (from replies)
- the name parts rearranged from a web form submission

If someone decides to put "Idiot" as a display name, I take the liberty to not read it. Maybe some people really get mail sent to "Daddy" or whatever. As others have pointed out, checking display names is a personal thing ... and it seems to work with the mails I receive.

Wolfgang Hamann
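The three accepted forms listed above are simple enough to check without any fuzzy matching. A minimal Python sketch, assuming a single known spelling of the name (the function names are made up for illustration):

```python
import re

def name_parts(name):
    """Lower-cased words of a name, sorted so that order does not matter."""
    return sorted(re.findall(r"\w+", name.lower()))

def acceptable(display, my_name):
    """Accept an empty display name, the exact spelling, or the same
    parts in a different order (e.g. 'Surname, Firstname')."""
    if display.strip() == "":
        return True
    if display == my_name:
        return True
    return name_parts(display) == name_parts(my_name)

ME = "Wolfgang Hamann"
for display in ("", "Wolfgang Hamann", "Hamann, Wolfgang", "Idiot", "Daddy"):
    print(repr(display), acceptable(display, ME))
```

This stays cheap because it compares whole words only. It is also deliberately personal: it would reject "Daddy" for everyone, which is fine for one mailbox but not as a site-wide rule.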