Re: Spam PDF
bgodette@... wrote:
> John Rudd wrote:
>> Robert Schetterer wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> arni wrote:
>>>> Raymond Myren wrote:
>>>>> Hello,
>>>>>
>>>>> Just today I started receiving spam mails with attached .pdf files
>>>>> with a spam image.
>>>>> Any ideas how to stop this spam type?
>>>>>
>>>>> \raymond
>>>> as i said several times on this maillist now, i've never had any of
>>>> these mails get through, here is how the current ones score:
>>>>
>>>> X-Spam-Status: Yes, score=16.6 required=5.0 tests=BAYES_99,BOTNET,
>>>> BOTNET_NORDNS,DCC_CHECK,DKIM_POLICY_SIGNSOME,HTML_MESSAGE,LOGINHASH1,
>>>> LOGINHASH2,MIME_HTML_MOSTLY,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_PBL,RDNS_NONE
>>>>
>>>> autolearn=no version=3.2.0
>>>> X-Spam-Report: * 5.5 BAYES_99 BODY: Bayesian spam probability is 99
>>>> to 100%
>>>> * [score: 1.0000]
>>>> * 0.1 RDNS_NONE Delivered to trusted network by a host with no rDNS
>>>> * 2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in
>>>> bl.spamcop.net
>>>> * [Blocked - see <http://www.spamcop.net/bl.shtml?85.138.88.254>]
>>>> * 0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
>>>> * [85.138.88.254 listed in zen.spamhaus.org]
>>>> * 3.0 BOTNET Relay might be a spambot or virusbot
>>>> * [botnet0.7,ip=85.138.88.254,nordns]
>>>> * 0.0 DKIM_POLICY_SIGNSOME Domain Keys Identified Mail: policy says
>>>> domain
>>>> * signs some mails
>>>> * 0.0 BOTNET_NORDNS Relay's IP address has no PTR record
>>>> * [botnet_nordns,ip=85.138.88.254]
>>>> * 0.0 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME
>>>> * 0.0 HTML_MESSAGE BODY: HTML included in message
>>>> * 1.5 LOGINHASH2 BODY: mail has been classified as spam @ unknown
>>>> company,
>>>> * Germany
>>>> * 1.5 LOGINHASH1 BODY: mail has been classified as spam @
>>>> LogIn&Solutions
>>>> * AG, Germany
>>>> * 2.2 DCC_CHECK Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
>>>>
>>>> arni
>>>>
>>> you are in a luck,
>>> you are a "late reciever" of that spam, so it was detected
>>> by others before ( look at your headers )
>>> but it wasnt detected by i.e a plain pdf_spam rule/solution
>>> ( like fuzzy_ocr etc )
>>> this is what i am looking for
>> His success didn't depend upon that luck. Even without the LOGINHASH*
>> and DCC_CHECK, or even BAYES, he still had a high enough score to flag
>> it as spam.
>>
> Actually it did, take away the spamtrap fed blackholes (PBL and SPAMCOP)
> and the spamtrap fed BAYES as well and it scores a whopping 3.1 thanks
> to the BOTNET plugin (which is amazing btw). That hit was all from
> late-receiver effect.
>

Actually, it didn't. The assertion is that if someone else hadn't seen this exact message first, then SA wouldn't have caught it.

The PBL (which isn't spamtrap fed; it's collected from ISP published and/or contributed data) would have caught this based upon issues that have nothing at all to do with this message, and most likely nothing at all to do with this current round of spam. It would be based upon the host provider's policy that this host shouldn't send email to the internet.

Similarly, the SPAMCOP listing is most likely not related to _this_ message. It is more likely an ongoing abuse issue, so the fact that the host fed a spamtrap at spamcop at some point in the past does not mean that they were "lucky to catch this message". The odds are that the SPAMCOP listing has nothing to do with this message.

I would make the same characterization of BAYES. You don't have to see a specific message in the past in order for BAYES to catch it. Therefore, you're not depending upon "luckily not being the first person to see a given message".

Just resting upon BAYES, BOTNET, and PBL, you're not "lucky to have caught the message because you're a late receiver". You've caught the message due to a combination of policy, misuse, and historical characteristics of spam in general being used to train your system.
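
The disagreement in this exchange comes down to which rules are left in the sum. The following is only a sketch of that arithmetic, using the per-rule scores from the quoted X-Spam-Report. The helper name `total_without` and the grouping into `REPORT_BASED` are made up for illustration and follow the wording of the posts. The listed per-rule scores add up to 16.7 rather than the 16.6 in the header, presumably because the report shows rounded values.

```python
# Per-rule scores copied from the X-Spam-Report quoted in the thread.
REQUIRED = 5.0

SCORES = {
    "BAYES_99": 5.5,
    "RDNS_NONE": 0.1,
    "RCVD_IN_BL_SPAMCOP_NET": 2.0,
    "RCVD_IN_PBL": 0.9,
    "BOTNET": 3.0,
    "DKIM_POLICY_SIGNSOME": 0.0,
    "BOTNET_NORDNS": 0.0,
    "MIME_HTML_MOSTLY": 0.0,
    "HTML_MESSAGE": 0.0,
    "LOGINHASH2": 1.5,
    "LOGINHASH1": 1.5,
    "DCC_CHECK": 2.2,
}

# Rules that need someone else to have reported this message (assumed grouping).
REPORT_BASED = {"LOGINHASH1", "LOGINHASH2", "DCC_CHECK"}


def total_without(excluded=()):
    """Sum the rule scores, skipping any rule named in `excluded`."""
    return round(sum(v for k, v in SCORES.items() if k not in excluded), 1)


full = total_without()
# "Even without the LOGINHASH* and DCC_CHECK, or even BAYES"
no_reports_no_bayes = total_without(REPORT_BASED | {"BAYES_99"})
# "take away ... PBL and SPAMCOP ... and BAYES as well"
botnet_only = total_without(
    REPORT_BASED | {"BAYES_99", "RCVD_IN_BL_SPAMCOP_NET", "RCVD_IN_PBL"}
)

for label, score in [
    ("all rules", full),
    ("no report-based rules, no BAYES", no_reports_no_bayes),
    ("additionally no SPAMCOP/PBL", botnet_only),
]:
    verdict = "spam" if score >= REQUIRED else "not spam"
    print(f"{label}: {score} -> {verdict}")
```

This prints 16.7, 6.0 and 3.1. Both posters' figures check out: the message stays above the 5.0 threshold without the report-based rules and BAYES (6.0), and drops to 3.1 only once the two blacklist hits are removed as well.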
|
|
Re: Spam PDF
Raymond Dijkxhoorn wrote:
> Hi!
>
>>> Jun 27 14:50:03 vmx80 MailScanner[4491]: Message l5RCnxP8019756 from
>>> 212.127.254.149 (idqct@...) to quicknet.nl is spam,
>>> SpamAssassin (not cached, score=24.191, required 5, BAYES_50 0.00,
>>> BODY_EMPTY 0.50, GMD_PDF_BAD_FUZZY 20.00, GMD_PDF_HORIZ 0.25,
>>> GMD_PDF_STOX
>>> 1.00, PROLO_NO_URI 0.01, RCVD_IN_WHOIS_BOGONS 2.43)
>
>> Where did those GMD rules come from?
>
> Will be announced lateron.
>

Until it's publicly released, you can request it with a simple email to us; see http://www.rulesemporium.com/plugins.htm#pdfinfo

Do not reply here, as I only receive the digest, and I expect that subject hardcoded so I can filter properly ;)
|
|
Re: Spam PDF
Raymond Myren wrote:
> Just today I started receiving spam mails with attached .pdf files with
> a spam image.
> Any ideas how to stop this spam type?

Nothing, yet. But since these appear to be an image file encapsulated in a .pdf, it may be possible to get FuzzyOCR to parse them for spam text.

--
-John Thompson (john@...)
Appleton WI USA
|
|
Re: Spam PDF
> Actually, it didn't. The assertion is that if someone else hadn't seen
> this exact message first, then SA wouldn't have caught it.

No, the assertion is that if someone else hadn't seen prior abuse from the sending host first (not this exact message), then SA wouldn't have caught that particular message. That assertion happens to be true for the blacklists, and true for BAYES as well since it would have had to have seen headers (since the payload is vastly different) that look like this sending host in the recent past and been told that it was SPAM.

>
> The PBL (which isn't spamtrap fed, it's collected from ISP published
> and/or contributed data) would have caught this based upon issues that
> have nothing at all to do with this message, and most likely nothing at
> all to do with this current round of spam. It would be based upon the
> host provider's policy that this host shouldn't send email to the internet.

Which means, some time, in the past, for whatever reasons that particular IP address did something against someone's policy to end up on that list. The important part being "in the past".

> Similarly, the SPAMCOP listing is most likely not related to _this_
> message. It is more likely an ongoing abuse issue, so the fact that the
> host fed a spamtrap at spamcop at some point in the past does not mean
> that they were "lucky to catch this message". The odds are that the
> SPAMCOP listing has nothing to do with this message.

Spamcop automatically delists IP addresses over time; to be relisted, someone/something has to report new abuse. If you happen to receive the message before anyone has reported the new abuse, well, it won't be listed.

> I would make the same characterization of BAYES. You don't have to see
> a specific message in the past in order for BAYES to catch it.
> Therefore, you're not depending upon "luckily not being the first person
> to see a given message".

Explain how BAYES will have any matching tokens to work on if it's from a fresh, never before seen by your system, zombie and there's no message body other than the attachment? All you have to work with is headers which you've never seen before and MIME boundaries which you've never seen before.

> Just resting upon BAYES, BOTNET, and PBL, you're not "lucky to have
> caught the message because you're a late receiver". You've caught the
> message due to a combination of policy, misuse, and historical
> characteristics of spam in general being used to train your system.

All of which needs prior examples/reporting of messages similar to the one you're trying to detect; that's what "historical characteristics of spam" means.
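
The point about matching tokens can be illustrated with a toy token classifier. This is a deliberately simplified naive-Bayes sketch and is not SpamAssassin's actual BAYES implementation. All token names, counts and function names below are hypothetical. It only shows the general behaviour under discussion: tokens that were never trained on contribute nothing, so a message made up entirely of unseen tokens comes out neutral (0.5), while a single previously seen spammy header token is enough to push the result up.

```python
import math


def token_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimated P(spam | token), or None if the token was never trained on."""
    s = spam_counts.get(token, 0)
    h = ham_counts.get(token, 0)
    if s + h == 0:
        return None
    spam_rate = s / n_spam
    ham_rate = h / n_ham
    p = spam_rate / (spam_rate + ham_rate)
    return min(max(p, 0.01), 0.99)  # clamp so the log-odds stay finite


def classify(tokens, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token probabilities by summing log-odds (toy naive Bayes)."""
    probs = [token_prob(t, spam_counts, ham_counts, n_spam, n_ham) for t in tokens]
    known = [p for p in probs if p is not None]
    if not known:
        return 0.5  # nothing recognised: no evidence either way
    log_odds = sum(math.log(p / (1.0 - p)) for p in known)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical training state: 100 spam and 100 ham messages seen so far.
SPAM_COUNTS = {"x-mailer:bulkblaster": 40, "content-type:application/pdf": 30}
HAM_COUNTS = {"content-type:application/pdf": 30}

# Only never-seen header and MIME boundary tokens: the result stays neutral.
fresh = classify(["received:never-seen-host", "boundary:xyz123"],
                 SPAM_COUNTS, HAM_COUNTS, 100, 100)

# One header token that was previously seen only in spam.
familiar = classify(["x-mailer:bulkblaster", "boundary:xyz123"],
                    SPAM_COUNTS, HAM_COUNTS, 100, 100)
```

Under these assumptions `fresh` is exactly 0.5 and `familiar` is close to 0.99. Whether a real PDF spam contains any previously trained tokens depends on the site's own training data, which is the open question in the thread.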
|
|
Re: Spam PDF
John Thompson wrote:
> Raymond Myren wrote:
>
>> Just today I started receiving spam mails with attached .pdf files with
>> a spam image.
>> Any ideas how to stop this spam type?
>>
>
> Nothing, yet. But since these appear to be an image file encapsulated in
> a .pdf, it may be possible to get FuzzyOCR to parse them for spam text.
>

As was stated earlier...

Until it's publicly released, you can request a solution from SARE with a simple email via the information at http://www.rulesemporium.com/plugins.htm#pdfinfo

--
Dallas Engelken
dallase@...
http://uribl.com
|
|
Re: Spam PDF
arni wrote:
> bgodette@... wrote:
>> Actually it did, take away the spamtrap fed blackholes (PBL and SPAMCOP)
>> and the spamtrap fed BAYES as well and it scores a whopping 3.1 thanks
>> to the BOTNET plugin (which is amazing btw). That hit was all from
>> late-receiver effect.
>>
> That sounds a bit like "if we stopped trying to detect spam, we'd fail
> to catch it"
>

Sounds more like "if we didn't rely on other people to have seen this particular abusive host before us and our learning system to have seen past examples of spam that looks a whole lot like this one from headers alone to detect this particular spam, we'd fail to catch it until we've trained our system and the abusive host has been reported to various lists".

That's what makes policy (e.g. MTA checks, BOTNET) and behavior based detection work as well as it does: it's proactive instead of reactive.
|
|
Re: Spam PDF
bgodette@... wrote:
>> Actually, it didn't. The assertion is that if someone else hadn't seen
>> this exact message first, then SA wouldn't have caught it.
>
> No, the assertion is that if someone else hadn't seen prior abuse from
> the sending host first (not this exact message), then SA wouldn't have
> caught that particular message. That assertion happens to be true for
> the blacklists, and true for BAYES as well since it would have had to
> have seen headers (since the payload is vastly different) that look like
> this sending host in the recent past and been told that it was SPAM.

Your assertion about bayes is not well supported. It might have been flagged by bayes for reasons that have _NOTHING_ to do with the received headers.

>> The PBL (which isn't spamtrap fed, it's collected from ISP published
>> and/or contributed data) would have caught this based upon issues that
>> have nothing at all to do with this message, and most likely nothing at
>> all to do with this current round of spam. It would be based upon the
>> host provider's policy that this host shouldn't send email to the internet.
>
> Which means, some time, in the past, for whatever reasons that
> particular IP address did something against someone's policy to end up
> on that list. The important part being "in the past".

No, it means that the ISP, or possibly net block user, told Spamhaus "it's an end user IP address, and not a mail server". There might be _NO_ previous abuse from that IP address, and they'll still be listed.

The "policy" here is NOT the recipient's policy, the sendering network owner's policy.

>> Similarly, the SPAMCOP listing is most likely not related to _this_
>> message. It is more likely an ongoing abuse issue, so the fact that the
>> host fed a spamtrap at spamcop at some point in the past does not mean
>> that they were "lucky to catch this message". The odds are that the
>> SPAMCOP listing has nothing to do with this message.
>
> Spamcop automatically delists IP addresses over time, to be relisted
> someone/something has to report new abuse. If you happen to receive the
> message before anyone has reported the new abuse, well it won't be listed.

It could have been recent abuse from an entirely different message batch. In other words, maybe that IP sent a standard stock scam yesterday, and today it sent the pdf spam ... and this person was the first one to receive that pdf spam message. No previous recipient of the same message. But they'll still be listed at spamcop.

>> I would make the same characterization of BAYES. You don't have to see
>> a specific message in the past in order for BAYES to catch it.
>> Therefore, you're not depending upon "luckily not being the first person
>> to see a given message".
>
> Explain how BAYES will have any matching tokens to work on if its from a
> fresh, never before seen by your system, zombie and there's no message
> body other than the attachment? All you have to work with is headers
> which you've never seen before and MIME boundaries which you've never
> seen before.

There are more headers than just the received headers. And I honestly don't know whether an attachment's raw data is analyzed by bayes. My assumption is that it is.

>> Just resting upon BAYES, BOTNET, and PBL, you're not "lucky to have
>> caught the message because you're a late receiver". You've caught the
>> message due to a combination of policy, misuse, and historical
>> characteristics of spam in general being used to train your system.
>
> All of which needs prior examples/reporting of messages similar to the
> one you're trying to detect, that's what "historical characteristics of
> spam" means.

BOTNET does _NOT_ need prior reporting. And the prior reporting the PBL requires has nothing to do with abuse. Further, BAYES does not depend upon the received headers. But even if you're right about bayes, your claim that "all of which needs prior..." is at least 2/3 wrong, if not 3/3 wrong.
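
Both blacklists discussed here are DNS-based lists that are queried by the connecting IP address and not by message content, which is why a listing can exist regardless of what the current message contains. A minimal sketch of how such a lookup name is built, using the IP address and zones from the score report quoted earlier in the thread. The function names are made up for illustration, and `is_listed` is a simplification: it needs network access and treats any answer as a listing, whereas a production check should inspect the returned address, since list operators may use special answer codes.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build a DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the zone returns an A record for this IP (simplified)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (not listed) or a lookup failure; this sketch
        # does not distinguish between the two.
        return False


name = dnsbl_query_name("85.138.88.254", "zen.spamhaus.org")
# name == "254.88.138.85.zen.spamhaus.org"
```

The same helper works for `bl.spamcop.net`. The lookup says nothing about why or when the address was listed, which is the part the two posters disagree on.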
|
|
Re: Spam PDF
bgodette@... wrote:

I have no spam that doesn't score at least BAYES_80. BAYES_80 is 3.5 points here and BOTNET is 3 points here, which makes 6.5 in total and a bust. It doesn't have anything to do with being a late receiver, as I receive this spam on a whole lot of addresses and not just one. Please don't tell me you think I'm a late receiver on all of them.

arni
|
|
Re: Spam PDF
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dallas Engelken wrote:
> John Thompson wrote:
>> Raymond Myren wrote:
>>
>>> Just today I started receiving spam mails with attached .pdf files with
>>> a spam image.
>>> Any ideas how to stop this spam type?
>>>
>>
>> Nothing, yet. But since these appear to be an image file encapsulated in
>> a .pdf, it may be possible to get FuzzyOCR to parse them for spam text.
>>
>
> As was stated earlier...
>
> Until its publicly released, you can request a solution from SARE with a
> simple email via the information at
> http://www.rulesemporium.com/plugins.htm#pdfinfo
>

i am lucky to report that your rules matched
all pdf spam ( i had 4 ) caught in the past at my servers
good work!

- --
Mit freundlichen Gruessen
Best Regards

Robert Schetterer

https://www.schetterer.org
Germany
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGhBqqfGH2AvR16oERAgAQAJ9oxicM6V+oEounEOTeLFy1z7DhXQCdF+oV
FOpwKaJuhnfGHtLsnQONOqM=
=O0xn
-----END PGP SIGNATURE-----
|
|
Re: Spam PDF
John Rudd wrote:
> The "policy" here is NOT the recipient's policy, the sendering network
> owner's policy.

That was a rather mangled sentence...

The "policy" that is the P in PBL is not the recipient's spam/abuse/etc. policy; it's the sending network owner's policy about who should or shouldn't be allowed to send email out to the internet instead of going through a network-owner controlled mail server.
|
|
Re: Spam PDF
Robert Schetterer wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dallas Engelken wrote:
>
>> John Thompson wrote:
>>
>>> Raymond Myren wrote:
>>>
>>>> Just today I started receiving spam mails with attached .pdf files with
>>>> a spam image.
>>>> Any ideas how to stop this spam type?
>>>>
>>> Nothing, yet. But since these appear to be an image file encapsulated in
>>> a .pdf, it may be possible to get FuzzyOCR to parse them for spam text.
>>>
>> As was stated earlier...
>>
>> Until its publicly released, you can request a solution from SARE with a
>> simple email via the information at
>> http://www.rulesemporium.com/plugins.htm#pdfinfo
>>
> Hi Dallas,
> i am lucky to report that your rules matched
> all pdf spam ( i had 4 ) caught in the past at my servers
> good work!
>

Good, as expected. Thanks for the feedback.

--
Dallas Engelken
dallase@...
http://uribl.com
|
|
Re: Spam PDF
Raymond Myren wrote:
> Just today I started receiving spam mails with attached .pdf files with
> a spam image.
> Any ideas how to stop this spam type?

I was able to decode to plain text using the following commands:

    cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii

Finally, very simple.

Claude
|
|
Re: Spam PDF
Hi!

>> Just today I started receiving spam mails with attached .pdf files with a
>> spam image.
>> Any ideas how to stop this spam type?

> I was able to decode to plain text using the following commands:
>
> cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii

And this scales? :)

Bye,
Raymond.
|
|
Re: Spam PDF
Raymond Dijkxhoorn wrote:
>> I was able to decode to plain text using the following commands:
>>
>> cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii
>
> And this scales? :)

It worked for me on one example of the many similar spam messages I have received. It will probably not work with every one of them. Give it a try and report your own results to us.

Claude
|
|
Re: Spam PDF
Hi Claude,

>>> I was able to decode to plain text using the following commands:
>>>
>>> cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii
>> And this scales? :)

> It worked for me on an example of the many similar SPAM messages I have got.
> It will probably not work with any one. Have a try and report us about your
> own results.

No, I tested acroread, but it's not exactly a lightweight tool for doing these conversions. You might almost as well open the PDFs and filter them manually ;)

If you get a couple of thousand an hour, like we do now, it isn't fun with acroread.

Bye,
Raymond.
|
|
Re: Spam PDF
>>> I was able to decode to plain text using the following commands:
>>>
>>> cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii

There are two forms of these PDF spams. The first ones had plain text and looked very professional. The second wave is image spam wrapped in a PDF, and has all the usual ugly spammer tricks in the image to try to make it unreadable by spam tools. What it mostly does is make it unreadable by people, of course.

Loren
|
|
Re: Spam PDF
Just another command sequence which worked well on a file containing an image too:

    gs -sOutputFile=hugo -sDEVICE=pnmraw -dNOPAUSE -dBATCH -r600x600 hugo.pdf
    cat hugo | pamthreshold -simple -threshold 0.5 | pamtopnm | ocrad --format=utf8

This could be a base for another prep and scanset for FuzzyOcr.

Just some ideas....

Claude
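
For anyone who wants to try that sequence on more than one file, here is a rough sketch that wraps the same two steps so they can be called per attachment. It is an illustration only: the tool names and flags are exactly those from the commands in this post, while the function names and the `timeout` parameter are assumptions added here, and nothing in it has been benchmarked. It passes argument lists instead of a shell string, so file names are never interpreted by a shell.

```python
import subprocess


def rasterize_cmd(pdf_path, raster_path):
    """Ghostscript command from the post: render the PDF to a raw PNM file."""
    return [
        "gs",
        f"-sOutputFile={raster_path}",
        "-sDEVICE=pnmraw",
        "-dNOPAUSE",
        "-dBATCH",
        "-r600x600",
        pdf_path,
    ]


# The OCR stages from the post, in pipeline order.
OCR_STAGES = [
    ["pamthreshold", "-simple", "-threshold", "0.5"],
    ["pamtopnm"],
    ["ocrad", "--format=utf8"],
]


def pdf_to_text(pdf_path, raster_path, timeout=30):
    """Run rasterization, then feed the raster through each OCR stage in turn."""
    subprocess.run(
        rasterize_cmd(pdf_path, raster_path),
        check=True,
        timeout=timeout,
        stdout=subprocess.DEVNULL,
    )
    with open(raster_path, "rb") as fh:
        data = fh.read()
    for stage in OCR_STAGES:
        # Each stage reads the previous stage's output on stdin.
        result = subprocess.run(
            stage, input=data, stdout=subprocess.PIPE, check=True, timeout=timeout
        )
        data = result.stdout
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(pdf_to_text("hugo.pdf", "hugo"))
```

The per-step timeout is there because of the volume concern raised earlier in the thread; a suitable value would have to be found by measurement.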
|
|
Re: Spam PDF
* Raymond Dijkxhoorn <raymond@...>:

> No i tested acroread but its not exactly a lightweight tool to do this
> conversions. You can allmost better open the PDF and filter them manually ;)
>
> If you get a couple of thousand an hour, like we do now, it aint fun with
> acroread.

Why not use pdf2ascii?

--
Ralf Hildebrandt (i.A. des IT-Zentrums)        Ralf.Hildebrandt@...
Charite - Universitätsmedizin Berlin           Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin   Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF                        send no mail to plonk@...
|
|
Re: Spam PDF
On 6/29/2007 1:27 PM, Ralf Hildebrandt wrote:
> * Raymond Dijkxhoorn <raymond@...>:
>
>> No i tested acroread but its not exactly a lightweight tool to do this
>> conversions. You can allmost better open the PDF and filter them manually ;)
>>
>> If you get a couple of thousand an hour, like we do now, it aint fun with
>> acroread.
>
> Why not use pdf2ascii?
>

Why not use PDFinfo?
|
|
Re: Spam PDF
On Fri, 2007-06-29 at 12:58 +0200, Claude Frantz wrote:
> I was able to decode to plain text using the following commands:
>
> cat report.pdf | acroread -toPostScript -level2 -saveVM | ps2ascii
>
> Finally, very simple.

Don't forget to filter escapes, or you might get a .pdf that includes somethin' nasty (like a cd /; rm -rf *). :)

--
- Andy

I myself am made entirely of flaws, stitched together with good intentions.
- Augusten Burroughs
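
One way to act on that warning is to scrub the decoded text before it is logged or displayed anywhere. The following is a small sketch under assumptions, not a complete defence: the function name and the choice of what to strip are made up for illustration. It removes ANSI CSI escape sequences and other control characters while keeping tabs and newlines. It does not neutralize shell metacharacters, so the extracted text should still only ever be handled as data and never passed to a shell.

```python
import re

# ANSI CSI sequences such as "\x1b[31m" (colour changes, cursor movement).
_ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Remaining control characters, except tab (\x09) and newline (\x0a).
_CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize(text):
    """Strip terminal escape sequences and control characters from text."""
    text = _ANSI_CSI.sub("", text)
    return _CONTROL.sub("", text)


cleaned = sanitize("ok\x1b[31mred\x1b[0m\x07\n")
# cleaned == "okred\n"
```

A string like `cd /; rm -rf *` passes through unchanged, which is fine as long as the output is only scanned for spam text and never executed.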