short log with dcc
by Bokhan Artem-2
Hello.

I want to find out if there is a way (maybe a dirty one) to log the "email_address message-id checksum_type checksum" fields of messages passed through dccm+dccd to a file or syslog, without logging the whole body.

With the help of feedback from users (a "this is spam" button), I want to use this log to find messages that have already been delivered to user mailboxes and mark them with a spam flag.

If there is no standard way, could anybody point me to the best place (maybe variable names) where I could inject my own code? Any other help is also appreciated!

_______________________________________________
DCC mailing list DCC@... http://www.rhyolite.com/mailman/listinfo/dcc
Re: short log with dcc
by Vernon Schryver
> From: Artem Bokhan <aptem@...>

> I want to find out, if there a way (may be dirty one) to log to file or
> syslog "email_address message-id checksum_type checksum" fields of
> messages, passed through dccm+dccd, without logging the whole body?
>
> With help of feedback from users ("this is spam" button) I want to use
> this log to find and mark messages (which are already sent to user
> mailboxes) with spam flag.
>
> If there is no standard way, could anybody point me the best place (may
> be variables names) I could inject my own code into? Any other help is
> also appreciated!

What is the purpose of not logging the entire message body? Are you trying to minimize disk space used for log files or are there privacy issues? Building dccm with `./configure --with-max-log-size=1` would limit log files to 1 KByte of message body.

For a "this is spam" button, I would use something like the "this is not spam; stop greylist" button in the proof-of-concept CGI scripts in the DCC source. That mechanism feeds checksum lines from log files to the dccsight program.

Note that message-IDs are not a reliable key for incoming mail messages. Not only does plenty of spam lack message-ID headers, but so does mail from systems using qmail. If you use dccm+sendmail, your users will see message-ID headers in all mail, but only because sendmail will have added them. Because sendmail adds the message-ID headers after dccm sees the message, they will not be in dccm log files. Note also that sendmail IDs in syslog are mostly distinct from SMTP message-IDs.

Vernon Schryver vjs@...
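To illustrate the point about missing Message-ID headers, here is a minimal Python sketch that checks whether a raw message carries one before anything relies on it as a key. It uses only the standard `email` package. The function name `get_message_id` and the two sample messages are hypothetical and are not part of DCC or of anything posted in this thread.

```python
from email import message_from_string


def get_message_id(raw_message):
    """Return the Message-ID header value, or None if the header is absent."""
    msg = message_from_string(raw_message)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    return value.strip() if value else None


# Hypothetical sample messages, one with and one without a Message-ID.
with_id = "From: a@example.com\nMessage-ID: <123@example.com>\n\nbody\n"
without_id = "From: b@example.com\nSubject: hi\n\nbody\n"

for raw in (with_id, without_id):
    message_id = get_message_id(raw)
    if message_id is None:
        print("no Message-ID header; another key is needed")
    else:
        print("Message-ID:", message_id)
```

A check like this only reports what the filter sees at that moment. If the MTA adds the header later, as described above for sendmail, the filter still sees none.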
Re: short log with dcc
by Bokhan Artem-2
> What is the purpose of not logging the entire message body?
> Are you trying to minimize disk space used for log files or are there privacy issues?
> Building dccm with `./configure --with-max-log-size=1` would
> limit log files to 1 KByte of message body.

The reason is the waste of resources; the servers are quite busy with email traffic. Writing files to disk is expensive (everything is in memory now, with no disk I/O). Writing files into memory and frequently postprocessing them with a script is an alternative, but it does not look elegant and needs more memory.

> For a "this is spam" button, I would use something like the "this is
> not spam; stop greylist" button in proof-of-concept cgi scripts in the
> DCC source. That mechanism feeds checksum lines from log files to the
> dccsight program.

I will look, thanks.

> Note that message-IDs are not a reliable key for incoming mail
> messages. Not only does plenty of spam lack message-ID headers,
> but so does mail from systems using qmail.

I understand that. I did not know about qmail.

> If you use dccm+sendmail,

I use postfix+dccm, and I do not know yet whether postfix writes the message-ID before or after the milter. I do not see any other appropriate keys. Probably I could create one with a milter that runs before dccm. Probably the DCC checksum could be the key itself.

> Vernon Schryver vjs@...

Any advice about where to place a code hook?
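One way to picture the idea of using the checksum itself as the key is a small in-memory index from (checksum type, checksum) to the recipients who received a matching message. This is only a sketch under stated assumptions: the `DeliveryIndex` class, its method names, and the sample checksum strings are invented for illustration, and how the checksums would actually be obtained from dccm is not shown.

```python
from collections import defaultdict


class DeliveryIndex:
    """Hypothetical index of delivered mail keyed by checksum, not Message-ID."""

    def __init__(self):
        # (checksum_type, checksum) -> recipients that received such a message
        self._by_checksum = defaultdict(list)

    def record(self, checksum_type, checksum, recipient):
        """Remember that `recipient` received a message with this checksum."""
        self._by_checksum[(checksum_type, checksum)].append(recipient)

    def recipients_for(self, checksum_type, checksum):
        """Return every recipient recorded for this checksum."""
        return list(self._by_checksum.get((checksum_type, checksum), []))


index = DeliveryIndex()
# Placeholder checksum values; real ones would come from the filter.
index.record("Fuz1", "aaaa1111", "alice@example.com")
index.record("Fuz1", "aaaa1111", "bob@example.com")
index.record("Fuz1", "bbbb2222", "carol@example.com")

# A user presses "this is spam" on a message whose checksum is aaaa1111:
print(index.recipients_for("Fuz1", "aaaa1111"))
```

Keying on the checksum avoids depending on a header that may be missing, at the cost of grouping together every message that shares that checksum.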
Re: short log with dcc
by Vernon Schryver
> From: Bokhan Artem <APTEM@...>

> > Building dccm with `./configure --with-max-log-size=1` would
> > limit log files to 1 KByte of message body.
>
> The reason is the waste of resources, servers are quite busy with email
> traffic.

I don't think you have a local DCC server, and you have not attracted attention by using the public DCC servers for more than 100K msgs/day. Therefore it seems likely that your mail systems are handling fewer than 200K messages per day. 20 years ago 200K msgs/day was a big deal. (I'll spare you war stories of days when computers and networks were 1000 times and more slower.) Today 200K msgs/day is not trivial, but not worth mentioning. I now run spam traps that feed 30K spam/day through sendmail+dccm in about 1% of a cheap computer. If your mail system is quite busy with less than 200K msgs/day, it might pay to look at your other spam filters that use lots of CPU cycles such as DNSBLs, ClamAV, and SpamAssassin.

> Writing files to disk is expensive (all stuff is in memory now, no any
> disk i/o),
> writing files into memory and frequent postprocessing them with script
> is an alternative,
> but it does not look elegant and needs more memory.

If you don't have spare resources to write a 4 KByte log file, then you surely do not have the larger resources needed to fork(), exec(), parse, and run a script. Just creating the u area and the stack for the new process for the script probably involves more than 4 KBytes of I/O (of course generally not to the disk).

It is likely that there is no difference between writing a new log file of 100 bytes and writing a new log file of 4 KBytes, whether you use a memory file system or classic disk. Both will use at most one data block and the same amount of inode and indirect I/O in a classic filesystem. In a journaling filesystem, you are also unlikely to be able to measure a difference between 100 bytes and 4 KBytes.

Yes, I've encountered byte copy issues, bus occupancy, cache thrashing, and other issues. However, they don't apply to the relatively small amounts of data handled even by a busy mail system.

> > If you use dccm+sendmail,
>
> I use postfix+dccm, I do not know yet when postfix writes message-id,
> before or after milter.

How are you using postfix+dccm? The last time I checked, I found the postfix milter interface incompatible with the sendmail milter interface as far as dccm is concerned.

Why not use postfix with dccifd as a before-queue filter? That's the recommended DCC configuration with postfix.

> Any advice about code hook place?

The best thing about open source is that you can read the source and make needed changes. That is also the worst thing about open source. People with much experience try to make as few changes as if the source were secret. One reason is that local changes break the warranty; admit that you've changed the code and you'll find that any and all problems you encounter are blamed on your changes. Another reason is that integrating local changes into the next version, the version after that, and the version after that, and so on is no fun at all after you've done it a few times.

Over the decades, I've accumulated a big box of tools to make it easier to port my improvements to successive versions of other people's programs. However, my most powerful and most often used tool today is resisting the urge to make changes.

I predict that if you do change dccm, then in 6 months or a year from now you or your successor will discard those changes and probably stop using DCC. But of course, few who have not been on the open source merry-go-round for decades see it that way.

Vernon Schryver vjs@...
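The block argument above can be checked with simple arithmetic. The sketch below counts how many data blocks a file of a given size occupies, assuming a 4096-byte filesystem block. That block size is an assumption made for illustration and is not stated in the thread, and the helper name `data_blocks` is hypothetical. Inode and indirect-block I/O are ignored.

```python
import math


def data_blocks(file_size_bytes, block_size_bytes=4096):
    """Number of data blocks a file occupies (assumed fixed block size)."""
    if file_size_bytes <= 0:
        return 0
    return math.ceil(file_size_bytes / block_size_bytes)


# A 100-byte log and a 4 KByte log land in the same number of blocks.
for size in (100, 4096, 4097):
    print(size, "bytes ->", data_blocks(size), "block(s)")
```

Under this assumption a log file only starts costing a second block once it grows past 4096 bytes.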
Re: short log with dcc
by Bokhan Artem-2
Vernon Schryver wrote:

> I don't think you have a local DCC server, and you have not attracted
> attention by using the public DCC servers to more than 100K msgs/day.
> Therefore it seems likely that your mail systems are handling
> fewer than 200K messages per day.

At the moment DCC is used for outgoing traffic only, with a local DCC server. Incoming traffic averages per day are: 10M recipients, 4.5M connections, and 400K messages passed to mailboxes. I do not use the global DCC servers because a commercial filter does the checksum-based filtering job and does it well. But we have a special type of spam aimed only at our users, which is the reason I started the topic.

>> Writing files to disk is expensive (all stuff is in memory now, no any
>> disk i/o),
>> writing files into memory and frequent postprocessing them with script
>> is an alternative,
>> but it does not look elegant and needs more memory.
>
> If you don't have spare resources to write a 4K Byte log file, then you
> surely do not have the larger resources needed to fork(), exec(), parse,
> and run a script.
> Just creating the u area and the stack for the new
> process for the script probably involves more than 4KBytes of I/O (of
> course generally not to the disk).
>
> It is likely that there is no difference between writing a new log file
> of 100 bytes and writing a new log file of 4 KBytes, whether you
> use a memory file system or classic disk.

Writes are buffered, so I believe a short log is about 4k/100=40 times faster.

> How are you using postfix+dccm? That last time I checked, I found
> that the postfix milter interface incompatible with the sendmail milter
> interface as far as dccm is concerned.

With current versions of postfix I have tried a lot of different milters, and they all work as they should. The only difference is that you should always use extended SMTP codes for replies.

> Why not use postfix with dccifd as a before-queue filter? That's
> the recommended DCC configuration with postfix.

A milter is before-queue too. With a milter it is easier to track connections, as all log records for a particular connection always have the same ID (inode name). It is also easier to manage the system because all the other filters are milters too.

> I predict that if you do change dccm, then in 6 months or a year from now you or your successor will discard those changes and probably stop using DCC.

That is not my case, sorry :)
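For a sense of scale, the daily averages quoted above can be converted to per-second rates. The sketch below does only that arithmetic, taking the three daily figures from the message. It assumes traffic is spread evenly over the day, which real mail traffic is not, and the helper name `per_second` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(per_day):
    """Average events per second for a daily total (assumes a uniform rate)."""
    return per_day / SECONDS_PER_DAY


# Daily incoming averages as stated in the message above.
daily = {
    "recipients": 10_000_000,
    "connections": 4_500_000,
    "messages delivered to mailboxes": 400_000,
}

for name, count in daily.items():
    print(f"{name}: {per_second(count):.1f} per second")
```

Peak rates will be higher than these averages, so the figures are best read as a lower bound on what the filters must sustain.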
Re: short log with dcc
by Vernon Schryver
> From: Bokhan Artem <APTEM@...>

> At the moment dcc is used for outgoing traffic only with local dcc server.
> Incoming traffic averages per day are: 10M of recipients, 4.5 M
> connections, 400K messages are passed to mailboxes.
> I do not use global DCC servers because commercial filter does
> checksum-based filtering job and does it well.
> But we have special type of spam oriented only for our users, it is the
> reason I started the topic.

The license on the free version of the DCC software clearly requires that you share the DCC checksums you compute with the rest of the world with these words:

 * This agreement is not applicable to any entity which sells anti-spam
 * solutions to others or provides an anti-spam solution as part of a
 * security solution sold to other entities, or to a private network
 * which employs the DCC or uses data provided by operation of the DCC
 * but does not provide corresponding data to other users.

Because you are not sharing the checksums of the spam sent by your users, you are violating the license on the free DCC source. Please stop using the DCC software.

Vernon Schryver vjs@...
Re: short log with dcc
by Bokhan Artem-2
Vernon Schryver wrote:

> Because you are not sharing the checksums of the spam sent by your
> users, you are violating the license on the free DCC source. Please
> stop using the DCC software.

The system is at the "proof of concept" stage now. And your behavior does not look friendly. Instead of asking me to share checksums, you are asking me to stop using DCC. Probably this mailing list is not a place where people try to help each other. Sorry if I caused inconvenience.