Handling large log files

by Nate Hausrath

Hello everyone,

I have a central log server set up in our environment that would
receive around 200-300 MB of messages per day from various devices
(switches, routers, firewalls, etc).  With this volume, logcheck was
able to effectively parse the files and send out a nice email.  Now,
however, the volume has increased to around 3-5 GB per day and will
continue growing as we add more systems.  Unfortunately, the old
logcheck solution now spends hours trying to parse the logs, and even
if it finishes, it will generate an email that is too big to send.

I'm somewhat new to log management, and I've done quite a bit of
googling for solutions.  However, my problem is that I just don't have
enough experience to know what I need.  Should I try to work with
logcheck/logsentry in hopes that I can improve its efficiency more?
Should I use filters on syslog-ng to cut out some of the messages I
don't want to see as they reach the box?

I have also thought that it would be useful to cut out all the
duplicate messages and just simply report on the number of times per
day I see each message.  After this, it seems likely that logcheck
would be able to effectively parse through the remaining logs and
report the items that I need to see (as well as new messages that
could be interesting).

Are there other solutions that would be better suited to log volumes
like this?  Should I look at commercial products?

Any comments/criticisms/suggestions would be greatly appreciated!
Please let me know if I need to provide more information.  Again, my
lack of experience in this area makes me hesitant to make a solid
decision without asking for some guidance first.  I don't want to
spend a lot of time going in one direction, only to find that I was
completely wrong.

Thanks!
Nate
_______________________________________________
firewall-wizards mailing list
firewall-wizards@...
https://listserv.icsalabs.com/mailman/listinfo/firewall-wizards

Re: Handling large log files

by Marcin Antkiewicz

> I have a central log server set up in our environment that would
> receive around 200-300 MB of messages per day from various devices
> (switches, routers, firewalls, etc).  With this volume, logcheck was
> able to effectively parse the files and send out a nice email.  Now,
> however, the volume has increased to around 3-5 GB per day and will
> continue growing as we add more systems.  Unfortunately, the old
> logcheck solution now spends hours trying to parse the logs, and even
> if it finishes, it will generate an email that is too big to send.

Hi Nate,

I will offer a few general suggestions. You should be able to reclaim some
capacity from the system with a review of the overall logging architecture and
your rule/reporting configuration. Use the project's mailing list, and try
to profile your system in the hope of identifying easily addressed bottlenecks.

- A log volume increase of an order of magnitude usually means that the
complexity of the environment quadruples. At this point, a 10% increase
in the size of your environment adds the equivalent of the original log
flow. I assume that, as you add more machines, the environment is getting
more standardized, but you might have to look for a bigger tool.

- Unless you have already done so, I would try to optimize the ruleset.
Make sure that the logs go through as few regular expressions as possible.
With GB/day of text, the cost of the extra evaluations compounds. Following
the same logic, investigate rewriting the most used or the most expensive
rules. Try to squeeze as much capacity from your install as possible.

- Profile the machines: make sure that disk/network IO keeps up, that CPUs
are not running at 100% at all times, etc. This will let you identify
bottlenecks and further extend the life of your current system.

- Scrap the existing reports. Write down the list of requirements and the
nice-to-haves, the scope, and the needed level of detail, and write new
reports (you should be able to reuse most of the original work).

- See if the architecture can be improved. Can you use multiple log
servers? Is there a logical way of segmenting the log traffic (OS to box 1,
db transactions to box 2, etc.)? Post to the project's mailing list; there
should be people who use it for larger installations and are willing/able
to provide specific suggestions.

- Commercial tools should be able to keep up with 2 GB/day without much
effort, but every one will take considerable time to set up and tune. The
vendors will claim that it's a 2-day setup and a week of rule setup, etc.,
but I would consider planning for a quarter-long mid-intensity project. The
end result should be useful dashboards and reports that make sense. I would
set aside at least $20k, but that will be very dependent on your
environment. Some products have reporting/integration plugins costing that
much.
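
One way to read the ruleset advice above is to compile each pattern once, stop at the first rule that matches, and try the most frequently matching rules first. The following is only a Python sketch of that idea, not logcheck's actual rule engine; the rule names, patterns, and sample lines are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical rules: (name, compiled pattern). Compile once, not per line.
RULES = [
    ("ssh_fail", re.compile(r"sshd\[\d+\]: Failed password")),
    ("fw_deny", re.compile(r"%ASA-\d-\d+: Deny")),
    ("link_state", re.compile(r"%LINK-\d-UPDOWN")),
]

def classify(lines, rules):
    """Return (hits per rule, total regex evaluations); first match wins."""
    hits = Counter()
    evaluations = 0
    for line in lines:
        for name, pattern in rules:
            evaluations += 1
            if pattern.search(line):
                hits[name] += 1
                break
    return hits, evaluations

def reorder_by_hits(rules, hits):
    """Put the rules that matched most often first."""
    return sorted(rules, key=lambda rule: hits[rule[0]], reverse=True)

# Made-up sample lines, only to exercise the functions.
sample = [
    "May  5 10:00:01 fw1 %ASA-4-106023: Deny tcp src outside:10.0.0.1/4242",
    "May  5 10:00:02 fw1 %ASA-4-106023: Deny tcp src outside:10.0.0.2/4243",
    "May  5 10:00:03 sw1 %LINK-3-UPDOWN: Interface Gi0/1, changed state to down",
    "May  5 10:00:04 host1 sshd[123]: Failed password for root from 10.0.0.9",
]

hits, before = classify(sample, RULES)
tuned = reorder_by_hits(RULES, hits)
_, after = classify(sample, tuned)
print(f"regex evaluations: {before} before reordering, {after} after")
```

Reordering is only safe when the rules do not overlap; if one line can match several rules, first-match-wins changes which rule gets the hit.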

My team logs 2-3 GB through Splunk, with no performance issues of any kind
(nice box: 8 cores/8 GB RAM). With a bit of careful planning, I expect to
put quite a bit more through it in the near future.

--
Marcin Antkiewicz

Re: Handling large log files

by Paul Melson

On Tue, May 5, 2009 at 6:41 PM, Nate Hausrath <hausrath@...> wrote:

> Hello everyone,
>
> I have a central log server set up in our environment that would
> receive around 200-300 MB of messages per day from various devices
> (switches, routers, firewalls, etc).  With this volume, logcheck was
> able to effectively parse the files and send out a nice email.  Now,
> however, the volume has increased to around 3-5 GB per day and will
> continue growing as we add more systems.  Unfortunately, the old
> logcheck solution now spends hours trying to parse the logs, and even
> if it finishes, it will generate an email that is too big to send.
>
[...]
> Are there other solutions that would be better suited to log volumes
> like this?  Should I look at commercial products?
>
> Any comments/criticisms/suggestions would be greatly appreciated!
> Please let me know if I need to provide more information.  Again, my
> lack of experience in this area makes me hesitant to make a solid
> decision without asking for some guidance first.  I don't want to
> spend a lot of time going in one direction, only to find that I was
> completely wrong.


What are you trying to achieve with your log analysis, as in, what
sort of actions would the review of this daily log report trigger?
Would you want to or should you move to a model where search/analysis
is happening in near-real time instead of once daily?  That's going to
be helpful in knowing what kind of solution you should be looking at.
Also, while it's overpowering your logcheck scripts, 5GB/day of log
data is nothing when you're talking about firewall logs.

PaulM

Re: Handling large log files

by david@lang.hm

On Tue, 5 May 2009, Nate Hausrath wrote:

> Hello everyone,
>
> I have a central log server set up in our environment that would
> receive around 200-300 MB of messages per day from various devices
> (switches, routers, firewalls, etc).  With this volume, logcheck was
> able to effectively parse the files and send out a nice email.  Now,
> however, the volume has increased to around 3-5 GB per day and will
> continue growing as we add more systems.  Unfortunately, the old
> logcheck solution now spends hours trying to parse the logs, and even
> if it finishes, it will generate an email that is too big to send.
>
> I'm somewhat new to log management, and I've done quite a bit of
> googling for solutions.  However, my problem is that I just don't have
> enough experience to know what I need.  Should I try to work with
> logcheck/logsentry in hopes that I can improve its efficiency more?
> Should I use filters on syslog-ng to cut out some of the messages I
> don't want to see as they reach the box?
>
> I have also thought that it would be useful to cut out all the
> duplicate messages and just simply report on the number of times per
> day I see each message.  After this, it seems likely that logcheck
> would be able to effectively parse through the remaining logs and
> report the items that I need to see (as well as new messages that
> could be interesting).
>
> Are there other solutions that would be better suited to log volumes
> like this?  Should I look at commercial products?

I don't like the idea of filtering out messages completely; the number of
times that an otherwise 'uninteresting' message shows up can be significant
(if the number of requests for a web image per day suddenly jumps to 100
times what it was before, that's a significant thing to know)
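
To make the point about counts concrete, here is a small Python sketch that compares today's per-message counts against a baseline and flags large jumps. It is only an illustration; the function name, the thresholds (`ratio`, `min_count`), and the sample counts are assumptions, not anything from the setup described here.

```python
from collections import Counter

def find_spikes(baseline, today, ratio=10.0, min_count=50):
    """Return (message, old, new) for messages whose count grew by >= ratio."""
    spikes = []
    for message, new in today.items():
        old = baseline.get(message, 0)
        # Treat a never-seen message as baseline 1 to avoid dividing by zero.
        if new >= min_count and new / max(old, 1) >= ratio:
            spikes.append((message, old, new))
    return sorted(spikes, key=lambda item: item[2], reverse=True)

# Made-up counts: yesterday's summary versus today's.
baseline = Counter({"GET /logo.png": 1200, "sshd: session opened": 300})
today = Counter({"GET /logo.png": 120000, "sshd: session opened": 310, "new msg": 5})

spikes = find_spikes(baseline, today)
for message, old, new in spikes:
    print(f"{message}: {old} -> {new}")
```

This only works if the 'uninteresting' messages are still counted somewhere, which is the argument against dropping them at the syslog layer.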

the key is to categorize and summarize the data. I have not found a good
commercial tool to do this job (there are good tools for drilling down and
querying the logs), the task of summarizing the data is just too site
specific. I currently get 40-80G of logs per day and have a nightly
process that summarizes them.

I first have a process (perl script) that goes through the logs and splits
them into separate files based on the program name in the logs. Internally
it does a lookup of the program name to a bucket name and then outputs the
message to that bucket (this lets me combine all the mail logs to one
file, no matter which OS they are from and all the different ways that the
mail software identifies itself). for things that I haven't defined a
specific bucket for, I have a bucket called 'other'
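
The bucket-splitting step could look roughly like the following Python sketch. This is not the perl script mentioned above; the syslog line layout, the regex, and the program-to-bucket table are assumptions chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed traditional syslog layout: "Mon DD HH:MM:SS host program[pid]: text"
SYSLOG_RE = re.compile(r"^\w{3}\s+\d+\s[\d:]{8}\s\S+\s([^\s:\[]+)(?:\[\d+\])?:")

# Hypothetical program-name -> bucket table; anything else goes to 'other'.
BUCKETS = {
    "postfix/smtpd": "mail",
    "sendmail": "mail",
    "sm-mta": "mail",
    "sshd": "auth",
    "sudo": "auth",
}

def bucket_for(line):
    """Look up the bucket for one log line, defaulting to 'other'."""
    match = SYSLOG_RE.match(line)
    program = match.group(1) if match else ""
    return BUCKETS.get(program, "other")

def split_logs(lines):
    """Group lines by bucket and count how many landed in each."""
    buckets = defaultdict(list)
    for line in lines:
        buckets[bucket_for(line)].append(line)
    counts = Counter({name: len(items) for name, items in buckets.items()})
    return buckets, counts

# Made-up sample lines.
sample = [
    "May  5 10:00:01 mx1 postfix/smtpd[211]: connect from unknown[10.0.0.5]",
    "May  5 10:00:02 mx2 sendmail[99]: relay=mail.example.com, stat=Sent",
    "May  5 10:00:03 host1 sshd[123]: Failed password for root",
    "May  5 10:00:04 host2 kernel: eth0 link up",
]

buckets, counts = split_logs(sample)
for name, count in counts.most_common():
    print(f"{count:8d}  {name}")
```

The sketch keeps everything in memory to stay short; at tens of GB per day a real version would stream each line to one open file per bucket instead.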

I then run separate processes against each of these buckets to create
summary reports of the information in that bucket. some of these processes
are home-grown scripts, some are log summary scripts that came with
specific programs.

one of the reports is how many log messages there are in each bucket (this
report is generated by my splitlogs program)

for the 'other' bucket, I have a sed line from hell that filters out
'uninteresting' details in the log messages (timestamps, port numbers, etc)
and then run them through a sort|uniq -c |sort -rn to produce a report
that shows how many times a log message that looks like this shows up (the
sed line works hard to collapse similar messages together)
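
A rough Python equivalent of the normalize-then-count pipeline is sketched below. The actual sed expressions are not shown in this thread, so the substitution rules here are assumptions; a real list would be site specific and much longer.

```python
import re
from collections import Counter

# Hypothetical normalization rules, applied in order.
NORMALIZERS = [
    (re.compile(r"^\w{3}\s+\d+\s[\d:]{8}\s"), ""),          # leading timestamp
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),   # IPv4 addresses
    (re.compile(r"\[\d+\]"), "[<pid>]"),                    # process IDs
    (re.compile(r"\bport \d+\b"), "port <n>"),              # port numbers
]

def normalize(line):
    """Strip the details that make otherwise identical messages differ."""
    for pattern, replacement in NORMALIZERS:
        line = pattern.sub(replacement, line)
    return line.strip()

def summarize(lines):
    """Similar in spirit to: sed ... | sort | uniq -c | sort -rn"""
    return Counter(normalize(line) for line in lines).most_common()

# Made-up sample lines.
sample = [
    "May  5 10:00:01 host1 sshd[123]: Failed password for root from 10.0.0.9 port 4242",
    "May  5 10:02:17 host1 sshd[456]: Failed password for root from 10.0.0.7 port 5151",
    "May  5 10:03:00 host1 kernel: eth0 link up",
]

report = summarize(sample)
for message, count in report:
    print(f"{count:8d}  {message}")
```

Here the two failed-password lines collapse into one entry with a count of 2, which is the effect the sed line is after.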

I then have a handful of scripts that assemble e-mails from these reports
(different e-mails reporting on different things going to different
groups). For a lot of the summaries I don't put the entire report in the
e-mail, but instead just do a head -X (X=20-50 in many cases) to show the
most common items.

for example, I have a report that shows all the websites that were hit by
people on the desktop network. I have another report that shows the hits
by desktop -> website. I generate an e-mail showing the top 50 entries in
each of these reports and send it to the folks looking for unusual
activity on the desktop network (it's amazing how accurately a simple
report like this can pinpoint a problem desktop machine)
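
The two desktop reports, cut down to the top entries the way head -X does, might be sketched in Python as follows. The input format (pairs of desktop and website, already extracted from a proxy log) and all names are assumptions for illustration.

```python
from collections import Counter

def top_entries(pairs, limit=50):
    """Return the most-hit sites and the busiest (desktop, site) pairs."""
    pairs = list(pairs)
    by_site = Counter(site for _, site in pairs)
    by_pair = Counter(pairs)
    return by_site.most_common(limit), by_pair.most_common(limit)

def format_report(title, rows):
    """Render rows as 'count  label' lines under a title."""
    lines = [title]
    for key, count in rows:
        label = " -> ".join(key) if isinstance(key, tuple) else key
        lines.append(f"{count:8d}  {label}")
    return "\n".join(lines)

# Made-up hits: one desktop hammering one site stands out immediately.
hits = (
    [("pc-17", "updates.example.com")] * 40
    + [("pc-02", "news.example.org")] * 3
    + [("pc-17", "news.example.org")] * 2
)

sites, desktop_sites = top_entries(hits, limit=2)
print(format_report("Top websites", sites))
print(format_report("Top desktop -> website", desktop_sites))
```

Truncating to the top entries keeps the e-mail small, which addresses the original problem of a report too big to send.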

getting this setup takes a bit of time and tuning, but with a bit of
effort you can quickly knock out a LOT of your messages, and then you
start finding interesting things (machines that are misconfigured and
generating errors on a regular basis, etc). as you fix some of these
problems, the other report goes from an overwhelming tens of thousands of
lines, to a much smaller report. just concentrate on killing the big items
and don't try to deal with the entire report at once (the nightly e-mail
to me shows the top several hundred lines of this report so that I can
work on tuning it. when I can keep up on the tuning it's not unusual for
this to be the entire report)

with this approach (and a reasonably beefy log reporting machine), it
takes about 3-6 hours to generate the report (6 hours being the 80G days)

I have other tools watch the logs in real-time for known bad things (to
generate alerts), and am installing splunk to let me go searching in the
logs when I find something in the reports that I want to investigate
further (with this sort of log volume, just doing a grep through the logs
can take days)

hope this helps.

David Lang

Re: Handling large log files

by Nate Hausrath

First, thanks for the great responses!  Aside from the fact that we
need a beefier system (2x P3 1.4 GHz, 3 GB RAM, RAID-5... ouch), it
looks like I have a lot of work to do.

Also, thanks for providing some idea of the specs I will need to use
for a central log server.  I believe our goal is to have around 300
servers sending logs (most of them should be less chatty than the
current ones).  If you don't mind me asking, roughly how many servers
should I expect to have generate 1 GB of logs?  I realize there really
isn't an accurate answer here, but I'm trying to get a rough ballpark
figure.

> Marcin wrote:
>
> - see if the architecture can be improved. Can you use multiple log
> servers? Is there
> a logical way of segmenting the log traffic - OS to box 1, db
> transactions to box 2, etc.?
> Post to the project's mailing list, there should be people who use it
> for larger installations,
> and willing/able to provide specific suggestions.

I'll see if this is an option.  Along these lines, I'd eventually like
to be able to turn log messages into events and be able to correlate
them with other messages, IDS alerts, etc.  I think that once I
compress the duplicates, and get rid of a lot of noise, I could
forward the results to an OSSIM box and use it for correlation,
alerts, etc.

> Paul wrote:
>
> What are you trying to achieve with your log analysis, as in, what
> sort of actions would the review of this daily log report trigger?
> Would you want to or should you move to a model where search/analysis
> is happening in near-real time instead of once daily?  That's going to
> be helpful in knowing what kind of solution you should be looking at.
> Also, while it's overpowering your logcheck scripts, 5GB/day of log
> data is nothing when you're talking about firewall logs.
>
> PaulM

We are primarily looking for security related events.  Real time
analysis/reporting of events is an eventual goal, but that seems a lot
more difficult to do in some regards.  Initially, I'd like to at least
have a summary I can look at daily (probably along the lines of what
David posted below) and then I could transition to more real-time
analysis.  Does that sound reasonable?

> David wrote:
>
> I don't like the idea of filtering out messages completely, the number of
> times that an otherwise 'uninteresting' message shows up can be significant
> (if the number of requests for a web image per day suddenly jumps to 100
> times what it was before, that's a significant thing to know)

Duly noted.  Thanks!

>
> the key is to categorize and summarize the data. I have not found a good
> commercial tool to do this job (there are good tools for drilling down and
> querying the logs), the task of summarizing the data is just too site
> specific. I currently get 40-80G of logs per day and have a nightly process
> that summarizes them.

This is good to know as well.  I'd like to avoid commercial tools if
possible to save money (although Splunk seems pretty darn useful).

>
> *Solid plan of attack from David*
>

Thanks for all the great information from everyone.  I'll be jumping
into this today!

-Nate

Re: Handling large log files

by Gayathri Swaminathan

Hey Nate,

Have used syslog-ng along with splunk, which improved log review immensely.

Splunk is free for indexing up to 500 MB/day

good luck!
gayathri

Re: Handling large log files

by david@lang.hm

On Wed, 6 May 2009, Nate Hausrath wrote:

> First, thanks for the great responses!  Aside from the fact that we
> need a beefier system (2x P3 1.4 GHz, 3 GB RAM, RAID-5... ouch), it
> looks like I have a lot of work to do.

raid 5 is not necessarily a problem.

one surprise I ran into when configuring my splunk systems is that for
read-only situations, raid 5/6 can be as fast as raid 0, the big overhead
of raid 5/6 is when you are writing data.

so what I do is have the incoming logs written to one disk (pair of
mirrored drives), indexed there, and once all the work is done it gets
copied to the raid6 array, and that array is otherwise read-only

> Also, thanks for providing some idea of the specs I will need to use
> for a central log server.  I believe our goal is to have around 300
> servers sending logs (most of them should be less chatty than the
> current ones).  If you don't mind me asking, roughly how many servers
> should I expect to have generate 1 GB of logs?  I realize there really
> isn't an accurate answer here, but I'm trying to get a rough ballpark
> figure.

this depends so much on your systems that any answer is pretty
meaningless.

in the absence of other information, I would just extrapolate from your
current systems

>> Marcin wrote:
>>
>> - see if the architecture can be improved. Can you use multiple log
>> servers? Is there
>> a logical way of segmenting the log traffic - OS to box 1, db
>> transactions to box 2, etc.?
>> Post to the project's mailing list, there should be people who use it
>> for larger installations,
>> and willing/able to provide specific suggestions.
>
> I'll see if this is an option.  Along these lines, I'd eventually like
> to be able to turn log messages into events and be able to correlate
> them with other messages, IDS alerts, etc.  I think that once I
> compress the duplicates, and get rid of a lot of noise, I could
> forward the results to an OSSIM box and use it for correlation,
> alerts, etc.

this gets a lot harder than you think, but you don't necessarily need to
pre-filter the logs, the correlation engines are going to be doing regex
matching on the logs themselves.

>> David wrote:
>>
>> the key is to categorize and summarize the data. I have not found a good
>> commercial tool to do this job (there are good tools for drilling down and
>> querying the logs), the task of summarizing the data is just too site
>> specific. I currently get 40-80G of logs per day and have a nightly process
>> that summarizes them.
>
> This is good to know as well.  I'd like to avoid commercial tools if
> possible to save money (although Splunk seems pretty darn useful).

you can do everything with free tools, it's just a matter of manpower ;-)

for nightly reports, you can use the plan I listed

for alert generation and event correlation, look at SEC (simple event
correlator)

the part that is hard to do on the cheap is being able to search the logs
efficiently.

if you have an idea of what you are looking for ahead of time, you can
split the logs into different files for different types of events, then
just search the subset of items, but if you don't anticipate things, you
end up needing to do a full-text search through your logs. Postgres does
have good full-text indexing capabilities, but as you grow you will get to
the point where it takes more than one machine to get an answer back in a
reasonable amount of time (just due to the fact that you have so much
index data to search through to find where to go for the real data), and
at that point you need some sort of clustered datastore. those aren't
cheap, (even for the commercial version of postgres), and if you haven't
already figured out how to do this, there is a lot of value in buying one
of the commercial solutions that have that stuff more-or-less figured out
for you.

David Lang

Re: Handling large log files

by Marcus J. Ranum

In case anyone wants 'em, my old USENIX system logging
tutorial and notes are downloadable here:
http://www.ranum.com/security/computer_security/archives/logging-notes.pdf
It's a 100+ page book of everything I know/managed to figure out about
logging.

mjr.
--
Marcus J. Ranum CSO, Tenable Network Security, Inc.
                        http://www.tenablesecurity.com

Re: Handling large log files

by hugh.fraser

Like others have mentioned in previous replies, we've used syslog-ng and
Splunk to manage firewall and switch event logs. But sometimes we've
wanted to detect behaviour or anomalies that can't be done easily with
the tools. For these, I've used SEC (Simple Event Correlator), a perl
script from:

http://kodu.neti.ee/~risto/sec/

During the replacement of our campus network when lots of inter-switch
dependency issues arose, we used it to alert us to switches reporting an
error that hadn't had any problems for the past 5 days, usually
indicating something had happened externally to affect it, or to events
that were new in the past 5 days. We also used it to identify things
like links bouncing (down/up/down within a certain period of time). The
output of SEC was fed back into syslog-ng and represented in Splunk
as "synthetic" events, for which we had special notification and
reporting.
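
The link-bounce check (down/up/down within a certain period) can be illustrated with a short Python sketch. This is not SEC rule syntax and not the configuration described above; the event tuple layout, the link names, and the 60-second window are assumptions.

```python
from collections import defaultdict, deque

def find_flaps(events, window=60):
    """Yield (link, time) when a link goes down, up, down within `window` seconds.

    `events` is an iterable of (timestamp_seconds, link, state) in time order,
    where state is "up" or "down".
    """
    recent = defaultdict(lambda: deque(maxlen=3))  # last three transitions per link
    for timestamp, link, state in events:
        history = recent[link]
        history.append((timestamp, state))
        if len(history) == 3:
            states = [s for _, s in history]
            span = history[-1][0] - history[0][0]
            if states == ["down", "up", "down"] and span <= window:
                yield link, timestamp

# Made-up events: sw1 bounces quickly, sw2 changes state slowly.
events = [
    (0, "sw1:Gi0/1", "down"),
    (10, "sw1:Gi0/1", "up"),
    (25, "sw1:Gi0/1", "down"),
    (30, "sw2:Gi0/4", "down"),
    (500, "sw2:Gi0/4", "up"),
    (900, "sw2:Gi0/4", "down"),
]

flaps = list(find_flaps(events))
for link, when in flaps:
    print(f"link flap on {link} at t={when}s")
```

Each detected flap could then be written back to syslog as a "synthetic" event, in the manner described above.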

The goal of the process was to do exception reporting, allowing us to
collect all the events but only be notified when certain criteria
occurred.
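
As an illustration of this kind of exception reporting, the Python sketch below records every event but reports only those whose key has not been seen for five days. It is a stand-in for the idea, not SEC itself; the event tuple layout and the `key_of` parameter are assumptions.

```python
DAY = 24 * 3600
QUIET_PERIOD = 5 * DAY  # five days, in seconds

def exceptions(events, key_of, quiet_period=QUIET_PERIOD):
    """Yield events whose key was not seen within the quiet period.

    `events` is an iterable of (timestamp_seconds, device, message) in time
    order. Every event updates the state; only the unusual ones are yielded.
    """
    last_seen = {}
    for event in events:
        timestamp = event[0]
        key = key_of(event)
        previous = last_seen.get(key)
        if previous is None or timestamp - previous > quiet_period:
            yield event
        last_seen[key] = timestamp

# Made-up events.
events = [
    (0, "sw1", "fan failure"),
    (1 * DAY, "sw1", "fan failure"),
    (2 * DAY, "sw2", "fan failure"),
    (8 * DAY, "sw1", "power supply"),
]

# Devices that had been quiet for five days before reporting an error.
quiet_devices = list(exceptions(events, key_of=lambda e: e[1]))
# Messages not seen anywhere in the past five days.
new_messages = list(exceptions(events, key_of=lambda e: e[2]))
print(len(quiet_devices), "device exceptions;", len(new_messages), "new messages")
```

With an empty starting state the first occurrence of every key is reported, so a real deployment would probably want to warm the state up from a few days of history first.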

 


Re: Handling large log files

by Nate Hausrath

Thanks for the suggestions.  I'll definitely check out SEC as well.

And thanks to everyone for their input and help.  If I come up with
anything during this process that may be interesting or helpful to
others, I'll be sure to post it somewhere.

-Nate


Re: Handling large log files

by Sai

I have been using rsyslog (as opposed to syslog-ng) and found it to be
quite useful. It is under very active development and the main
developer is REALLY into logs.

sai

On Thu, May 7, 2009 at 12:56 AM,  <hugh.fraser@...> wrote:

> Like others have mentioned in previous replies, we've used syslog-ng and
> Splunk to manage firewall and switch event logs. But sometimes we've
> wanted to detect behaviour or anomalies that can't easily be detected
> with those tools. For these, I've used SEC (Simple Event Correlator), a
> Perl script from:
>
> http://kodu.neti.ee/~risto/sec/
>
> During the replacement of our campus network when lots of inter-switch
> dependency issues arose, we used it to alert us to switches reporting an
> error that hadn't had any problems for the past 5 days, usually
> indicating something had happened externally to affect it, or to events
> that were new in the past 5 days. We also used it to identify things
> like links bouncing (down/up/down within a certain period of time). The
> output of SEC was fed back into syslog-ng and represented in Splunk
> as "synthetic" events, for which we had special notification and
> reporting.
>
> The goal of the process was to do exception reporting, allowing us to
> collect all the events but only be notified when certain criteria
> occurred.
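
The two kinds of exception described above (a link going down/up/down within a short period, and an error from a source that has been quiet for 5 days) can be sketched in Python as follows. This is not SEC's rule syntax or how SEC works internally; the class names, the 60-second window, and the event fields are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class FlapDetector:
    """Flag a link that goes down, up, down within `window` seconds."""

    def __init__(self, window=60.0):
        self.window = window
        # Per link, keep only the three most recent (timestamp, state) pairs.
        self.history = defaultdict(lambda: deque(maxlen=3))

    def observe(self, link, state, timestamp):
        """Record a state change; return True if it completes down/up/down."""
        events = self.history[link]
        events.append((timestamp, state))
        if len(events) < 3:
            return False
        states = [s for _, s in events]
        span = events[-1][0] - events[0][0]
        return states == ["down", "up", "down"] and span <= self.window


class QuietSourceDetector:
    """Flag an event from a source not seen for `quiet_period` seconds."""

    def __init__(self, quiet_period=5 * 86400):
        self.quiet_period = quiet_period
        self.last_seen = {}

    def observe(self, source, timestamp):
        """Return True for a first-ever event or one after a long silence."""
        previous = self.last_seen.get(source)
        self.last_seen[source] = timestamp
        return previous is None or timestamp - previous >= self.quiet_period
```

Whenever either `observe` call returns True, a notification would be raised (or, as in the setup described above, a synthetic event written back to syslog); all other events are only stored.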

Re: Handling large log files

by Gyöngyösi Péter

(Disclaimer: I work for BalaBit, the company behind syslog-ng.)

Nate Hausrath wrote:

> Hello everyone,
>
> I have a central log server set up in our environment that would
> receive around 200-300 MB of messages per day from various devices
> (switches, routers, firewalls, etc).  With this volume, logcheck was
> able to effectively parse the files and send out a nice email.  Now,
> however, the volume has increased to around 3-5 GB per day and will
> continue growing as we add more systems.  Unfortunately, the old
> logcheck solution now spends hours trying to parse the logs, and even
> if it finishes, it will generate an email that is too big to send.
>  
The others have given lots of useful tips about log handling, but if
you're just having performance issues with logcheck, you should have a
look at the db-parser feature in the new syslog-ng 3.0.

The best places to find out more about it are these blog posts:

http://marci.blogs.balabit.com/2009/04/db-parser-high-speed-log-message-parser.html
http://marci.blogs.balabit.com/2009/04/intorduction-to-parser-in-syslog-ng-db.html
http://bazsi.blogs.balabit.com/2008/10/syslog-ng-message-parsing.html

It can handle this kind of traffic on commodity hardware; that is, it can
classify messages based on their contents, filter on that classification,
and store or forward the results. A ready-to-use pattern database converted
from logcheck's regexp list and for Cisco PIX messages can be downloaded
from the website, and it's quite easy to write your own rules (the blog
posts mentioned above contain good examples).
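
To illustrate the classify-then-filter idea in general terms, here is a minimal Python sketch. It is not db-parser's pattern format or matching engine; plain regular expressions stand in for the pattern database, and the rule names and patterns are hypothetical examples.

```python
import re
from collections import Counter

# Hypothetical rules: (class name, pattern). A real rule set would be far
# larger, e.g. one derived from logcheck's ignore lists.
RULES = [
    ("ssh-login-failure", re.compile(r"sshd\[\d+\]: Failed password for ")),
    ("link-state", re.compile(r"%LINK-\d-UPDOWN: ")),
]


def classify(message):
    """Return the name of the first matching rule, or "unknown"."""
    for name, pattern in RULES:
        if pattern.search(message):
            return name
    return "unknown"


def summarize(messages):
    """Count messages per class and collect the unclassified ones."""
    counts = Counter()
    unknown = []
    for message in messages:
        name = classify(message)
        counts[name] += 1
        if name == "unknown":
            unknown.append(message)
    return counts, unknown
```

The daily report then only needs the per-class counts plus the list of unknown messages, which keeps the email small even when the raw volume is several GB.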


Peter
