does mhonarc do a directory listing?

does mhonarc do a directory listing?

by Jeff Breidenbach

This question is a little esoteric.

I decided to try improving mhonarc's archiving speed with one of those
whiz-bang solid state drives from Intel. Unfortunately, they are
pretty low capacity and I can't fit all the data. So I decided to go
with a hybrid strategy; all new writes go to the SSD, but tons of
files still exist on the rotating rust. The way to do this is a union
mount, and Linux has tons of implementations floating around. I chose one
called aufs. Unfortunately, aufs is not that great when listing
directory contents. I *think* mhonarc is doing a listdir operation as
a safety check to avoid clobbering existing message files. (I tried to
find out via strace, but managed instead to confuse myself). Is this
actually true, and if so, can the directory listing be reasonably
turned off?

$ time ls -U /mnt/rotating-rust | wc -l  # cached
1382438

real 0m2.224s
user 0m1.600s
sys 0m0.900s

$ time ls -U /mnt/whizzy-ssd | wc -l
10099

real 0m0.013s
user 0m0.000s
sys 0m0.010s

$ time ls -U /mnt/the-unholy-union | wc -l
1391390

real 1m10.115s
user 0m0.540s
sys 1m9.430s


Re: does mhonarc do a directory listing?

by Earl Hood

On April 1, 2009 at 21:00, Jeff Breidenbach wrote:

> I *think* mhonarc is doing a listdir operation as
> a safety check to avoid clobbering existing message files. (I tried to
> find out via strace, but managed instead to confuse myself). Is this
> actually true, and if so, can the directory listing be reasonably
> turned off?

Not w/o code modifications.

The directory scan is to determine what the last message number
is so when new messages get added, the message filenames will
continue from the last message number, and as you have already
noted, avoid overwriting any existing files.

One method to avoid the directory scan is to have mhonarc store
what the last message number was in the db.  When mhonarc
initializes itself when called again, it can use the value
from the db (if present) versus scanning the directory.

--ewh
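
A minimal sketch, in Python rather than MHonArc's Perl, of the two approaches described above: scanning the directory for the highest message number versus reading a remembered value. The `msgNNNNN.html` naming pattern, the `last_msg_num.txt` state file, and all function names are assumptions for illustration; this is not MHonArc's actual code or database format.

```python
import os
import re

# Assumed filename pattern for archived messages (illustrative only).
MSG_RE = re.compile(r"^msg(\d+)\.html$")

def last_msg_num_by_scan(archive_dir):
    """Return the highest message number found by listing the directory, or -1."""
    highest = -1
    with os.scandir(archive_dir) as entries:
        for entry in entries:
            match = MSG_RE.match(entry.name)
            if match:
                highest = max(highest, int(match.group(1)))
    return highest

def last_msg_num(archive_dir, state_file="last_msg_num.txt"):
    """Prefer the remembered value; fall back to a full scan if it is missing."""
    path = os.path.join(archive_dir, state_file)
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return last_msg_num_by_scan(archive_dir)

def remember_last_msg_num(archive_dir, num, state_file="last_msg_num.txt"):
    """Record the last message number so the next run can skip the scan."""
    path = os.path.join(archive_dir, state_file)
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(str(num))
    os.replace(tmp, path)  # atomic rename avoids a half-written state file
```

The scan costs time proportional to the number of directory entries, while the remembered value is a single small read. The fallback to scanning keeps existing archives working when no value has been stored yet.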


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

Great, that is a possible performance enhancement for the future, as
very large directories can be slow to read, even when totally cached
by Linux. For the immediate term, I'll talk with the aufs folks to see
if directory reads can get faster.

$ time ls -U > /dev/null   # cached

real 0m1.471s
user 0m0.530s
sys 0m0.940s


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

Ok, I found the code that does the directory listing in mhonarc,  by
searching for the word "max". Amongst other things, this dug up the
comments about the Atari MegaST used for the mhonarc logo.  As an
Atari 520ST user, it rekindled some nostalgic jealousy.

Anyway, it is get_last_msg_num and just for fun I logged how much time
it was spending, using timethis() from the Perl benchmarking module.
About two minutes spread over ~300 iterations of mhonarc on real data.
Now of course this is not a typical system;  I have a large number of
files per directory, and am currently using the aufs union mount
filesystem. Aufs is a little bit slow to return directory listings in
such situations. I've been talking to the aufs folks and there isn't a
trivial fix.

I think an option for mhonarc to avoid the directory listing would be
quite helpful, at least in my situation.

-Jeff
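
For anyone wanting to take a similar measurement outside Perl, here is a rough Python sketch that times a single unsorted directory listing. It is not the `timethis()` instrumentation described above; `time_listing` is a hypothetical helper named only for this example.

```python
import os
import time

def time_listing(path):
    """Return (entry_count, elapsed_seconds) for one unsorted directory listing."""
    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.perf_counter() - start

if __name__ == "__main__":
    count, elapsed = time_listing(".")
    print(f"{count} entries in {elapsed:.3f}s")
```

Like `ls -U`, `os.scandir` does not sort the entries, so the elapsed time mostly reflects how long the filesystem takes to return them.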


Re: does mhonarc do a directory listing?

by Christopher P. Lindsey

> I think an option for mhonarc to avoid the directory listing would be
> quite helpful, at least in my situation.

I would love to have access to the last message number after a MHonArc
run.

I've been working on an SQL-based message store that is used entirely
for searching, displaying indexes, etc.  I still keep a MHonArc archive
of all messages and use its output for the actual message display.

What I would like to do is to maintain a cache of messages.  When a
message is requested it would be fed into MHonArc, the filename would
be returned, and then I could display that file and store it in my
database for future use (until it doesn't get used enough and the cache
gets purged).

It sounds like there are lots of reasons to display and/or store the
filename on exit.

I can work on a patch in the next few days unless Earl tells me to
back off because he's going to do it.  :)

Chris


Re: does mhonarc do a directory listing?

by Jym Dyer

>> I think an option for mhonarc to avoid the directory listing
>> would be quite helpful, at least in my situation.
> I would love to have access to the last message number after
> a MHonArc run.

=v= I handle a similar situation like this:  do a directory-wide
operation, then record the directory's modification date.  Next
time around, check the directory's modification date and skip
the operation if it hasn't changed.  In this situation, that way
you would know that the last message hasn't changed since the
previous operation, so you can skip the directory read.
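
The modification-date check described above might be sketched as follows in Python. The function names are hypothetical, and where the recorded timestamp is stored between runs is left to the caller. The sketch assumes the filesystem updates a directory's modification time whenever an entry is added or removed, which may not hold on every OS or filesystem.

```python
import os

def record_mtime(archive_dir):
    """Return the directory's current mtime, to be stored for the next run."""
    return os.stat(archive_dir).st_mtime_ns

def dir_changed_since(archive_dir, recorded_mtime_ns):
    """True if the mtime differs from the recorded one, or none was recorded."""
    if recorded_mtime_ns is None:
        return True
    return os.stat(archive_dir).st_mtime_ns != recorded_mtime_ns
```

Because the archiver itself adds files to the directory, the timestamp would need to be recorded after each run finishes, not before it starts.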
    <_Jym_>


Re: does mhonarc do a directory listing?

by Earl Hood

On Mon, Apr 6, 2009 at 12:16 PM, Jym Dyer <jym@...> wrote:
> =v= I handle a similar situation like this:  do a directory-wide
> operation, then record the directory's modification date.  Next
> time around, check the directory's modification date and skip
> the operation if it hasn't changed.  In this situation, that way
> you would know that the last message hasn't changed since the
> previous operation, so you can skip the directory read.

Not sure this will work for all OSs.

Modifying code to remember last message number is
straight-forward and independent of any file system that
may be in use.

--ewh


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

Happy Easter and ping... :)

> I can work on a patch in the next few days unless Earl tells me to
> back off because he's going to do it.  :)


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

How can I be most helpful?

-Jeff


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

> Modifying code to remember last message number is
> straight-forward and independent of any file system that
> may be in use.

Ok, to spice things up I'm offering a $300 bounty for whoever writes
the patch that gets accepted into the MHonArc codebase.


Re: does mhonarc do a directory listing?

by Christopher P. Lindsey

> > Modifying code to remember last message number is
> > straight-forward and independent of any file system that
> > may be in use.
>
> Ok, to spice things up I'm offering a $300 bounty for whoever writes
> the patch that gets accepted into the MhonArc codebase.

OK, ok...  I put something together and sent it to you and Earl for review.
I'm not sure about the bounty though since I did say that I'd do this
before.

Chris


Re: does mhonarc do a directory listing?

by Jeff Breidenbach

> OK, ok...  I put something together and sent it to you and Earl for review.
> I'm not sure about the bounty though since I did say that I'd do this
> before.

The patch is awesome and the bounty is yours. Very nice work.

-Jeff