Recovering a busted ferret db?


Recovering a busted ferret db?

by Steven Walter

I think my ferret db is corrupted.  Whenever I do a tag search for a
specific tag, sup loads 2 messages and then hangs.  Actually, it's
using 100% CPU, but I have to kill -9 it; ctrl-c doesn't work.  Is
there any hope of fixing this without losing all my tag information?
--
-Steven Walter <stevenrwalter@...>
_______________________________________________
sup-talk mailing list
sup-talk@...
http://rubyforge.org/mailman/listinfo/sup-talk

Re: Recovering a busted ferret db?

by William Morgan-4

Reformatted excerpts from Steven Walter's message of 2009-10-30:
> I think my ferret db is corrupted.  Whenever I do a tag search for a
> specific tag, sup loads 2 messages and then hangs.  Actually, it's
> using 100% CPU, but I have to kill -9 it; ctrl-c doesn't work.  Is
> there any hope of fixing this without losing all my tag information?

You can certainly move to the Xapian index without losing all of your
tags. sup-dump will output everything precious from your index.

Is it possible you have a very large thread with that label? E.g.
thousands of messages, all replying to each other, from some script?
--
William <wmorgan-sup@...>

Re: Recovering a busted ferret db?

by Steven Walter

On Fri, Oct 30, 2009 at 5:48 PM, William Morgan
<wmorgan-sup@...> wrote:
> Reformatted excerpts from Steven Walter's message of 2009-10-30:
>> I think my ferret db is corrupted.  Whenever I do a tag search for a
>> specific tag, sup loads 2 messages and then hangs.  Actually, it's
>> using 100% CPU, but I have to kill -9 it; ctrl-c doesn't work.  Is
>> there any hope of fixing this without losing all my tag information?
>
> You can certainly move to the Xapian index without losing all of your
> tags. sup-dump will output everything precious from your index.

Looks like sup-dump hangs, too.

> Is it possible you have a very large thread with that label? E.g.
> thousands of messages, all replying to each other, from some script?

Not very likely.  I could believe tens, up to a hundred; 200 at the most.

I am able to Ctrl-C sup-dump when it hangs.  Here's the ruby backtrace:

/var/lib/gems/1.8/gems/sup-0.9/lib/sup/util.rb:206:in `split': Interrupt
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/util.rb:206:in `split_on_commas'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/person.rb:108:in `from_address_list'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/message.rb:104:in `parse_header'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/ferret_index.rb:276:in `build_message'
        from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/ferret_index.rb:256:in `build_message'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/index.rb:150:in `each_message'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/ferret_index.rb:319:in `each_id'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/ferret_index.rb:319:in `map'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/ferret_index.rb:319:in `each_id'
        from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/index.rb:149:in `each_message'
        from /var/lib/gems/1.8/gems/sup-0.9/bin/sup-dump:28

--
-Steven Walter <stevenrwalter@...>

Re: Recovering a busted ferret db?

by William Morgan-4

Reformatted excerpts from Steven Walter's message of 2009-10-30:
> I am able to Ctrl-C sup-dump when it hangs.  Here's the ruby backtrace:
>
> /var/lib/gems/1.8/gems/sup-0.9/lib/sup/util.rb:206:in `split': Interrupt
>         from /var/lib/gems/1.8/gems/sup-0.9/lib/sup/util.rb:206:in `split_on_commas'

It looks like you have an email with some crazy long recipient list
that's triggering worst-case behavior in a regexp. Can you try again
after applying this patch, please? (And I'd be curious how long the
address list was, if you find out what message is triggering this.)

diff --git a/lib/sup/person.rb b/lib/sup/person.rb
index 4b1c80b..dbedc79 100644
--- a/lib/sup/person.rb
+++ b/lib/sup/person.rb
@@ -105,6 +105,10 @@ class Person
 
   def self.from_address_list ss
     return [] if ss.nil?
+    ## #split_on_commas has some bad behavior for long strings. so here we do
+    ## something nasty and just truncate the string at the nearest comma <= 500
+    ## characters.
+    ss = ss[0, ss.rindex(",", 500)] if ss.length > 500
     ss.split_on_commas.map { |s| self.from_address s }
   end
 
--
William <wmorgan-sup@...>
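For anyone trying the truncation from the patch above outside of sup, here is
a minimal stand-alone sketch of the same step. The names truncate_at_comma and
MAX_LEN are made up for illustration and are not part of sup. One detail worth
noting: String#rindex returns nil when there is no comma at or before the
limit, so this sketch falls back to a hard cut in that case; that fallback is
an assumption added here and is not in the patch as posted.

```ruby
# Hypothetical stand-alone version of the truncation step in the patch.
# MAX_LEN mirrors the 500-character limit; truncate_at_comma is an
# illustrative name, not a sup method.
MAX_LEN = 500

def truncate_at_comma(ss, limit = MAX_LEN)
  return ss if ss.length <= limit
  # rindex searches backwards starting at index `limit` and returns nil
  # when no comma exists at or before that position.
  cut = ss.rindex(",", limit)
  # Fall back to a hard cut so that ss[0, nil] is never evaluated.
  ss[0, cut || limit]
end

# Synthetic address list, long enough to exceed the limit.
addrs = (1..100).map { |i| "user#{i}@example.com" }.join(", ")
short = truncate_at_comma(addrs)
puts short.length          # never more than MAX_LEN
puts short.split(",").size # how many addresses survive the cut
```

The cut lands just before a comma, so the last surviving address stays whole,
but every address after the cut is silently dropped.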

Re: Recovering a busted ferret db?

by William Morgan-4

[cc'ing list]

Reformatted excerpts from Steven Walter's message of 2009-11-01:
> Hooray, that worked.  There is an email with an exceptionally long
> recipient list with the tag in question, and I received it around the
> time I first noticed the bad behavior in sup.  The Cc: field is 6878
> bytes long, containing 281 email addresses.

Weird, because I can call split_on_commas on a string that's that size,
with that many commas, in a few milliseconds. Must be something strange
about that particular string that's causing the worst-case behavior.

I'm going to apply the patch, but with a bigger limit.
--
William <wmorgan-sup@...>

Re: Recovering a busted ferret db?

by William Morgan-4

Reformatted excerpts from William Morgan's message of 2009-11-02:
> I'm going to apply the patch, but with a bigger limit.

Actually, strike that. The patch would mean we effectively limit the
number of recipients in a to, cc, or bcc field, which seems crazy. It
would be better to reimplement the regexp as a little state machine, as
irritating as that might be.

Steven, can you privately send me the header that's causing this? You
can obscure it somewhat if you want, just leave the quotes and commas
alone.
--
William <wmorgan-sup@...>
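As a rough illustration of the state machine idea above, the sketch below
splits an address list on commas that fall outside double quotes, walking the
string once so the running time is linear in its length. It is not sup's
split_on_commas; split_addresses is a hypothetical name, and the sketch
assumes that only double quotes and backslash escapes inside them matter
(parenthesised comments, for example, are not handled).

```ruby
# Illustrative sketch only: a single-pass comma splitter that ignores commas
# inside double-quoted display names. Not sup's implementation.
def split_addresses(str)
  parts = []
  current = +""
  in_quotes = false
  escaped = false

  str.each_char do |ch|
    if escaped
      # The character after a backslash inside quotes is taken literally.
      current << ch
      escaped = false
    elsif in_quotes && ch == "\\"
      current << ch
      escaped = true
    elsif ch == '"'
      # Toggle between the quoted and unquoted states.
      in_quotes = !in_quotes
      current << ch
    elsif ch == "," && !in_quotes
      # An unquoted comma ends the current address.
      parts << current.strip
      current = +""
    else
      current << ch
    end
  end

  parts << current.strip
  parts.reject(&:empty?)
end

p split_addresses('"Doe, Jane" <jane@example.com>, bob@example.com')
```

Because each character is visited exactly once and there is no backtracking,
an unusual mix of quotes and commas cannot send it into the kind of worst-case
behavior seen with the regexp.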