Xapian: Term too long

sup-sync blows up like this:

/home/terotil/src/sup/lib/sup/xapian_index.rb:446:in `replace_document': InvalidArgumentError: Term too long (> 245): Lfwd: =?iso-8859-1?q?tekij=e4n_oikeudet=5d?= (ArgumentError)
x-enigmail-version: 0.92.0.0
content-type: multipart/mixed; boundary="------------010606010007070802040301"
x-virus-scanned: amavisd-new at cc.jyu.fi
x-spam-status: no, hits=-2.373 required=5 tests=[awl=0.226, bayes_00=-2.599
        from /home/terotil/src/sup/lib/sup/xapian_index.rb:446:in `sync_message'
        from /usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /home/terotil/src/sup/lib/sup/xapian_index.rb:363:in `synchronize'
        from /home/terotil/src/sup/lib/sup/xapian_index.rb:440:in `sync_message'
        from /home/terotil/src/sup/lib/sup/xapian_index.rb:92:in `add_message'
        from /home/terotil/src/sup/bin/sup-sync:211
        ...

The relevant part of the problematic mail looks like this:

User-Agent: Debian Thunderbird 1.0.6 (X11/20050802)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: mutikainen@...
Subject: [Fwd: =?ISO-8859-1?Q?tekij=E4n_oikeudet=5D?=
X-Enigmail-Version: 0.92.0.0
Content-Type: multipart/mixed; boundary="------------010606010007070802040301"
X-Virus-Scanned: amavisd-new at cc.jyu.fi
X-Spam-Status: No, hits=-2.373 required=5 tests=[AWL=0.226, BAYES_00=-2.599]
X-Spam-Level:
X-Sorted: Whitelist
Content-Length: 11892

This is how I solved it for myself, for now:

diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index ad45b0e..d3b3e25 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -443,7 +443,11 @@ EOS
         warn "docid underflow, dropping #{m.id.inspect}"
         return
       end
-      @xapian.replace_document docid, doc
+      begin
+        @xapian.replace_document docid, doc
+      rescue StandardError => err
+        warn "Failed to add message #{m.id.inspect} to Xapian index: #{err}"
+      end
     end

     m.labels.each { |l| LabelManager << l }

It looks like lib/sup/xapian_index.rb tries to override Xapian::Document#add_term with a version that is wired to drop terms that are too long.
The catch is that you can't override methods just by including a module. Methods of the including class take precedence over methods of the included module.

terotil@sotka:~$ irb
> class Foo; def bar; :bar; end; end
=> nil
> module Baz; def bar; :baz; end; end
=> nil
> class Foo; include Baz; end
=> Foo
> Foo.new.bar
=> :bar
> Foo.ancestors
=> [Foo, Baz, Object, Kernel]  # Foo before Baz, methods in Foo take priority

It is still Foo#bar being called, not Baz#bar. You need to open up Xapian::Document and then use alias method chaining to override methods. Or you could use tricks like http://coderrr.wordpress.com/2008/10/29/secure-alias-method-chaining/

--
Tero Tilus ## 050 3635 235 ## http://tero.tilus.net/
_______________________________________________
sup-talk mailing list
sup-talk@...
http://rubyforge.org/mailman/listinfo/sup-talk
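
For readers following along, the point above can be sketched as a small runnable script. It reuses the Foo and Baz names from the irb session; the old_bar alias and the two result constants are made up for illustration and are not part of sup.

```ruby
# Stand-in for a class we do not control (mirrors the irb session above).
class Foo
  def bar
    :bar
  end
end

module Baz
  def bar
    :baz
  end
end

# Including the module does not override Foo#bar, because Foo itself
# comes before Baz in the method lookup chain.
class Foo
  include Baz
end
INCLUDED_RESULT = Foo.new.bar

# Alias method chaining: reopen the class, keep the old method under a
# new name (old_bar is a hypothetical name), then redefine the method
# and call the old one from inside it.
class Foo
  alias old_bar bar
  def bar
    [:wrapped, old_bar]
  end
end
CHAINED_RESULT = Foo.new.bar

puts "after include:     #{INCLUDED_RESULT.inspect}"
puts "after alias chain: #{CHAINED_RESULT.inspect}"
```

The redefinition wins because it lives in Foo itself, while the alias keeps the original implementation reachable.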

Re: Xapian: Term too long

Reformatted excerpts from Tero Tilus's message of 2009-10-12:
> Looks like lib/sup/xapian_index.rb tries to override
> Xapian::Document#add_term with a version which is wired to ditch too
> long terms. Only that you can't override methods just by including a
> module. Methods of the including class override methods in included
> module.

Very good point. Thanks!

--
William <wmorgan-sup@...>

[PATCH] xapian: replace DocumentMethods module with plain monkeypatching

---
 lib/sup/xapian_index.rb |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index e1cfe65..c373c17 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -560,7 +560,32 @@ EOS
       raise "Invalid term type #{type}"
     end
   end
+end

 end

+class Xapian::Document
+  def entry
+    Marshal.load data
+  end
+
+  def entry=(x)
+    self.data = Marshal.dump x
+  end
+
+  def index_text text, prefix, weight=1
+    term_generator = Xapian::TermGenerator.new
+    term_generator.stemmer = Xapian::Stem.new(Redwood::XapianIndex::STEM_LANGUAGE)
+    term_generator.document = self
+    term_generator.index_text text, weight, prefix
+  end
+
+  alias old_add_term add_term
+  def add_term term
+    if term.length <= Redwood::XapianIndex::MAX_TERM_LENGTH
+      old_add_term term
+    else
+      warn "dropping excessively long term #{term}"
+    end
+  end
 end
--
1.6.4.2

Re: [PATCH] xapian: replace DocumentMethods module with plain monkeypatching

Disregard this one. (I thought master had already gotten my update-message-state patch.)

Excerpts from Rich Lane's message of Tue Oct 20 01:34:37 -0400 2009:
> ---
>  lib/sup/xapian_index.rb |   25 +++++++++++++++++++++++++
>  1 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
> index e1cfe65..c373c17 100644
> --- a/lib/sup/xapian_index.rb
> +++ b/lib/sup/xapian_index.rb
> @@ -560,7 +560,32 @@ EOS
>        raise "Invalid term type #{type}"
>      end
>    end
> +end
>
>  end
>
> +class Xapian::Document
> +  def entry
> +    Marshal.load data
> +  end
> +
> +  def entry=(x)
> +    self.data = Marshal.dump x
> +  end
> +
> +  def index_text text, prefix, weight=1
> +    term_generator = Xapian::TermGenerator.new
> +    term_generator.stemmer = Xapian::Stem.new(Redwood::XapianIndex::STEM_LANGUAGE)
> +    term_generator.document = self
> +    term_generator.index_text text, weight, prefix
> +  end
> +
> +  alias old_add_term add_term
> +  def add_term term
> +    if term.length <= Redwood::XapianIndex::MAX_TERM_LENGTH
> +      old_add_term term
> +    else
> +      warn "dropping excessively long term #{term}"
> +    end
> +  end
>  end

[PATCH] xapian: replace DocumentMethods module with plain monkeypatching

---
 lib/sup/xapian_index.rb |   47 ++++++++++++++++++++++-------------------------
 1 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/lib/sup/xapian_index.rb b/lib/sup/xapian_index.rb
index ad45b0e..34d67d5 100644
--- a/lib/sup/xapian_index.rb
+++ b/lib/sup/xapian_index.rb
@@ -565,35 +565,32 @@ EOS
       raise "Invalid term type #{type}"
     end
   end
+end

-  module DocumentMethods
-    def entry
-      Marshal.load data
-    end
-
-    def entry=(x)
-      self.data = Marshal.dump x
-    end
+end

-    def index_text text, prefix, weight=1
-      term_generator = Xapian::TermGenerator.new
-      term_generator.stemmer = Xapian::Stem.new(STEM_LANGUAGE)
-      term_generator.document = self
-      term_generator.index_text text, weight, prefix
-    end
+class Xapian::Document
+  def entry
+    Marshal.load data
+  end

-    def add_term term
-      if term.length <= MAX_TERM_LENGTH
-        super term
-      else
-        warn "dropping excessively long term #{term}"
-      end
-    end
+  def entry=(x)
+    self.data = Marshal.dump x
   end
-end

-end
+  def index_text text, prefix, weight=1
+    term_generator = Xapian::TermGenerator.new
+    term_generator.stemmer = Xapian::Stem.new(Redwood::XapianIndex::STEM_LANGUAGE)
+    term_generator.document = self
+    term_generator.index_text text, weight, prefix
+  end

-class Xapian::Document
-  include Redwood::XapianIndex::DocumentMethods
+  alias old_add_term add_term
+  def add_term term
+    if term.length <= Redwood::XapianIndex::MAX_TERM_LENGTH
+      old_add_term term
+    else
+      warn "dropping excessively long term #{term}"
+    end
+  end
 end
--
1.6.4.2
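
As a side note for anyone reading this thread on a newer Ruby: the guard in the patch can be tried out without Xapian installed. The sketch below is not what the patch does. It swaps in Module#prepend (Ruby 2.0 and later) in place of alias chaining, which lets the module-plus-super style of the old DocumentMethods code work. FakeDocument, TermLengthGuard and MAX_TERM_BYTES are hypothetical names. The 245 value is taken from the error message earlier in the thread, and measuring in bytes is an assumption about how the limit is counted. String#length counts characters on Ruby 1.9 and later, so bytesize is used here.

```ruby
# Hypothetical stand-in for Xapian::Document so the sketch runs without Xapian.
class FakeDocument
  attr_reader :terms

  def initialize
    @terms = []
  end

  def add_term(term)
    @terms << term
  end
end

# Assumed limit, taken from the "Term too long (> 245)" error message.
MAX_TERM_BYTES = 245

# Same guard as in the patch, but written as a module. A prepended module
# sits before the class in the lookup chain, so super reaches the
# original add_term.
module TermLengthGuard
  def add_term(term)
    if term.bytesize <= MAX_TERM_BYTES
      super
    else
      warn "dropping excessively long term (#{term.bytesize} bytes)"
    end
  end
end

class FakeDocument
  prepend TermLengthGuard
end

doc = FakeDocument.new
doc.add_term "Lfwd"              # short term, kept
doc.add_term "L" + "x" * 300     # 301 bytes, dropped with a warning
doc.add_term "\u00e4" * 200      # 200 characters but 400 bytes in UTF-8, dropped
puts doc.terms.inspect
```

On the Ruby 1.8 used in this thread, prepend does not exist and String#length already counts bytes, so the alias chaining in the patch is the way to go there.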

Re: [PATCH] xapian: replace DocumentMethods module with plain monkeypatching

Branch xapian-bugfix, merged into next. Thanks!

--
William <wmorgan-sup@...>