On 12/15, Martin Gregorie wrote:
> In that case I'm missing some information: how to write a rule that can
> interpret the value(s) returned by TextCat.
I think you're looking for:
ok_languages en fr de
- http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Plugin_TextCat.html
> Why wouldn't it be sensible to rewrite ok_locales to compare TextCat
> return value(s) against its list of OK codes?
Because that functionality already exists within TextCat?
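As a rough sketch of how that fits together in a site config (the score
value is only an illustrative assumption, and the loadplugin line normally
lives in a .pre file and may already be enabled on your install):

```
# Load the TextCat language-guessing plugin (skip if a .pre file already does).
loadplugin Mail::SpamAssassin::Plugin::TextCat

# Languages considered acceptable for incoming mail.
ok_languages en fr de

# UNWANTED_LANGUAGE_BODY fires when the guessed language is not in ok_languages.
# The score below is an arbitrary example, not a recommended value.
score UNWANTED_LANGUAGE_BODY 2.0
```

So there is no separate rule to write for interpreting TextCat's result: the
plugin does the comparison itself, and you only tune the list and the score.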
> Then why has ok_locales not been fixed already? This is not a criticism,
> just a request for information. Is it something that's difficult to do
> efficiently? I'd imagine that language recognition by looking at codepoint
> values is possible but not necessarily fast or unambiguous.
Because it's not actually broken. That bug should probably be closed,
perhaps after noting ok_locales' limited utility in the documentation.
ok_locales functions by identifying character sets that can only be used
for a specific language. UTF8, Windows-1255, and koi8 are not such
character sets, because they can also be used to write in English.
And, most importantly, as Kevin says here, people *do* use those character
sets to write in English:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4078#c27
Well, it's obvious that people write English in UTF8.
> I've no time ATM and in any case I'm a middling to poor Perl coder. Now,
> if SA was written in C or Java....
I bet you know that's the best way to get better at a language.
--
"If you are not paranoid... you may not be paying attention."
- jimh@..., on an IDPA mailing list
http://www.ChaosReigns.com