On Fri, Apr 13, 2012 at 11:58 AM, Ashvin Narayanan
<ashvin@...> wrote:
> Thank you for the reply.
>
> I am not using TextCat, so I didn't load the plugin as you suggested. The
> reason is that the following thread:
> http://www.mail-archive.com/users@.../msg69225.html
> leads me to believe that TextCat might not be suitable for my needs, since
> (as quoted within that thread):
>
> "Textcat is not designed to decide what language the email is, but to
> find a set of languages it *might* be. It is very prone to declaring
> extra languages that are not really present due to its design"

That's true. There is no second opinion about that.

> Here is another quote, from the thread:
> http://old.nabble.com/Problems-with-Cyrillic-spam-td32978897.html#a32981171
> It reads:
>
> "ok_locales functions by identifying character sets that can only be used
> for a specific language. UTF8, Windows-1255, and koi8 are not such
> character sets, because they can also be used to write in English."
>
> So, even though 'Default transliteration language' is set to Arabic in
> Gmail, could the character set used still be one of UTF8, Windows-1255 or
> koi8? In that case, ok_locales would fail, correct?

For me, ok_locales does not work at all, even when the TextCat plugin is
enabled. With ok_languages, however, the following rule from the TextCat
plugin gets applied (for Arabic as well):
1.5 BODY_8BITS BODY: Body includes 8 consecutive 8-bit characters
The other rule (UNWANTED_LANGUAGE_BODY) is applied only sometimes.
Still, this plugin is definitely a good one for language-based mail filtering.
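
For reference, here is a minimal sketch of the kind of configuration being
discussed, assuming a stock SpamAssassin 3.x install. The language list and
the score are illustrative values, not my actual settings, and the plugin is
usually loaded from v310.pre rather than local.cf:

```
# Load the TextCat language-guessing plugin (commonly done in v310.pre).
loadplugin Mail::SpamAssassin::Plugin::TextCat

# Languages treated as acceptable by TextCat. Mail guessed to be in any
# other language can trigger UNWANTED_LANGUAGE_BODY.
# "en" only is an assumption for illustration; add "ar" to accept Arabic.
ok_languages en

# Character-set based check, configured separately from ok_languages.
ok_locales en

# Illustrative score only; tune it for your own site.
score UNWANTED_LANGUAGE_BODY 2.5
```

As far as I can tell from the documentation, ok_locales is a core option
that looks at character sets and is not part of TextCat, which may be why
enabling the plugin makes no difference to it.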
Regards,
Swati R