[Bug 5956] New: html_test('font_face_bad') triggers if font name ends with numeric

by Bugzilla from bugzilla-daemon@bugzilla.spamassassin.org

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5956

           Summary: html_test('font_face_bad') triggers if font name ends
                    with numeric
           Product: Spamassassin
           Version: 3.2.3
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Rules (Eval Tests)
        AssignedTo: dev@...
        ReportedBy: lee@...


Vox.com, a website that sends out private messages encoded in HTML in both
Latin and Japanese scripts, includes the following fragment in its
template:

----
<font style="font: normal 12px arial, helvetica, hirakakupro-w3, osaka, sans-serif;"
    face="arial, helvetica, hirakakupro-w3, osaka, sans-serif" size="2">
----

The appearance of the "3" character in the face list triggers
HTML_FONT_FACE_BAD.

It appears that font names ending in w3, w8, etc. (w meaning weight) are not
uncommon for kana fonts.
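
The behaviour described above can be reproduced with a minimal sketch. The
patterns below are assumptions made for illustration (a letters-only check of
the kind this report implies, and a variant that also accepts digits); they are
not copied from SpamAssassin's eval code, and the names STRICT_FACE,
RELAXED_FACE and face_is_bad are hypothetical.

```python
import re

# Hypothetical letters-only check: every comma-separated name may contain
# only ASCII letters, spaces and hyphens. NOT the real SpamAssassin pattern.
STRICT_FACE = re.compile(
    r"^[a-z][a-z -]*[a-z](?:,\s*[a-z][a-z -]*[a-z])*$", re.I
)

# Same shape, but digits are allowed inside and at the end of a name,
# so weight suffixes such as "-w3" are accepted.
RELAXED_FACE = re.compile(
    r"^[a-z][a-z0-9 -]*[a-z0-9](?:,\s*[a-z][a-z0-9 -]*[a-z0-9])*$", re.I
)


def face_is_bad(face, pattern):
    """Return True when the face list does not match the given pattern."""
    return pattern.match(face) is None


# The face attribute value quoted in this report.
vox_face = "arial, helvetica, hirakakupro-w3, osaka, sans-serif"

print(face_is_bad(vox_face, STRICT_FACE))   # True: the trailing "3" is rejected
print(face_is_bad(vox_face, RELAXED_FACE))  # False: digits are tolerated
```

Allowing digits is only one possible refinement; it would still reject names
written in non-Latin scripts.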


--
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 5956] html_test('font_face_bad') triggers if font name ends with numeric

by Bugzilla from bugzilla-daemon@bugzilla.spamassassin.org

--- Comment #1 from Cedric Knight <cedric@...>  2009-07-03 12:23:36 PST ---
Comment: this rule has a poor hit ratio anyway.  Quoting a June report:
HTML_FONT_FACE_BAD:  bad, avg S/O=0.70 avg Spam%=0.36 avg Ham%=0.15
but 70% of the hits I get are ham.

Some typeface names contain spaces, and some East Asian ones look like this in
the raw quoted-printable source:
face=3D=CB=CE=CC=E5

and then there are mistakes in ham like:
face=3D"arial, helvetica, sans-serif;"
face=3D"#6f6f6f"
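
The samples above are quoted-printable, where "=3D" stands for a literal "=".
As a rough sketch of what the first one contains, the bytes can be decoded with
Python's standard quopri module. The GB2312 charset is an assumption here, since
the message's declared charset is not given in this comment.

```python
import quopri

# Raw quoted-printable sample from the comment above.
raw = b"face=3D=CB=CE=CC=E5"

# Undo the quoted-printable layer: "=3D" becomes "=", "=CB" becomes byte 0xCB.
decoded = quopri.decodestring(raw)

# Everything after the first "=" is the font name, still as raw bytes.
name_bytes = decoded.split(b"=", 1)[1]

# ASSUMPTION: the message body is GB2312-encoded.
print(name_bytes.decode("gb2312"))
```

Under that assumption the name is a two-character Chinese typeface name with no
ASCII letters at all, so any [a-z]-only check would reject it.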

If the rule is refined, maybe it should cover these cases.  The HTML 4.0 spec
http://www.w3.org/TR/html4/present/graphics.html#h-15.2.2 seems to say that any
character in the document character set can be used, not just [a-z], but spaces
need to be trimmed.
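
One possible reading of that suggestion, as a minimal sketch: split the face
value on commas, trim each name, accept any characters, and only flag names
that look like the mistakes listed above. The heuristics (a leading "#", a
trailing ";", an empty name) and the function names are hypothetical, not part
of SpamAssassin.

```python
def split_faces(face):
    """Split a face attribute on commas and trim whitespace around each name."""
    return [name.strip() for name in face.split(",")]


def suspicious_names(face):
    """Return the names that look like authoring mistakes, not typefaces."""
    flagged = []
    for name in split_faces(face):
        # Hypothetical heuristics based on the ham examples in this comment:
        # empty entries, colour values, and stray CSS-style semicolons.
        if not name or name.startswith("#") or name.endswith(";"):
            flagged.append(name)
    return flagged


print(suspicious_names("arial, helvetica, hirakakupro-w3, osaka, sans-serif"))
print(suspicious_names("arial, helvetica, sans-serif;"))
print(suspicious_names("#6f6f6f"))
```

Whether such mistakes should count towards a spam rule at all is a separate
question, given that the examples above come from ham.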
