[jira] Created: (TIKA-319) HtmlParser - use encoding hint only if charset is supported

by JIRA jira@apache.org

HtmlParser - use encoding hint only if charset is supported
-----------------------------------------------------------

                 Key: TIKA-319
                 URL: https://issues.apache.org/jira/browse/TIKA-319
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.4
            Reporter: Piotr B.


The encoding hint should be used only if that encoding is supported by the JVM.

Diff of my fix:

--- HtmlParser.java (revision 835302)
+++ HtmlParser.java (working copy)
@@ -46,7 +46,7 @@
         // Prepare the input source using the encoding hint if available
         InputSource source = new InputSource(stream);
         String encoding = metadata.get(Metadata.CONTENT_ENCODING);
-        if (encoding != null) {
+        if (encoding != null && Charset.isSupported(encoding)) {
             source.setEncoding(encoding);
         }
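
For illustration only, here is a minimal standalone sketch of the check the patch adds. It is not the Tika code: the class `EncodingHintCheck` and the helper `isUsableEncoding` are hypothetical names. One point worth noting is that `Charset.isSupported` does not simply return false for every bad input; it throws `IllegalCharsetNameException` when the name is syntactically illegal, so the sketch catches that case too, which the patch above does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of Tika: decides whether an encoding hint
// is safe to pass on to InputSource.setEncoding().
final class EncodingHintCheck {

    /** Returns true only if the hint names a charset this JVM supports. */
    public static boolean isUsableEncoding(String encoding) {
        if (encoding == null) {
            return false; // no hint available
        }
        try {
            return Charset.isSupported(encoding);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset
            // name (for example spaces), so treat the hint as unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableEncoding("UTF-8"));
        System.out.println(isUsableEncoding("no-such-charset"));
        System.out.println(isUsableEncoding("bad name!"));
        System.out.println(isUsableEncoding(null));
    }
}
```

With a guard like this, an unusable hint is ignored and the encoding is left unset on the `InputSource`.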


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (TIKA-319) HtmlParser - use encoding hint only if charset is supported

by JIRA jira@apache.org


     [ https://issues.apache.org/jira/browse/TIKA-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-319.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.5
         Assignee: Jukka Zitting

Good point! Fixed as suggested in revision 835726.

> HtmlParser - use encoding hint only if charset is supported
> -----------------------------------------------------------
>
>                 Key: TIKA-319
>                 URL: https://issues.apache.org/jira/browse/TIKA-319
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.4
>            Reporter: Piotr B.
>            Assignee: Jukka Zitting
>             Fix For: 0.5
>
>
> The encoding hint should be used only if that encoding is supported by the JVM.
> Diff of my fix:
> --- HtmlParser.java (revision 835302)
> +++ HtmlParser.java (working copy)
> @@ -46,7 +46,7 @@
>          // Prepare the input source using the encoding hint if available
>          InputSource source = new InputSource(stream);
>          String encoding = metadata.get(Metadata.CONTENT_ENCODING);
> -        if (encoding != null) {
> +        if (encoding != null && Charset.isSupported(encoding)) {
>              source.setEncoding(encoding);
>          }