[jira] Created: (TIKA-292) PDFBox is too verbose

by JIRA jira@apache.org

PDFBox is too verbose
---------------------

                 Key: TIKA-292
                 URL: https://issues.apache.org/jira/browse/TIKA-292
             Project: Tika
          Issue Type: Improvement
          Components: parser
            Reporter: Jukka Zitting
            Priority: Minor


PDFBox 0.8 logs INFO messages for all PDF primitives that are not enabled in the respective PDFBox configuration. Many of these primitives are explicitly not needed for text extraction, so there's no point in logging so much about them.

Until this is fixed in PDFBox, we should work around it in Tika.
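
A minimal sketch of what such a workaround could look like, not the change actually committed for this issue. It assumes PDFBox's messages end up in java.util.logging under a logger namespace named "org.apache.pdfbox"; both the logging backend and the logger name are assumptions, and the class and method names are hypothetical. The idea is only to raise the threshold for that namespace so INFO records are dropped while warnings still get through.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietPdfBoxLogging {
    // Assumed logger namespace; adjust to whatever the PDFBox release in use logs under.
    static final String PDFBOX_LOGGER = "org.apache.pdfbox";

    // LogManager holds loggers weakly, so keep a strong reference;
    // otherwise the configured level can be lost if the logger is collected.
    private static Logger pinned;

    // Raises the threshold of the named logger so records below `level` are discarded.
    static Logger raiseThreshold(String name, Level level) {
        Logger logger = Logger.getLogger(name);
        logger.setLevel(level);
        pinned = logger;
        return logger;
    }

    public static void main(String[] args) {
        Logger logger = raiseThreshold(PDFBOX_LOGGER, Level.WARNING);
        logger.info("unsupported primitive");   // dropped: below WARNING
        logger.warning("still reported");       // passed on to the handlers
    }
}
```

Child loggers without a level of their own inherit the threshold from this parent, so one call covers the whole namespace. If the library logs through a different backend, the equivalent setting would have to be made in that backend's configuration instead.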

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (TIKA-292) PDFBox is too verbose

by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/TIKA-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-292.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.5
         Assignee: Jukka Zitting

Fixed in revision 819503.

> PDFBox is too verbose
> ---------------------
>
>                 Key: TIKA-292
>                 URL: https://issues.apache.org/jira/browse/TIKA-292
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.5
>
>
> PDFBox 0.8 logs INFO messages for all PDF primitives that are not enabled in the respective PDFBox configuration. Many of these primitives are explicitly not needed for text extraction, so there's no point in logging so much about them.
> Until this is fixed in PDFBox, we should work around it in Tika.
