[jira] Created: (TIKA-255) Embedded Visio Content Crashes PPT Parser

Embedded Visio Content Crashes PPT Parser
-----------------------------------------

                 Key: TIKA-255
                 URL: https://issues.apache.org/jira/browse/TIKA-255
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.4
         Environment: Debian 5.0.1
            Reporter: David Weekly

The attached PPT is a valid file but crashes Tika. It contains embedded Visio data, which may be the cause of the issue.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
|
|
[jira] Updated: (TIKA-255) Embedded Visio Content Crashes PPT Parser

[ https://issues.apache.org/jira/browse/TIKA-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Weekly updated TIKA-255:
------------------------------

    Attachment: extract-tika.ppt

This PPT file is valid but crashes Tika 0.4 nightly:

    @sfx22001:~/tika-reactor# java -jar tika-app/target/tika-app-0.4-SNAPSHOT.jar /home/dew/extract-tika.ppt
    Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@61c80b01
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:85)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:116)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:57)
    Caused by: java.lang.NullPointerException
        at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:322)
        at org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:307)
        at org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:547)
        at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:408)
        at org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
        at org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
        at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:173)
        at org.apache.poi.hslf.extractor.PowerPointExtractor.getText(PowerPointExtractor.java:162)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:88)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
        ... 3 more
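A note on reading the trace above: the TikaException at the top is only a wrapper, and the NullPointerException listed under "Caused by" is the actual failure inside POI. The following is a minimal, stdlib-only Java sketch of walking such a cause chain to find the underlying exception. The names RootCauseDemo and WrapperException are hypothetical stand-ins for illustration; they are not Tika or POI classes, and this is not how Tika itself reports errors.

```java
public class RootCauseDemo {

    // Hypothetical stand-in for a wrapping exception such as the one in the trace.
    public static class WrapperException extends Exception {
        public WrapperException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Follow getCause() until there is no further cause; that is the root failure.
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Summarise the root cause as "<exception class> at <class>.<method>".
    public static String describe(Throwable t) {
        Throwable root = rootCause(t);
        StackTraceElement[] frames = root.getStackTrace();
        String where = frames.length > 0
                ? frames[0].getClassName() + "." + frames[0].getMethodName()
                : "unknown location";
        return root.getClass().getName() + " at " + where;
    }

    public static void main(String[] args) {
        try {
            try {
                // Simulate a parser hitting an unexpected runtime failure.
                throw new NullPointerException("simulated parser failure");
            } catch (RuntimeException e) {
                // Wrap it, as the outer layer in the trace does.
                throw new WrapperException("Unexpected RuntimeException from parser", e);
            }
        } catch (WrapperException e) {
            System.out.println(describe(e));
        }
    }
}
```

When reporting a crash like this one, the root cause and its first frame are usually the most useful details to search for in the upstream bug tracker.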
|
|
[jira] Commented: (TIKA-255) Embedded Visio Content Crashes PPT Parser

[ https://issues.apache.org/jira/browse/TIKA-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724723#action_12724723 ]

David Weekly commented on TIKA-255:
-----------------------------------

I note that https://issues.apache.org/bugzilla/show_bug.cgi?id=47068 describes a crash in the same place. POI revision 746238 (committed Feb 20, 2009) is claimed to fix this issue: http://svn.apache.org/viewvc?view=rev&revision=746238

When will this show up in Tika?
|
|
[jira] Commented: (TIKA-255) Embedded Visio Content Crashes PPT Parser

[ https://issues.apache.org/jira/browse/TIKA-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724728#action_12724728 ]

David Weekly commented on TIKA-255:
-----------------------------------

Note that the following patch to trunk resolves this issue. Please commit!

    --- tika-parsers/pom.xml~ 2009-06-26 20:40:53.352092861 +0000
    +++ tika-parsers/pom.xml 2009-06-26 21:34:41.380840576 +0000
    @@ -38,7 +38,7 @@
       <url>http://lucene.apache.org/tika/</url>
       <properties>
    -    <poi.version>3.5-beta5</poi.version>
    +    <poi.version>3.5-beta6</poi.version>
       </properties>
|
|
[jira] Resolved: (TIKA-255) Embedded Visio Content Crashes PPT Parser

[ https://issues.apache.org/jira/browse/TIKA-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-255.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4
         Assignee: Jukka Zitting

Thanks for the report and the suggested fix! POI dependency upgraded in revision 789089.