[jira] Created: (TIKA-316) Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)

Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)
------------------------------------------------------------------------------------------------

                 Key: TIKA-316
                 URL: https://issues.apache.org/jira/browse/TIKA-316
             Project: Tika
          Issue Type: Bug
    Affects Versions: 0.4, 0.5
         Environment: Windows Server 2003 SP2, JRE 1.6.0_16, tika-app, Visio 2003
            Reporter: Mike Hays

tika-app (0.4 and 0.5 nightly) return the following when attempting to parse a Visio 2003 file (other versions may be affected):

Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@145e044
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:123)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:103)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:176)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:63)
Caused by: java.lang.IllegalArgumentException: Found a chunk with a negative length, which isn't allowed
    at org.apache.poi.hdgf.chunks.ChunkFactory.createChunk(ChunkFactory.java:120)
    at org.apache.poi.hdgf.streams.ChunkStream.findChunks(ChunkStream.java:59)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:93)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
    at org.apache.poi.hdgf.streams.PointerContainingStream.findChildren(PointerContainingStream.java:100)
    at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:95)
    at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:52)
    at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:49)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:118)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
    ... 3 more

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (TIKA-316) Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)

     [ https://issues.apache.org/jira/browse/TIKA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Hays updated TIKA-316:
---------------------------

    Attachment: repro-TIKA-316.vsd

Run either:

    java -jar tika-app-0.4.jar repro-TIKA-316.vsd
    java -jar tika-app-0.5-SNAPSHOT.jar repro-TIKA-316.vsd

[jira] Updated: (TIKA-316) Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)

     [ https://issues.apache.org/jira/browse/TIKA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-316:
-----------------------------------

    Component/s: cli

- set fix component

[jira] Updated: (TIKA-316) Parsing Visio diagrams with tika-app causes TikaException (Found a chunk with a negative length)

     [ https://issues.apache.org/jira/browse/TIKA-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-316:
-------------------------------

    Component/s:     (was: cli)
                 parser

Looks like this is caused by some underlying POI issue, i.e. the HDGF code in POI fails to interpret this file correctly. It would be great if someone could report this issue upstream to POI and add a reference to that issue here.
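
The reported trace shows a TikaException wrapping an IllegalArgumentException raised in org.apache.poi.hdgf, which is what points at POI rather than at the CLI. As a rough sketch of how that attribution can be made programmatically, the following walks an exception's cause chain down to its innermost throwable. The class name RootCause and the stand-in exceptions in main are hypothetical and use only the JDK; nothing here is Tika or POI code.

```java
class RootCause {
    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain; the self-reference check is a defensive guard.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins (assumptions) for the wrapper and the underlying failure in the trace.
        Throwable inner = new IllegalArgumentException(
                "Found a chunk with a negative length, which isn't allowed");
        Throwable outer = new RuntimeException("Unexpected RuntimeException from parser", inner);

        Throwable root = rootCause(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());

        // The class name of the first stack frame shows where the root cause was raised.
        StackTraceElement[] frames = root.getStackTrace();
        if (frames.length > 0) {
            System.out.println("raised in " + frames[0].getClassName());
        }
    }
}
```

Applied to the exception in this issue, the first frame of the root cause would be the org.apache.poi.hdgf.chunks.ChunkFactory frame listed after "Caused by:" in the trace.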