URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?


by Peter Coppens

Fop Fans,

Attached you will find 2 fo files
- t1.fo
- t2.fo

and two image files
- hexley.png
- héxlæ.png

They are actually the same image under two different names. The second name contains Unicode (non-ASCII) characters (I am not sure it will make it onto the list via Nabble).

The issue I am struggling with (illustrated in the attached Java program FopUriResolver.java) is that when a URIResolver is used to find the image files, FOP (I assume the image library) refuses to load the image when its name contains Unicode characters. If I comment out the URIResolver it works just fine.
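
FopUriResolver.java is not reproduced in the thread. As a rough illustration only, a resolver of the kind described might look like the sketch below. It assumes the resolver maps the href straight to a File and wraps it in a StreamSource, which is what the later replies suggest; the class name NaiveFileResolver is hypothetical and the println merely mirrors the "resolve on ..." lines in the output.

```java
import java.io.File;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for the attached resolver, not the original code.
class NaiveFileResolver implements URIResolver {

    @Override
    public Source resolve(String href, String base) {
        // Assumption: href is treated as a plain pathname relative to the
        // current working directory, and base is ignored.
        File file = new File(href);
        System.out.println("resolve on " + href + " canRead returns " + file.canRead());
        // StreamSource(File) derives its system ID from the File.
        return new StreamSource(file);
    }

    public static void main(String[] args) {
        Source source = new NaiveFileResolver().resolve("hexley.png", null);
        System.out.println(source.getSystemId());
    }
}
```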


So, with the URIResolver enabled I get

0    [main] DEBUG org.apache.fop.apps.FOUserAgent(415) - target-resolution set to: 72.0dpi (px2mm=0.35277778)
resolve on hexley.png canRead returns true
1418 [main] DEBUG org.apache.fop.apps.FOUserAgent(415) - target-resolution set to: 72.0dpi (px2mm=0.35277778)
resolve on héxlæ.png canRead returns true
1426 [main] ERROR org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext(104) - Error while opening file. Could not load image from system identifier 'file:/eb2/trunk/playground/h%C3%A9xl%C3%A6.png' (/eb2/trunk/playground/héxl�.png (No such file or directory))
1426 [main] ERROR org.apache.fop.fo.flow.ExternalGraphic(83) - Image not found: héxlæ.png

and t2.pdf does not have the image embedded

With the uri resolver disabled it returns

    [main] DEBUG org.apache.fop.apps.FOUserAgent(415) - target-resolution set to: 72.0dpi (px2mm=0.35277778)
1384 [main] DEBUG org.apache.fop.apps.FOUserAgent(415) - target-resolution set to: 72.0dpi (px2mm=0.35277778)

and both pdfs are ok


I think the problem might lie in AbstractImageSessionContext#toFile where unicode escapes are apparently not taken into account.


It does not reproduce with JDK 1.5, only with JDK 1.6.

I would sure appreciate any tips on how to fix or work around this issue.

Many thanks indeed

Peter

Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Andreas Delmelle-2

On 27 Jun 2009, at 21:19, Peter Coppens wrote:

Hi Peter

Sorry for the rather late reply. Busy weekend...

<snip />

> and two image files
> -  http://www.nabble.com/file/p24235918/hexley.png hexley.png
> -  http://www.nabble.com/file/p24235918/h%25C3%25A9xl%25C3%25A6.png
> héxlæ.png
> <snip />
> 1426 [main] ERROR
> org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext(104) -
> Error while opening file. Could not load image from system identifier
> 'file:/eb2/trunk/playground/h%C3%A9xl%C3%A6.png'

<snip />
> I think the problem might lie in AbstractImageSessionContext#toFile  
> where
> unicode escapes are apparently not taken into account.

But they are. That's what the loop starting at line 252 is meant for.

Weird thing... I'm suspecting some form of double-escaping is taking  
place.

I get the following output with Java 6 on Mac OS X, using only ERROR-
level output:

resolve on héxlæ.png canRead returns false
Jun 29, 2009 9:38:16 AM  
org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext  
newSource
SEVERE: Unable to obtain stream from system identifier 'file:/
Developer/javatools/xml-fop-trunk/h%C3%A9xl%C3%A6.png'
Jun 29, 2009 9:38:16 AM  
org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext  
newSource
SEVERE: The Source that was returned from URI resolution didn't  
contain an InputStream for URI: héxlæ.png
Jun 29, 2009 9:38:16 AM org.apache.fop.events.LoggingEventListener  
processEvent
SEVERE: Image not found. URI: héxlæ.png. (See position 14:110)

Changing to create the StreamSource using the String-constructor (as  
opposed to File), the output becomes:

resolve on héxlæ.png canRead returns false
Jun 29, 2009 10:10:48 AM  
org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext  
newSource
SEVERE: The Source that was returned from URI resolution didn't  
contain an InputStream for URI: héxlæ.png
Jun 29, 2009 10:10:48 AM org.apache.fop.events.LoggingEventListener  
processEvent
SEVERE: Image not found. URI: héxlæ.png. (See position 14:110)

Changing the FO to use an absolute file:-URL as the src attribute of  
the fo:external-graphic, I still get an error when using the File-
constructor, but when using the String-constructor, it produces a  
correct PDF...
[Note that, strictly speaking, the XSL-FO Recommendation does require
the src property to be specified as a URI. See: http://www.w3.org/TR/xsl/#src]
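
The difference between the two constructors shows up in the system ID that is handed on to the image loader. The sketch below assumes a Java 6 or later runtime, where StreamSource(File) builds its system ID from the File's URI in ASCII form, while StreamSource(String) stores the text unchanged. SystemIdDemo, viaFile and viaString are names made up for this illustration.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;

// Illustration only: compares the system IDs of the two constructors.
class SystemIdDemo {

    // System ID as produced by the File constructor (a file: URI).
    static String viaFile(String name) {
        return new StreamSource(new File(name)).getSystemId();
    }

    // System ID as produced by the String constructor (text kept as given).
    static String viaString(String name) {
        return new StreamSource(name).getSystemId();
    }

    public static void main(String[] args) {
        // "héxlæ.png", written with escapes so the source file stays ASCII.
        String name = "h\u00e9xl\u00e6.png";
        System.out.println(viaFile(name));
        System.out.println(viaString(name));
    }
}
```

With the File constructor the non-ASCII characters come back percent-encoded as UTF-8 bytes, which is the form seen in the error messages; whoever consumes the system ID then has to decode it correctly.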

All I can say for the moment: under investigation.


Regards

Andreas
---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@...
For additional commands, e-mail: fop-users-help@...


Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Peter Coppens

Thanks Andreas.

The line numbers in AbstractImageSessionContext do not match what I see, but I am not using trunk.

What I found out while debugging is that the é character (UTF-8 bytes C3 A9) is escaped as %C3%A9 when the URL is constructed, but that in the toFile method %C3 and %A9 are interpreted as two separate characters.
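
The effect can be reproduced in isolation. The sketch below is not the XML Graphics Commons code; decodePerEscape and decodeAsUtf8 are hypothetical helpers that only contrast the two interpretations of the same escaped name, and both assume the input is otherwise plain ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustration only: two ways of reading percent escapes.
class EscapeDecodingDemo {

    // Faulty reading: every %XX escape becomes one character of its own.
    static String decodePerEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Intended reading: escapes are bytes, and the byte sequence is UTF-8.
    static String decodeAsUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                bytes.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c); // safe only because the input is assumed ASCII
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String escaped = "h%C3%A9xl%C3%A6.png";
        System.out.println(decodePerEscape(escaped).length()); // 11 characters
        System.out.println(decodeAsUtf8(escaped).length());    // 9 characters
    }
}
```

The per-escape reading yields a name two characters longer than the real one, so the file lookup fails even though the file exists.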

I am also confused on why you get

        "resolve on héxlæ.png canRead returns false"

Perhaps your machine/filesystem is not using UTF-8, and/or the file was not saved using UTF-8, or something like that.

I don't mind rewriting the URI resolver in whatever way is needed to make it work with FOP. For now I have something very ugly hacked in, but at least that is working.

It looks like

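import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import javax.xml.transform.stream.StreamSource;

// Workaround: expose the plain path and an already opened stream, so that
// the percent-escaped file: URL does not have to be mapped back to a file.
class MyStreamSource extends StreamSource {
  private final File file;

  public MyStreamSource(File f) {
    super(f);
    this.file = f;
  }

  // Return the absolute pathname instead of the escaped file: URL.
  @Override
  public String getSystemId() {
    return this.file.getAbsolutePath();
  }

  // Open the stream directly from the File; null means it is not available.
  @Override
  public InputStream getInputStream() {
    try {
      return new FileInputStream(file);
    } catch (FileNotFoundException e) {
      return null;
    }
  }
}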


Any other advice more than welcome.

Thanks,

Peter

Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Andreas Delmelle-2

On 30 Jun 2009, at 11:10, Peter Coppens wrote:

Hi Peter

> The line numbers in AbstractImageSessionContext do not match what I
> see, but I am not using trunk.
>
> What I found out while debugging is that the é character (UTF-8 bytes
> C3 A9) is escaped as %C3%A9 when the URL is constructed, but that in
> the toFile method %C3 and %A9 are interpreted as two separate
> characters.

Indeed, that already looked a bit suspicious to me, and may be the  
source of the issue.
Seems more robust to use java.net.URLDecoder instead of re-inventing  
the light-bulb.

I have attached a small patch for you to try out. Would be good if you  
could confirm this fixes the issue on your end.
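
The patch itself is not reproduced in the thread. Purely as an illustration of the suggestion, the sketch below decodes an escaped path with java.net.URLDecoder, with UTF-8 assumed as the encoding behind the escapes; UrlDecoderDemo and decode are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: decoding percent escapes with the JDK's URLDecoder.
class UrlDecoderDemo {

    static String decode(String escapedPath) {
        try {
            // Multi-byte UTF-8 sequences such as %C3%A9 become one character.
            return URLDecoder.decode(escapedPath, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("/eb2/trunk/playground/h%C3%A9xl%C3%A6.png"));
    }
}
```

One caveat: URLDecoder implements HTML form decoding, so it also turns a literal '+' into a space. File names that contain '+' would need separate handling.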

>
> I am also confused on why you get
>
>        "resolve on héxlæ.png canRead returns false"
>
> Perhaps your machine/filesystem is not using utf8 and/or saving the  
> file did
> not use utf8 or something like that

Oops, no. My bad. It becomes apparent when looking at the resolved URI  
why that is the case. It just goes looking in the wrong directory... :-/

Not entirely sure but this may also point to a possible cause (?)
Your sample app uses "new File(href)", but as the output shows, that  
href is nothing more than the URI as specified in the source document.  
The code will only work if the image is available in the current  
working directory.

When I use an absolute URL as the src property, I get something like:
Jun 30, 2009 11:23:51 AM  
org.apache.xmlgraphics.image.loader.impl.AbstractImageSessionContext  
newSource
SEVERE: Unable to obtain stream from system identifier 'file:/
Developer/javatools/xml-fop-trunk/file:/users/andreas/documents/xml/h
%C3%A9xl%C3%A6.png'

Even though the URL is absolute, the current working directory is  
prepended. Expected, since the File() constructor expects an abstract  
pathname, not a URI. It only works correctly if the file:- prefix is  
omitted, but then...

I run the process from FOP's base directory, so if the File  
constructor is called with a relative URI as its argument, it  
interprets that as a relative pathname. If File.toURL() is called by  
the StreamSource constructor, the expanded pathname and the derived  
URI point to a file that does not exist.
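
The pathname-versus-URI distinction can be checked with a few lines. This is a sketch under assumptions: FileFromUriDemo and its methods are hypothetical names, the paths are examples, and resolve() presumes that base is an absolute file: URI and that href is valid URI text (an unescaped space, for instance, would be rejected by URI.create).

```java
import java.io.File;
import java.net.URI;

// Illustration only: File(String) versus File(URI).
class FileFromUriDemo {

    // URI text passed to File(String) is taken as a pathname as a whole,
    // "file:" prefix and percent escapes included.
    static File fromString(String uriText) {
        return new File(uriText);
    }

    // Parsing the text first lets File(URI) drop the scheme and decode escapes.
    static File fromUri(String uriText) {
        return new File(URI.create(uriText));
    }

    // Resolve a possibly relative href against the base before touching File.
    static File resolve(String base, String href) {
        return new File(URI.create(base).resolve(href));
    }

    public static void main(String[] args) {
        String uri = "file:/users/andreas/documents/xml/h%C3%A9xl%C3%A6.png";
        // Relative pathname: the working directory gets prepended.
        System.out.println(fromString(uri).getAbsolutePath());
        // Proper pathname with the decoded file name.
        System.out.println(fromUri(uri).getPath());
        System.out.println(resolve("file:/users/andreas/documents/xml/", "hexley.png").getPath());
    }
}
```

A resolver written along these lines no longer depends on the directory the process happens to be started from.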

Running the process from the directory where the document and image  
reside, and using the attached patch, I got no more errors.





Regards,

Andreas Delmelle
mailto:andreas.delmelle.AT.telenet.be
jabber: mandreas@...
skype: adlm0608

---
"Everybody I know who is right always agrees with ME."





Attachment: xgcommons-patch.patch (2K)

Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Andreas Delmelle-2

On 30 Jun 2009, at 12:35, Andreas Delmelle wrote:

Hello Peter

Just checking the status:

> On 30 Jun 2009, at 11:10, Peter Coppens wrote:
>> What I found out while debugging is that the é character (UTF-8 bytes
>> C3 A9) is escaped as %C3%A9 when the URL is constructed, but that in
>> the toFile method %C3 and %A9 are interpreted as two separate
>> characters.
>
> Indeed, that already looked a bit suspicious to me, and may be the  
> source of the issue.
> Seems more robust to use java.net.URLDecoder instead of re-inventing  
> the light-bulb.
>
> I have attached a small patch for you to try out. Would be good if  
> you could confirm this fixes the issue on your end.

Have you been able to confirm this? If not, would it help if I  
provided you with a precompiled JAR for XG Commons including the  
changes in the patch (so that you can simply do a temporary swap in  
your test environment)?


Thanks in advance!

Andreas



Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Peter Coppens

Have not checked. Still in my 'inbox'. Sorry for the delay. For now my workaround
does the job, so it's less urgent for me. Will get to it asap.

Peter


> From: Andreas Delmelle <andreas.delmelle@...>
> Reply-To: <fop-users@...>
> Date: Sun, 5 Jul 2009 21:07:13 +0200
> To: <fop-users@...>
> Subject: Re: URIResolver in combination with filenames with unicode characters
> fails - Problem in AbstractImageSessionContext#toFile?
> <snip />


Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Andreas Delmelle-2

On 07 Jul 2009, at 04:23, Peter Coppens wrote:

> Have not checked. Still in my 'inbox'. Sorry for the delay. For now
> my workaround does the job, so it's less urgent for me. Will get to
> it asap.

OK, thanks. As pointed out: if I can save you any hassle by providing
you with a precompiled binary, just give us a yell, and I will mail it
to you off-list.

Regards

Andreas



Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Peter Coppens

Finally checked. The patch solves the issue.

Hope this helps and thanks,

Peter




Re: URIResolver in combination with filenames with unicode characters fails - Problem in AbstractImageSessionContext#toFile?

by Andreas Delmelle-2

On 08 Jul 2009, at 16:37, Peter Coppens wrote:

Hi Peter

> Finally checked. The patch solves the issue.

Thanks for the confirmation. I have committed the change to XG  
Commons, and will shortly commit the modified JAR to FOP Trunk.

Regards

Andreas
