Catalyst and UTF8 Chained URLs


Catalyst and UTF8 Chained URLs

by Rod Taylor-5

I have a URL which includes UTF-8 components in the chained (CaptureArgs) position. The escape mechanism is supposed to encode each byte of the UTF-8 sequence individually when creating the URL (which Catalyst seems to do) and reverse this on the way in.
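
As a minimal sketch of the round trip described above (not Catalyst's own code): the helper names escape_segment and unescape_segment are hypothetical, and the choice of the RFC 3986 unreserved set as the "safe" characters is an assumption.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Escape: character string -> UTF-8 bytes -> %XX for every byte outside
# the RFC 3986 unreserved set (assumed here to be the "safe" characters).
sub escape_segment {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Unescape: %XX -> raw bytes -> character string.
sub unescape_segment {
    my ($escaped) = @_;
    (my $bytes = $escaped) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $bytes);
}

# "V\x{DC} Living" is 'VÜ Living' written with an escape so that the
# example does not depend on the source file's encoding.
my $escaped  = escape_segment("V\x{DC} Living");   # V%C3%9C%20Living
my $restored = unescape_segment($escaped);
```

The second half is the step that seems to be missing for captures on the way in.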

The below adjustment appears to fix the issue I'm having with URL arguments not being decoded on the way in.

This may not be the right place to handle it as it could break older applications which work around arguments not being decoded properly to UTF8.

This makes a good test URL.

$uri = $c->uri_for( $cont->action_for($action), ['VÜ Living', '他们有理性和良心' ]);

Should result in VÜ Living and 他们有理性和良心 being in the first and second captured arguments.


*** Chained.pm.orig     Tue Sep  8 21:10:10 2009
--- Chained.pm  Tue Sep  8 21:35:38 2009
***************
*** 168,174 ****
 
      $request->action("/${action}");
      $request->match("/${action}");
!     $request->captures($captures);
      $c->action($action);
      $c->namespace( $action->namespace );
 
--- 168,187 ----
 
      $request->action("/${action}");
      $request->match("/${action}");
!
!     # Decode Captures
!     my $decodedCaptures = [];
!     if ($captures && @$captures) {
!       for my $arg (@{$captures}) {
!             $arg =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
!             utf8::downgrade($arg);
!             utf8::decode($arg);
!             push(@{$decodedCaptures}, $arg);
!       }
!     }
!     $request->captures($decodedCaptures);
      $c->action($action);
      $c->namespace( $action->namespace );
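
One detail of the loop in the patch: `for my $arg (@{$captures})` aliases each element, so the substitution also rewrites the original $captures array in place. Below is a sketch of the same decoding as a standalone function that works on copies instead. The name decode_captures is hypothetical, and passing a true second argument to utf8::downgrade (so it does not die on wide characters) is a choice made for this sketch, not part of the patch.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patch's loop, but leaving the caller's
# array untouched.
sub decode_captures {
    my ($captures) = @_;
    my @decoded;
    for my $raw (@{ $captures || [] }) {
        my $arg = $raw;                               # copy, do not alias
        $arg =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;  # %XX -> single byte
        utf8::downgrade($arg, 1);                     # true flag: do not die on failure
        utf8::decode($arg);                           # UTF-8 bytes -> characters, in place
        push @decoded, $arg;
    }
    return \@decoded;
}

my $in  = ['V%C3%9C%20Living', '%E4%BB%96'];
my $out = decode_captures($in);
```

After the call, $out holds the decoded strings and $in still holds the escaped ones.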

_______________________________________________
List: Catalyst@...
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@.../
Dev site: http://dev.catalyst.perl.org/

Re: Catalyst and UTF8 Chained URLs

by Kieren Diment-2


I think you need to write a test for the behaviour to get that
patched into the core. Also, you would probably do better to send the
patch to Catalyst::Runtime's RT queue:
http://rt.cpan.org/Public/Bug/Report.html?Queue=Catalyst-Runtime


On 09/09/2009, at 11:45 AM, Rod Taylor wrote:

> I have a URL which includes UTF8 components which are in the chained
> (CaptureArgs) position. The escape mechanism is supposed to encode each
> byte of the UTF8 sequence individually when creating the URL (which
> Catalyst seems to do) and reverse this on the way in.
>
> The below adjustment appears to fix the issue I'm having with URL
> arguments not being decoded on the way in.
> [snip]


Re: Catalyst and UTF8 Chained URLs

by Jon Schutz-3

How would this patch affect systems that choose to encode their URLs in
something other than UTF-8? (Other character encodings are widely used,
particularly on Asian sites).

There might be a case for having the encoding type as a configurable
option (where one option is no decoding so the application can handle
it) - otherwise it has to be left to the application where the app
designers can reverse whatever encoding they have chosen.
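
A rough sketch of what such a configurable option could look like, using the core Encode module. The function name decode_capture and the convention that an undefined encoding means "hand the raw capture to the application" are assumptions made for illustration, not an existing Catalyst interface.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper. $encoding is any name Encode understands
# ('UTF-8', 'ISO-8859-1', ...); undef means "no decoding at all".
sub decode_capture {
    my ($raw, $encoding) = @_;
    return $raw unless defined $encoding;            # leave it to the application
    (my $bytes = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return decode($encoding, $bytes);
}

my $from_utf8   = decode_capture('V%C3%9C', 'UTF-8');
my $from_latin1 = decode_capture('V%DC',    'ISO-8859-1');
my $untouched   = decode_capture('V%DC',    undef);
```

The first two calls yield the same character string from different wire bytes; the third returns its input unchanged.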

--
Jon Schutz                        My tech notes http://notes.jschutz.net
Chief Technology Officer                        http://www.youramigo.com
YourAmigo





Re: Catalyst and UTF8 Chained URLs

by Octavian Râşniţă

From: "Jon Schutz" <jon+catalyst@...>

> How would this patch affect systems that choose to encode their URLs in
> something other than UTF-8? (Other character encodings are widely used,
> particularly on Asian sites).
>
> There might be a case for having the encoding type as a configurable
> option (where one option is no decoding so the application can handle
> it) - otherwise it has to be left to the application where the app
> designers can reverse whatever encoding they have chosen.

Regarding the encoding, I think it would be helpful to be able to set a main
"encoding" config key in MyApp.pm, which all the other modules would check
for and use if it is set (templates, HTML::FormFu forms, config files,
possibly even DBIC classes). An individual module could still use another
encoding by overriding this main key in its own configuration.
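
To make the lookup order concrete, here is a plain-Perl sketch with no Catalyst involved. The config layout, the component names and the function encoding_for are all hypothetical; the sketch only illustrates "the component's own setting wins, otherwise fall back to the main key".

```perl
use strict;
use warnings;

# Hypothetical application config: one main key plus per-component
# sections that may override it.
my %config = (
    encoding    => 'UTF-8',
    'View::TT'  => { encoding => 'ISO-8859-1' },   # explicit override
    'Model::DB' => {},                             # inherits the main key
);

# Return the encoding a component should use.
sub encoding_for {
    my ($config, $component) = @_;
    my $own = $config->{$component} || {};
    return defined $own->{encoding} ? $own->{encoding} : $config->{encoding};
}

my $view_encoding  = encoding_for(\%config, 'View::TT');
my $model_encoding = encoding_for(\%config, 'Model::DB');
```

Here the view ends up with its own ISO-8859-1 setting and the model inherits UTF-8 from the main key.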

Octavian



Re: Catalyst and UTF8 Chained URLs

by Rod Taylor-5

This issue is mostly why I sent it to the list instead of filing it as a bug report.

I couldn't figure out how to fit it into Catalyst::Plugin::Unicode, and Chained.pm isn't really the right place for it either.

I like the idea of specifying an encoding in the config file and making Catalyst::Plugin::Unicode disappear, but that is probably beyond my abilities to make happen.
