Re: Ruby 1.8 - character encoding

by James Gray-7

On Jul 7, 2009, at 7:28 AM, Thomas Thomassen wrote:

> Searching the net I found some hacks that converted UTF-8 into single
> byte characters: str_utf8.unpack('U*').pack('C*')

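What you are doing there is transcoding from UTF-8 to Latin-1 (also
known as ISO-8859-1).  It works because the first 256 Unicode code
points match Latin-1 byte for byte, so it only holds for characters in
that range.  Here's the proof: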

$ ruby -KU -r iconv -e 'utf8 = "æøåÆØÅ"; p utf8.unpack("U*").pack("C*") == Iconv.conv("ISO-8859-1", "UTF-8", utf8)'
true
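
For anyone reading this on a newer Ruby: Iconv was dropped from the
standard library in 2.0, so the one-liner above won't run there as is.
Here is a rough sketch of the same check, assuming Ruby 1.9 or later,
where String#encode is available.  The variable names are mine, and the
byte arrays are compared because the two results carry different
encoding tags:

```ruby
# -*- coding: utf-8 -*-
# Sketch only: assumes Ruby 1.9+ (String#encode); not the 1.8 approach above.
utf8 = "æøåÆØÅ"

# The hack from the question: code points in, single bytes out.
via_pack = utf8.unpack("U*").pack("C*")

# The explicit transcoding that the hack is equivalent to.
via_encode = utf8.encode("ISO-8859-1")

# pack returns a binary (ASCII-8BIT) string, encode returns one tagged
# ISO-8859-1, so compare the raw bytes rather than the strings.
same = via_pack.bytes.to_a == via_encode.bytes.to_a
p same
```

The explicit encode call also fails differently: as far as I know it
raises Encoding::UndefinedConversionError for a character that has no
Latin-1 equivalent, whereas the pack trick gives no such error.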

James Edward Gray II