Working round 'invalid byte sequence'

Working round 'invalid byte sequence'

by Adam Akhtar-2


I am a very amateur Rubyist who, amongst other things, likes to use a
simple Rails app to query my company's MySQL config database.  The
server I now use to do this has got 1.9.1 and Rails 2.3.3.  I've now hit
the 'problems' related to 1.9 and string encoding, which means that when
Rails tries to display, say, E acute characters, it throws an invalid byte
sequence error, namely
ArgumentError (invalid byte sequence in UTF-8):

Given that I only access the MySQL database over a private network and
with a read-only account, is there some simple and easy way to suppress
this issue?  Without being an expert in this area (obviously) I guess
that either I can try to "tell" Ruby to treat the MySQL data as an
encoding other than UTF-8 (I guess US-ASCII  but it could be trial and
error to work out what), and/or I could add some rescue code to find
(and ignore) bad byte sequences.  I've tried to find recipes for both
the above, but quickly get lost in the subtleties of it all!  Any and
all help appreciated.  Many thanks in advance.
--
Posted via http://www.ruby-forum.com/.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk@...
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@...
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---


Re: Working round 'invalid byte sequence'

by Matt Jones




On Nov 2, 5:31 pm, Toby Rodwell <rails-mailing-l...@...>
wrote:

> I am a very amateur Rubyist who, amongst other things, likes to use a
> simple Rails app to query my company's MySQL config database.  The
> server I now use to do this has got 1.9.1 and Rails 2.3.3.  I've now hit
> the 'problems' related to 1.9 and string encoding, which means that when
> Rails tries to display, say, E acute characters, it throws an invalid byte
> sequence error, namely
> ArgumentError (invalid byte sequence in UTF-8):
>
> Given that I only access the MySQL database over a private network and
> with a read-only account, is there some simple and easy way to suppress
> this issue?  Without being an expert in this area (obviously) I guess
> that either I can try to "tell" Ruby to treat the MySQL data as an
> encoding other than UTF-8 (I guess US-ASCII  but it could be trial and
> error to work out what), and/or I could add some rescue code to find
> (and ignore) bad byte sequences.  I've tried to find recipes for both
> the above, but quickly get lost in the subtleties of it all!  Any and
> all help appreciated.  Many thanks in advance.

I'd check with whoever admins the MySQL DB to find out what
character set it's actually using. I think you can then tell the
adapter to translate. Best guess is either US-ASCII, or (more likely)
Windows-1252 pretending to be ASCII.
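
As a rough illustration of what "translating" could mean on the Ruby side, here is a minimal sketch. It assumes the bytes really are Windows-1252, which is only a guess until the DB admin confirms it, and `to_utf8` is a hypothetical helper name, not something Rails or the mysql driver provides.

```ruby
# Hypothetical helper (not part of Rails or the mysql driver).
# Assumption: the database actually stores Windows-1252 bytes.
def to_utf8(raw, source_encoding = 'Windows-1252')
  # dup leaves the caller's string untouched. force_encoding only changes
  # the encoding label; encode performs the real byte conversion.
  raw.dup.force_encoding(source_encoding).encode('UTF-8')
end

# 0xE9 is "e acute" in Windows-1252; here it arrives tagged as raw bytes.
raw  = "Caf\xE9".dup.force_encoding('ASCII-8BIT')
name = to_utf8(raw)

puts name.valid_encoding?              # true
puts name.encoding == Encoding::UTF_8  # true
puts name == "Caf\u00E9"               # true
```

Note that a few byte values are unassigned in Windows-1252, so `encode` can still raise `Encoding::UndefinedConversionError` on data that is not really in that character set.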

--Matt Jones



Re: Working round 'invalid byte sequence'

by Adam Akhtar-2


Matt Jones wrote:

> On Nov 2, 5:31 pm, Toby Rodwell <rails-mailing-l...@...>
> wrote:
>> this issue?  Without being an expert in this area (obviously) I guess
>> that either I can try to "tell" Ruby to treat the MySQL data as an
>> encoding other than UTF-8 (I guess US-ASCII  but it could be trial and
>> error to work out what), and/or I could add some rescue code to find
>> (and ignore) bad byte sequences.  I've tried to find recipes for both
>> the above, but quickly get lost in the subtleties of it all!  Any and
>> all help appreciated.  Many thanks in advance.
>
> I'd check with whoever admins the MySQL DB to find out what
> character set it's actually using. I think you can then tell the
> adapter to translate. Best guess is either US ASCII, or (more likely)
> Windows-1252 pretending to be ASCII.
>
> --Matt Jones

Many thanks for the reply, Matt.  I used the console to determine that
the db is serving up ASCII-8BIT:

>> e = Equipment.find(:first, :conditions => ['id = ?', 1234])
>> e.name.encoding
=> #<Encoding:ASCII-8BIT>

I then set the encoding in /config/database.yml to 'ascii' which,
although it can't display special characters, at least shows the page
with "?" in place of the accented characters.  I tried setting encoding
to "ascii-8bit" and variations of this, but each time Rails complained -
so if anyone can tell me how to indicate ASCII-8BIT I'd be grateful.
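
The same "?" fallback can also be done in Ruby itself, leaving database.yml alone. The following is only a sketch: `replace_invalid_utf8` is a hypothetical helper name, and it assumes the goal is simply a string that renders without raising, at the cost of losing the accented characters.

```ruby
# Hypothetical helper, not part of Rails or the mysql driver.
# Returns a valid UTF-8 copy of raw, with each invalid byte sequence
# replaced by marker.
def replace_invalid_utf8(raw, marker = '?')
  text = raw.dup.force_encoding('UTF-8')  # relabel only; bytes unchanged
  return text if text.valid_encoding?     # nothing to repair

  # A round trip through UTF-16BE forces a real conversion, so the
  # :invalid option is applied to every malformed sequence.
  text.encode('UTF-16BE', :invalid => :replace, :replace => marker).
       encode('UTF-8')
end

# A lone 0xE9 byte is not valid UTF-8, so it gets replaced.
cleaned = replace_invalid_utf8("Caf\xE9 menu")
puts cleaned.valid_encoding?   # true

# Text that is already valid UTF-8 passes through unchanged.
puts replace_invalid_utf8("Caf\u00E9") == "Caf\u00E9"   # true
```

On Ruby 2.1 and later, `String#scrub` does the same job directly; the round trip above is for older versions such as the 1.9.1 discussed here.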



Re: Working round 'invalid byte sequence'

by Adam Akhtar-2


Toby Rodwell wrote:
[...]

> Many thanks for the reply Matt.  I used the console to determine that
> the db is serving up ASCII-8BIT
>
>>>e = Equipment.find(:first, :conditions => ['id = ?', 1234])
>>> e.name.encoding
> => #<Encoding:ASCII-8BIT>
>
> I then set the encoding in /config/database.yml to 'ascii' which
> although it can't display special characters, at least it shows the page
> with "?" in place of the accented characters.  I tried setting encoding
> to "ascii-8bit" and varieties of this, but each time Rails complained -
> so if anyone can tell me how to indicate ASCII-8BIT I'd be grateful.

This doesn't solve your immediate problem, but...if your host locks the
DB in ASCII 8-bit and you can't change it, then find a new host.  That
encoding is inappropriate for real work. :)

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen@...


Re: Working round 'invalid byte sequence'

by Frederick Cheung-2




On Nov 3, 10:09 pm, Marnen Laibow-Koser <rails-mailing-l...@andreas-s.net>
wrote:

> Toby Rodwell wrote:
>
> [...]
>
> > Many thanks for the reply Matt.  I used the console to determine that
> > the db is serving up ASCII-8BIT
>
> >>>e = Equipment.find(:first, :conditions => ['id = ?', 1234])
> >>> e.name.encoding
> > => #<Encoding:ASCII-8BIT>
>
> > I then set the encoding in /config/database.yml to 'ascii' which
> > although it can't display special characters, at least it shows the page
> > with "?" in place of the accented characters.  I tried setting encoding
> > to "ascii-8bit" and varieties of this, but each time Rails complained -
> > so if anyone can tell me how to indicate ASCII-8BIT I'd be grateful.
>
> This doesn't solve your immediate problem, but...if your host locks the
> DB in ASCII 8-bit and you can't change it, then find a new host.  That
> encoding is inappropriate for real work. :)
>

This is not about the database itself; it is to do with the
interaction between the mysql driver and the new string encoding
schemes. Strings in Ruby 1.9 are encoding-aware and, from what I
gather, the mysql driver creates strings with the ASCII-8BIT encoding
regardless of their actual encoding. (My very vague understanding is
that ASCII-8BIT is a sort of pseudo-encoding that doesn't actually mean
ASCII; it just means raw bytes.)
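
To make the "raw bytes" point concrete, here is a small sketch in plain Ruby, with no database involved. The `raw` string only imitates what the driver is reported to return, and relabelling with `force_encoding` is correct only under the assumption that the bytes are already UTF-8.

```ruby
# The same two bytes, tagged in two different ways.
text = "\u00E9"                               # "e acute", tagged UTF-8
raw  = text.dup.force_encoding('ASCII-8BIT')  # stand-in for a driver result

puts raw.bytes.to_a == text.bytes.to_a  # true: identical bytes
puts text.length                        # 1 (one character)
puts raw.length                         # 2 (two raw bytes)

# Mixing the binary string into non-ASCII UTF-8 text fails.
begin
  "caf\u00E9: " + raw
rescue Encoding::CompatibilityError => e
  puts "incompatible: #{e.message}"
end

# Assumption: the bytes are already UTF-8, so only the label is wrong.
# force_encoding changes the label without converting any bytes.
fixed = raw.dup.force_encoding('UTF-8')
puts fixed == text                      # true
```

If the bytes are in some other character set, relabelling as UTF-8 just produces an invalid string, and a real conversion with `encode` is needed instead.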

There is quite a lot of discussion on Lighthouse here:
https://rails.lighthouseapp.com/projects/8994/tickets/2476-ascii-8bit-encoding-of-query-results-in-rails-232-and-ruby-191#ticket-2476-2
although there is no clear resolution that I could see. It may provide
some help to Toby. One way out would be to fall back to Ruby 1.8.x, where
these problems do not exist because strings are just dumb collections of
bytes.

Fred




Re: Working round 'invalid byte sequence'

by Adam Akhtar-2


Frederick Cheung wrote:
[...]
> This is not about the database itself this is to do with the
> interaction between the mysql driver and the new string encoding
> schemes - strings in ruby 1.9 are encoding aware and from what I
> gather the mysql driver creates strings with the ascii-8bit encoding
> regardless of their actual encoding

Oh, I didn't know that since I don't use 1.9 yet.


> Fred

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen@...