xml.XML.load encoding problem

by GA-4

Hello guys,

I have this piece of code:

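    url.openConnection match {
      case conn: HttpURLConnection =>
        val node = {
          conn.setRequestMethod("GET")
          conn.connect()
          // Fails here when the bytes are not valid UTF-8
          xml.XML.load(conn.getInputStream)
        }
        // rest of the handler is not shown in the original post
    }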

There are several sources. All of them send an XML document, but in a couple of cases the document declares itself as UTF-8 when it is not actually encoded that way. Because of that, I get the following error on the XML.load line:

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:405)


How can I solve this problem?

Thanks in advance,

GA


Re: xml.XML.load encoding problem

by Daniel Sobral

Have you tried the Xhtml loader?

On Tue, Nov 3, 2009 at 1:17 PM, GA <my_lists@...> wrote:
How can I solve this problem?

--
Daniel C. Sobral

Veni, vidi, veterni.

Re: xml.XML.load encoding problem

by GA-4

No, I haven't. How should I use it? I do not see any load method for this object.

Thanks,

GA


On Nov 3, 2009, at 5:00 PM, Daniel Sobral wrote:

Have you tried the Xhtml loader?

Re: xml.XML.load encoding problem

by Daniel Sobral

Sorry, I meant scala.xml.parsing.XhtmlParser.

On Tue, Nov 3, 2009 at 5:38 PM, GA <my_lists@...> wrote:
No, I haven't. How should I use it? I do not see any load method for this object.

Re: xml.XML.load encoding problem

by Walter Chang

You should use XML.load(reader: java.io.Reader) instead. By providing a reader to load(), you can specify the character encoding when constructing the reader:

XML.load(new java.io.InputStreamReader(conn.getInputStream, "UTF-16"))
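A minimal sketch of the Reader approach, assuming the mislabeled documents are really ISO-8859-1. The thread never says which encoding the sources actually use, and the "UTF-16" above reads as an example value. The names ExplicitCharsetDemo, loadXml and mislabeled are hypothetical, and the sample document is made up. On Scala 2.11 and later, scala.xml ships as the separate scala-xml module, so it has to be on the classpath. Because the parser is handed characters rather than bytes, the encoding named in the XML declaration should no longer drive the decoding.

```scala
import java.io.{ByteArrayInputStream, InputStream, InputStreamReader}
import scala.xml.{Elem, XML}

object ExplicitCharsetDemo {
  // Hypothetical helper: decode the stream with the charset the bytes are
  // really in, instead of trusting the XML declaration.
  def loadXml(in: InputStream, charsetName: String): Elem =
    XML.load(new InputStreamReader(in, charsetName))

  // Made-up sample: declares UTF-8 but is actually encoded as ISO-8859-1,
  // so the accented character is the single byte 0xE9.
  val mislabeled: Array[Byte] =
    "<?xml version='1.0' encoding='UTF-8'?><a>caf\u00e9</a>".getBytes("ISO-8859-1")

  def main(args: Array[String]): Unit = {
    val node = loadXml(new ByteArrayInputStream(mislabeled), "ISO-8859-1")
    println(node.text)
  }
}
```

In the original snippet, the call would become loadXml(conn.getInputStream, "ISO-8859-1"), with the charset name swapped for whatever each source really sends.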

On Tue, Nov 3, 2009 at 11:17 PM, GA <my_lists@...> wrote:
How can I solve this problem?

--
.......__o
.......\<,
....( )/ ( )...

Re: xml.XML.load encoding problem

by GA-4

Thank you.

XML.load(new java.io.InputStreamReader(conn.getInputStream, "UTF-8"))

This line fixed the problem.

GA



On Nov 4, 2009, at 4:15 AM, Walter Chang wrote:

> XML.load(new java.io.InputStreamReader(conn.getInputStream, "UTF-16"))
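
A note on why wrapping the stream with "UTF-8" can help even when the bytes are not valid UTF-8. As far as I can tell, InputStreamReader substitutes the replacement character U+FFFD for malformed byte sequences instead of throwing, while the parser's own UTF-8 reader is strict. The sketch below tries both paths on a made-up document. LenientUtf8Demo and its members are hypothetical names, and it assumes the JDK's default SAX parser and scala-xml on the classpath.

```scala
import java.io.{ByteArrayInputStream, InputStreamReader}
import scala.util.Try
import scala.xml.XML

object LenientUtf8Demo {
  // Made-up sample: declares UTF-8, but the accented character is the
  // single ISO-8859-1 byte 0xE9, which is not valid UTF-8 here.
  val mislabeled: Array[Byte] =
    "<?xml version='1.0' encoding='UTF-8'?><a>caf\u00e9</a>".getBytes("ISO-8859-1")

  // Raw stream: the parser decodes the bytes itself and rejects them.
  val strict = Try(XML.load(new ByteArrayInputStream(mislabeled)))

  // Reader: the bytes are decoded before the parser sees them, and the
  // malformed byte is replaced with U+FFFD.
  val lenient =
    XML.load(new InputStreamReader(new ByteArrayInputStream(mislabeled), "UTF-8"))

  def main(args: Array[String]): Unit = {
    println(strict.isFailure)
    println(lenient.text.contains("\uFFFD"))
  }
}
```

The load no longer fails, but the affected characters are lost. If the real encoding of a source is known, passing that charset name to the reader keeps the text intact.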