Proper parsing of InputStreams with encoding attribute in XML prologue.

Hi there,

We are using SimpleXML in Carrot2 (Carrot2.org), it's great stuff, thanks.

We would like to point out one potential issue that could be easily fixed: currently, the Persister class forces UTF-8 encoding on byte streams, which may not be correct (let's say the XML is provided by users and not generated by SimpleXML). The relevant section of Persister could be easily fixed to rely on the XML parser's character detection and decoding. So:

       public <T> T read(Class<? extends T> type, InputStream source) throws Exception {
    -     return (T)read(type, source, "utf-8");
    +     return (T)read(type, NodeBuilder.read(source));
       }

and then in NodeBuilder add a method that relies on the XML parser's decoding:

    +  public static InputNode read(InputStream source) throws Exception {
    +     return read(factory.createXMLEventReader(source));
    +  }

The entire patch against 2.1.4 is attached. We would appreciate it if this could be integrated into the next version (or at least if NodeBuilder's read(XMLEventReader) could be made public to allow alternative parsing strategies):

       private static InputNode read(XMLEventReader source) throws Exception {

Dawid

[byte-streams.patch]

diff --git a/src/org/simpleframework/xml/core/Persister.java b/src/org/simpleframework/xml/core/Persister.java
index 43e2c91..eac218d 100644
--- a/src/org/simpleframework/xml/core/Persister.java
+++ b/src/org/simpleframework/xml/core/Persister.java
@@ -451,7 +451,7 @@ public class Persister implements Serializer {
     * @throws Exception if the object cannot be fully deserialized
     */
    public <T> T read(Class<? extends T> type, InputStream source) throws Exception {
-      return (T)read(type, source, "utf-8");
+      return (T)read(type, NodeBuilder.read(source));
    }
 
    /**
@@ -464,6 +464,8 @@ public class Persister implements Serializer {
     * @param type this is the class type to be deserialized from XML
     * @param source this provides the source of the XML document
     * @param charset this is the character set to read the XML with
+    *        (if the encoding is unknown, you should rely on the XML parser
+    *        instead -- see {@link #read(Class, InputStream)}).
     *
     * @return the object deserialized from the XML document
     *
diff --git a/src/org/simpleframework/xml/stream/NodeBuilder.java b/src/org/simpleframework/xml/stream/NodeBuilder.java
index e855002..238dcad 100644
--- a/src/org/simpleframework/xml/stream/NodeBuilder.java
+++ b/src/org/simpleframework/xml/stream/NodeBuilder.java
@@ -22,8 +22,7 @@ package org.simpleframework.xml.stream;
 
 import javax.xml.stream.XMLInputFactory;
 import javax.xml.stream.XMLEventReader;
-import java.io.Reader;
-import java.io.Writer;
+import java.io.*;
 
 /**
  * The <code>NodeBuilder</code> object is used to create either an
@@ -69,6 +68,19 @@ public final class NodeBuilder {
     * @param source this contains the contents of the XML source
     *
     * @throws Exception thrown if there is an I/O exception
+    */
+   public static InputNode read(InputStream source) throws Exception {
+      return read(factory.createXMLEventReader(source));
+   }
+
+   /**
+    * This is used to create an <code>InputNode</code> that can be
+    * used to read XML from the specified reader. The reader will
+    * be positioned at the root element in the XML document.
+    *
+    * @param source this contains the contents of the XML source
+    *
+    * @throws Exception thrown if there is an I/O exception
     */
    private static InputNode read(XMLEventReader source) throws Exception {
       return new NodeReader(source).readRoot();

_______________________________________________
Simple-support mailing list
Simple-support@...
https://lists.sourceforge.net/lists/listinfo/simple-support
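
For anyone who wants to see the effect outside SimpleXML, here is a minimal JDK-only sketch. It is not part of the patch, and the class and method names (`EncodingDemo`, `readLettingParserDecode`, `readForcingUtf8`) are made up for illustration. It parses the same ISO-8859-1 encoded bytes twice with StAX: once by handing the raw `InputStream` to `XMLInputFactory.createXMLEventReader`, so the parser can honor the encoding declaration, and once through a `Reader` that decodes as UTF-8 before the parser sees the bytes, which is roughly the situation the message above describes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; it uses only the JDK's StAX API, not SimpleXML.
final class EncodingDemo {
    private static final XMLInputFactory FACTORY = XMLInputFactory.newInstance();

    // A small document declared and encoded as ISO-8859-1 ("caf" + e-acute).
    static byte[] sampleLatin1() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Concatenates all character data found in the event stream.
    private static String text(XMLEventReader events) throws XMLStreamException {
        StringBuilder out = new StringBuilder();
        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();
            if (event.isCharacters()) {
                out.append(event.asCharacters().getData());
            }
        }
        events.close();
        return out.toString();
    }

    // The parser receives raw bytes and decodes them using the XML declaration.
    static String readLettingParserDecode(byte[] xml) {
        try {
            return text(FACTORY.createXMLEventReader(new ByteArrayInputStream(xml)));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // The bytes are decoded as UTF-8 first; the parser only ever sees characters,
    // so the encoding declaration in the prolog can no longer take effect.
    static String readForcingUtf8(byte[] xml) {
        try {
            InputStreamReader reader =
                new InputStreamReader(new ByteArrayInputStream(xml), StandardCharsets.UTF_8);
            return text(FACTORY.createXMLEventReader(reader));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = sampleLatin1();
        System.out.println("parser-decoded: " + readLettingParserDecode(xml));
        // The forced-UTF-8 result is expected to differ, because the byte 0xE9
        // is not a valid UTF-8 sequence on its own.
        System.out.println("forced UTF-8:   " + readForcingUtf8(xml));
    }
}
```

The sketch only illustrates the general StAX behaviour the proposed `NodeBuilder.read(InputStream)` relies on; how SimpleXML wires this up internally is shown in the patch above, not here.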
|
|
Re: Proper parsing of InputStreams with encoding attribute in XML prologue.

Sounds good, I'll add this in.

Niall Gallagher
RBS Global Banking & Markets

-----Original Message-----
From: Dawid Weiss [mailto:dawid.weiss@...]
Sent: 29 October 2009 18:36
To: simple-support@...
Subject: [Simple-support] Proper parsing of InputStreams with encoding attribute in XML prologue.
|
|
Re: Proper parsing of InputStreams with encoding attribute in XML prologue.

Thanks Niall.

Dawid

On Fri, Oct 30, 2009 at 10:04 AM, <niall.gallagher@...> wrote:
> Sounds good, ill add this in.
|
|
Re: Proper parsing of InputStreams with encoding attribute in XML prologue.

Hi Niall,

I see a new version of Simple is out and, sadly, it does not contain the patch for the forced UTF-8 decoding. Is there any schedule for when this could be integrated? Our stuff sort of depends on this change and we'd hate to fork and maintain SimpleXML internally just for two lines of code...

Dawid

On Fri, Oct 30, 2009 at 10:04 AM, <niall.gallagher@...> wrote:
> Sounds good, ill add this in.
|
|
Re: Proper parsing of InputStreams with encoding attribute in XML prologue.

Hi,

I did not change the method for InputStream on this one; however, I did add NodeBuilder.read(InputStream) to return an InputNode from the stream. I need to test a bit more before I change to this in the Persister. It should be possible to use NodeBuilder.read(InputStream) to create an InputNode you can pass to the Persister.read(Class,InputNode) method. I'll try to get it in for the next release.

Niall

Niall Gallagher
RBS Global Banking & Markets

-----Original Message-----
From: Dawid Weiss [mailto:dawid.weiss@...]
Sent: 23 November 2009 10:13
To: GALLAGHER, Niall, GBM
Cc: simple-support@...
Subject: Re: [Simple-support] Proper parsing of InputStreams with encoding attribute in XML prologue.
|
|
Re: Proper parsing of InputStreams with encoding attribute in XML prologue.

Thanks. I really don't see how it could break things (unless you have a broken XML parser in Java, which may occasionally happen, unfortunately). I'll look into the public method you mentioned -- should do for us for the moment.

Dawid

On Mon, Nov 23, 2009 at 11:16 AM, <niall.gallagher@...> wrote:
> [...] It should be possible to use NodeBuilder.read(InputStream) to create an InputNode you can pass to the Persister.read(Class,InputNode) method.