Doctype and encoding storage

Hi,

I have been using eXist daily on production systems for about two years, and I discovered only today that I cannot store and retrieve an XML document without information loss ;-)

First of all, the following line is removed when I store documents that have it as their first line:

<?xml version="1.0" encoding="UTF-16"?>

The document is stored without error (via the Interactive client or the Java XML-RPC API), but the line has disappeared once the document is retrieved (via the Interactive client or the Java XML-RPC API). Via the REST interface the line comes back, but with UTF-8 encoding (I suppose it has been rewritten).

I notice the same behaviour with DOCTYPE declarations like this one:

<!DOCTYPE dictionnaire SYSTEM "http://toto.ca/blabla.dtd">

Note that the given DTD isn't recorded in catalog.xml, and that I am currently using eXist 1.2.6 on Linux (Ubuntu Server).

Am I wrong? How can I keep these lines when storing a document in the eXist database? The encoding is very important to me: how can I be sure that eXist will store and retrieve it correctly?

Thank you very much in advance for your help!

Best regards,

Benoit (mercibe)

_______________________________________________
Exist-open mailing list
Exist-open@...
https://lists.sourceforge.net/lists/listinfo/exist-open
|
|
Re: Doctype and encoding storage

Benoit,

From the REST interface with XQuery, you can use the serialization options to output the encoding and the document type declaration: http://exist-db.org/xquery.html#serialization

That should enable you to make your output look the same as your input.

Cheers,
Adam

2009/6/23 Benoit Mercier <Benoit.Mercier@...>:
> Am I wrong? How to keep these lines when storing document in eXist
> database? The encoding is very important for me: how to be sure that
> eXist will store/retrieve it correctly ?

--
Adam Retter
eXist Developer
{ United Kingdom }
adam@...
irc://irc.freenode.net/existdb
|
|
Re: Doctype and encoding storage

> Am I wrong? How to keep these lines when storing document in eXist
> database? The encoding is very important for me: how to be sure that
> eXist will store/retrieve it correctly ?

Neither the XML declaration nor the doctype is part of the document model.

With respect to the character encoding, eXist relies on Java's Unicode handling: once the text of the document has been parsed, it is processed as Unicode, no matter which encoding the file used on disk. When a document is written out, it is the serializer's job to choose an output encoding. Use the serialization options to determine which encoding is used.

eXist also stores the doctype declaration in the document's metadata, but does not print it by default when serializing the document (mainly to avoid potential issues with internal entity declarations). DTDs are always a bit problematic, as they are not themselves XML.

Wolfgang
|
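The point made in the reply above (the XML declaration and the doctype belong to serialization, not to the parsed document) can be illustrated outside eXist with the JAXP classes that ship with the JDK. This is only a sketch of the general principle, not eXist's own API or its storage code: the class name EncodingRoundTrip, its two helper methods, and the sample content of the dictionnaire element are made up for illustration. It parses a UTF-16 document, then writes it back with the encoding and the DOCTYPE system identifier requested explicitly through output properties.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; plain JAXP, no eXist dependency.
public class EncodingRoundTrip {

    // Parse raw bytes; the parser detects the encoding from the BOM / XML declaration.
    public static Document parse(byte[] xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Resolve the external DTD to an empty stream so nothing is fetched over the network.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new ByteArrayInputStream(xml));
    }

    // Serialize with an explicitly chosen encoding and (optionally) a DOCTYPE system identifier.
    public static byte[] serialize(Document doc, String encoding, String doctypeSystem)
            throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        t.setOutputProperty(OutputKeys.ENCODING, encoding);
        if (doctypeSystem != null) {
            t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctypeSystem);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String source = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                + "<!DOCTYPE dictionnaire SYSTEM \"http://toto.ca/blabla.dtd\">"
                + "<dictionnaire><mot>\u00e9t\u00e9</mot></dictionnaire>";
        Document doc = parse(source.getBytes(StandardCharsets.UTF_16));

        // After parsing, the text is ordinary Unicode strings, whatever the input encoding was.
        System.out.println("text: " + doc.getDocumentElement().getTextContent());

        // The doctype is available as metadata and has to be requested on output.
        byte[] out = serialize(doc, "UTF-16", doc.getDoctype().getSystemId());

        // Re-parse the output to check what a consumer would see.
        Document again = parse(out);
        System.out.println("declared encoding: " + again.getXmlEncoding());
        System.out.println("doctype system id: " + again.getDoctype().getSystemId());
    }
}
```

The text content survives unchanged whichever output encoding is chosen; only the byte representation and the declaration line differ. In eXist, the corresponding choices are made through the serialization options mentioned earlier in the thread.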
|
Re: Doctype and encoding storage

> eXist also stores the doctype declaration in the document's metadata,
> but will not print it out by default when serializing the document
> (mainly to avoid potential issues with internal entity declarations).
> DTD's are always a bit problematic as they are themselves not XML.

Is it possible to override the default and have eXist output the doctype declaration during serialization?

Thanks.

John Craft