Hello,
I'm not sure if this is a bug or a feature, but I thought I would report
it anyway... I have attached (also reproduced below) a simple example
that illustrates the problem. I have tested this with Java 1.6EE, and
JDOM's Jan 9th, 2009 nightly build as well as the standard 1.1
release.
In this example, I am trying to prevent the expansion of the entity
"&minus;" in an XHTML document that is being read in and
then immediately written out. I create an instance of SAXBuilder, call
setExpandEntities(false), then call the build() method on an input XHTML
doc. For simplicity, I then use an instance of XMLOutputter to print the
parsed document to standard out. (Even though I don't think it's necessary
for standard out, I also make sure the encoding is consistent between the
Format and the OutputStreamWriter, and that it is the common "US-ASCII"
encoding.)
The original XHTML document uses the entity:
&minus;
But, the resulting XHTML printed to standard out shows:
−−
Apparently, calling setExpandEntities(false) had the effect
of duplicating the character. I would expect that setting expand entities
to 'false' would simply leave the "&minus;" as it is, without
duplicating it in US-ASCII formatting.
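One possible explanation, which is only my assumption and not something I have checked against JDOM's source, is that the underlying SAX parser reports an entity reference twice over: once as startEntity/endEntity boundary events and once as ordinary character data in between. The JDK-only sketch below logs those events for a cut-down document. The class name EntityEvents and the inline DTD (a stand-in for the XHTML DTD that declares only "minus") are hypothetical and used just for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class EntityEvents {
    // Hypothetical stand-in for the XHTML DTD: declares only the "minus" entity.
    static final String XML =
        "<!DOCTYPE html [<!ENTITY minus \"&#8722;\">]>"
        + "<html><body><p>&minus;</p></body></html>";

    // Parses XML and returns the entity and character events in document order.
    static List<String> events() throws Exception {
        final List<String> log = new ArrayList<String>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void startEntity(String name) {
                log.add("startEntity:" + name);
            }

            @Override
            public void endEntity(String name) {
                log.add("endEntity:" + name);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.add("characters:" + new String(ch, start, length));
            }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Entity boundaries are only reported to a registered LexicalHandler.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(XML)));
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events()) {
            System.out.println(event);
        }
    }
}
```

If a builder records an entity reference when it sees startEntity and also keeps the characters that arrive before endEntity, both would end up in the output. Whether that is what happens inside SAXBuilder here is a guess on my part.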
This isn't a big problem because if the default value, 'true', is used
for entity expansion, the resulting output will simply contain
"−" instead of duplicating the character. Even
though the original entity encoding has changed, the resulting output
will still behave/appear the same as the original, which is probably
what's normally required.
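For what it's worth, the reason the expanded character cannot come out unchanged in US-ASCII is that the minus sign (U+2212) has no US-ASCII encoding, so a serializer has to write some kind of character reference for it. The sketch below shows that check using only the JDK. The helper escapeForAscii is hypothetical, it is not JDOM's code, and it deliberately ignores markup escaping of characters such as "&" and "<".

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class AsciiEscape {
    // Hypothetical helper: keeps code points that US-ASCII can encode and
    // writes every other code point as a hexadecimal character reference.
    static String escapeForAscii(String text) {
        CharsetEncoder ascii = Charset.forName("US-ASCII").newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String s = new String(Character.toChars(codePoint));
            if (ascii.canEncode(s)) {
                out.append(s);
            } else {
                out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeForAscii("a \u2212 b"));
    }
}
```

A reference written this way still renders as the same character in a browser, which is why the expanded output behaves like the original even though the named entity is gone.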
- Thanks for any feedback & Happy 2009,
- David W.
======= INPUT XHTML DOCUMENT START =======
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl"
href="http://www.w3.org/Math/XSL/pmathml.xsl"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-strict.dtd">
<html>
<head>
</head>
<body>
<p>&minus;</p>
</body>
</html>
======= INPUT XHTML DOCUMENT END =======
======= TEST JAVA CODE START =======
import java.io.File;
import java.io.OutputStreamWriter;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class Test {
    public static void main(String[] args) throws Exception {
        File fileInput = new File("testEntity.xml");

        // Parse the input without expanding entity references.
        SAXBuilder b = new SAXBuilder();
        b.setIgnoringElementContentWhitespace(true);
        b.setExpandEntities(false);
        Document doc = b.build(fileInput);

        // Clear the internal DTD subset so it is not written out.
        doc.getDocType().setInternalSubset(null);

        // Write the document to standard out as US-ASCII.
        Format format = Format.getPrettyFormat();
        format.setEncoding("US-ASCII");
        XMLOutputter outputter = new XMLOutputter();
        outputter.setFormat(format);
        outputter.output(doc, new OutputStreamWriter(System.out, format.getEncoding()));
    }
}
======= TEST JAVA CODE END =======