StAX from source

Hi

Having previously been a user of various SAX libraries, DOM (and JDOM), I was very happy to have found StAX and Woodstox! It feels a lot more natural to me, even after quite a long time with those other implementations.

Anyway, I am at the point where I need to strip out extraneous characters (linefeeds, long runs of repeated whitespace and tabs, etc.) from within some of my elements. I had hoped to understand how to do this (if possible) by downloading the source jars and beginning to debug in Eclipse, attaching the source in the normal way. I have been through some of the javadoc and documentation but I can't see how I might do this with the existing implementations.

1. Is there a way of doing what I need in terms of stripping those extraneous characters? For most of my elements, I am iterating through elements with nextTag() and getElementText / getElementAs... and I am not doing anything within the elements. Even if there is, I'd like to get my source build working - I may be able to provide help and input back to the project.

2. How do you build woodstox AND the stax2-3.0-api.jar from source? I downloaded the woodstox source and tried "ant dist -f build.xml" but I get errors because it tries to import some other build files (for osgi) which aren't there.

3. I can't find all of the source for the stax2-3.0-api.jar - this contains classes in the org.codehaus.stax2 namespace. The org.codehaus.stax2.ri code is in the woodstox source download but not the rest of the stuff in that jar (evt, io, etc).

Thanks all
Re: StAX from source

On Mon, Sep 21, 2009 at 3:58 AM, Chris Faulkner <chris.faulkner@...> wrote:
> Hi
>
> Having previously been a user of various SAX libraries, DOM (and JDOM), I
> was very happy to have found StAX and Woodstox ! It feels a lot more natural
> to me, even after quite a long time with those other implementations.

Thanks!

> Anyway, I am at the point where I have a need to strip out extraneous
> characters (linefeeds, long sections of repeating whitespace and tabs, etc)
> from within some of my elements. I had hoped to understand how do this (if

Makes sense, yes.

> possible) by downloading the source jars and beginning to debug in Eclipse by
> attaching source in the normal way in there. I have been through some of the
> javadoc and documentation but I can't see how I might do this with the
> existing implementations.

There is no option that does this directly, unless you have a DTD that defines what whitespace is "ignorable". By default XML assumes that no textual content is meaningless, so whitespace is reported as regular CHARACTERS. However, a DTD can define a content model for an element that does not include any character data (#PCDATA); if so, any white space inside that element must be ignorable white space (for indentation), and it will be reported as SPACE.

So filtering these events out would be one way to achieve this goal, iff there's a DTD to use. Or theoretically Schema/RNG, but I don't remember whether that has been tested to work this way (both can declare element-only content, but I am not sure whether that gets properly propagated through the validation API).

> 1. Is there a way of doing what I need in terms of stripping those
> extraneous characters ? For most of my elements, I am iterating through
> elements with nextTag() and getElementText / getElementAs... and I am not

One way to trim leading/trailing space would be to just call trim() on the result of getElementText()? One problem is that getElementText() only works for text-only content.
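To make the trim() idea concrete, here is a minimal sketch using only the standard javax.xml.stream API. The class ElementTextCleaner and its methods collapseWhitespace and readCleanText are hypothetical names invented for this illustration; they are not part of Woodstox or the Stax2 API. Besides trimming the ends, the helper also collapses internal runs of whitespace, which is an assumption about what "extraneous characters" should mean here.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of Woodstox or Stax2.
class ElementTextCleaner {

    // Trims both ends and collapses every internal run of whitespace
    // (spaces, tabs, linefeeds) into a single space.
    static String collapseWhitespace(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    // Returns the cleaned text of the first element with the given local
    // name, or null if no such element exists. getElementText() throws an
    // XMLStreamException if the element contains child elements, so this
    // only works for text-only content.
    static String readCleanText(String xml, String elementName)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    return collapseWhitespace(reader.getElementText());
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }
}
```

In a nextTag() loop like the one described in the question, the same collapseWhitespace call could simply wrap each getElementText() result. Elements with mixed content would need a different approach, since getElementText() fails on them.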
For what it's worth, I personally use StaxMate for much of my XML processing; it builds on the Stax API and implements fully streaming extensions that allow somewhat more convenient access. Features like "advanced" white space processing would fit nicely within that framework (no such functionality exists in StaxMate yet either, though).

> doing anything within the elements. Even if there is, I'd like to get my
> source build working - I may be able to provide help and input back to the
> project.

Yes, source build should work. And help is always appreciated! So:

> 2. How do you build woodstox AND the stax2-3.0-api.jar from source ? I
> downloaded the woodstox source and tried "ant dist -f build.xml" but I get
> errors because it tries to import some other build files (for osgi) which
> aren't there.
>
> 3. I can't find all of the source for the stax2-3.0-api.jar - this contains
> classes in org.codehaus.stax2 namespace. The org.codehaus.stax2.ri code is
> in the woodstox source download but not the rest of the stuff in that jar
> (evt, io, etc).

Is this from trunk or one of the branches? All sources should of course be included, so there may be an error in the build setup, probably due to the refactoring done to split the jars (core vs others). :-/

One known problem is that the OSGi build task is a bit tricky to get working; I just couldn't get it working without adding the task jar to my local Ant lib dir (so any help to make that work would be much appreciated!). The jar in question is under lib/osgi/, and it's declared in and invoked from build-osgi.xml.

I am not aware of other problems, but it is of course possible that something is missing from the source package. Perhaps you could download the sources from the svn repo to see what is missing?

-+ Tatu +-