Thread: JAXB can generate Illegal XML characters

Replies: 2 - Last Post: Jul 19, 2009 2:56 AM by: bratwurst
ranboii

Posts: 1
JAXB can generate Illegal XML characters
Posted: Mar 17, 2009 7:06 PM

Apparently some characters are illegal in the content of an XML 1.0 document, and there is no way to "escape" them to make them legal: even a character reference such as &#x1F; is rejected by a conforming parser. For example, 0x1F is an illegal character. So if a Java object happens to contain that character (which is legal in a Java string), JAXB will quietly generate invalid XML!

Then when we try to unmarshal that XML back into an object, an exception is thrown.

This means that the ability of JAXB to correctly marshal and unmarshal an object depends on what data happens to be in the fields of the object! That doesn't seem like a very safe or general solution. It means, for example, that if we marshal an object and store it in a database, then months later when we go to read the object, we will suddenly find out that it is unreadable! Yikes!

One obvious solution to this is that JAXB should be throwing an exception when the object is marshalled, i.e., it should detect that it cannot generate XML to represent the object.
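Until JAXB does this itself, the check can be done by hand before marshalling. Here is a minimal sketch, assuming XML 1.0 and using only the JDK; the class and method names (XmlCharCheck, requireXmlSafe) are made up for illustration and are not part of JAXB:

```java
final class XmlCharCheck {

    // XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the index of the first illegal character, or -1 if there is none.
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isLegalXml10(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    // Hypothetical helper: call on each string value before handing the object to the marshaller.
    static String requireXmlSafe(String s) {
        int i = firstIllegalIndex(s);
        if (i >= 0) {
            throw new IllegalArgumentException(String.format(
                    "Illegal XML 1.0 character U+%04X at index %d", s.codePointAt(i), i));
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("hello"));      // -1
        System.out.println(firstIllegalIndex("ab\u001Fcd")); // 2
    }
}
```

This fails at write time rather than months later at read time, which is the behaviour asked for above. It does not make the data storable, it only moves the error to where it can be handled.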

Another, more powerful option would be for JAXB to use its own escaping mechanism to escape any characters that would be illegal in XML. Then when unmarshalling, it would look for such escaped characters in strings and "unescape" them (using the same mechanism), so that it could guarantee that any Java string can be preserved through the marshalling/unmarshalling process.
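To make the idea concrete, here is one possible reversible encoding, sketched with the JDK only. The scheme is an assumption chosen for illustration, not anything JAXB defines: a literal backslash is doubled, and each illegal character is written as a backslash followed by four hex digits. XmlSafeCodec is a hypothetical name.

```java
final class XmlSafeCodec {

    // Same XML 1.0 "Char" ranges as in the specification.
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Doubles backslashes and replaces each illegal character with
    // a backslash plus four hex digits. Illegal code points always
    // fit in four hex digits (controls, lone surrogates, FFFE, FFFF).
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp == '\\') {
                sb.append("\\\\");
            } else if (isLegalXml10(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append('\\').append(String.format("%04X", cp));
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Inverse of escape(): a doubled backslash becomes one backslash,
    // a backslash plus four hex digits becomes the original character.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\') {
                sb.append(c);
            } else if (i + 1 < s.length() && s.charAt(i + 1) == '\\') {
                sb.append('\\');
                i++;
            } else {
                if (i + 5 > s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 5), 16));
                i += 4;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "a\u001Fb\\c";
        String escaped = escape(original);
        System.out.println(escaped);                             // a\001Fb\\c
        System.out.println(unescape(escaped).equals(original));  // true
    }
}
```

The obvious cost is that every reader of the XML has to know about the convention, which is presumably why it would be safer inside JAXB than in each application's own wrapper.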

We could, of course, use our own wrapper to do a similar thing, but it would seem much safer for JAXB to be able to do this.

Is there a bug logged on this? Is there a standard workaround for it?

dansiviter

Posts: 36
Re: JAXB can generate Illegal XML characters
Posted: Mar 18, 2009 8:47 AM   in response to: ranboii

I found another one a few days ago. JAXB uses a BigDecimal for 'xsd:decimal' types and marshals it to XML with BigDecimal#toString(). However, #toString() can produce exponent notation, which the W3C XML Schema spec does not allow for decimals (http://www.w3.org/TR/1999/WD-xmlschema-2-19990924/#decimal). I couldn't find anywhere that this had been raised as a fault.

I got round it using a javax.xml.bind.annotation.adapters.XmlAdapter that calls BigDecimal#toPlainString() instead.
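The difference between the two methods is easy to see with the JDK alone. The sketch below is not the adapter from this post; it only shows the two conversions such an adapter would contain. In a real XmlAdapter<String, BigDecimal> these would be the marshal and unmarshal overrides. PlainDecimalFormat is a hypothetical name, and the JAXB dependency is left out so the snippet compiles on its own.

```java
import java.math.BigDecimal;

final class PlainDecimalFormat {

    // What the adapter's marshal(BigDecimal) would return: no exponent notation.
    static String marshal(BigDecimal value) {
        return value == null ? null : value.toPlainString();
    }

    // What the adapter's unmarshal(String) would return.
    static BigDecimal unmarshal(String text) {
        return text == null ? null : new BigDecimal(text.trim());
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("1E+3");
        System.out.println(big.toString()); // 1E+3
        System.out.println(marshal(big));   // 1000

        BigDecimal small = new BigDecimal("0.0000001");
        System.out.println(small.toString()); // 1E-7
        System.out.println(marshal(small));   // 0.0000001
    }
}
```

One thing to be aware of: the plain form does not preserve the scale, so a value that went out as 1E+3 comes back as 1000. The two compare as equal with compareTo() but not with equals().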

bratwurst

Posts: 1
Re: JAXB can generate Illegal XML characters
Posted: Jul 19, 2009 2:56 AM   in response to: ranboii

Firstly I agree with you that this behaviour is strange... It would be nice if Kohsuke would comment on it.

There is a way to supply your own character escape handler, though. Implement the interface
com.sun.xml.bind.marshaller.CharacterEscapeHandler and pass it to the marshaller with
marshaller.setProperty("com.sun.xml.bind.characterEscapeHandler", myEscaperObject);

I've attached my implementation, adapted from Kohsuke's original (which didn't escape the ' and " characters properly).
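The attachment is not reproduced in this thread, so what follows is not that implementation, only a rough sketch of what such a handler could look like. The escape method uses the same parameter list as CharacterEscapeHandler.escape; the "implements" clause is left as a comment so the snippet compiles without the JAXB RI on the classpath. Replacing illegal characters with U+FFFD is an assumption made here for illustration (it loses data); throwing an IOException instead would be the fail-fast alternative. SketchEscapeHandler and escapeToString are hypothetical names.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// With the JAXB RI available, this class would be declared as:
//   implements com.sun.xml.bind.marshaller.CharacterEscapeHandler
final class SketchEscapeHandler {

    // Char-level check. Surrogates are passed through on the assumption
    // that they appear as well-formed pairs.
    private static boolean isWritable(char c) {
        return c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
    }

    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int limit = start + length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            switch (c) {
                case '&':
                    out.write("&amp;");
                    break;
                case '<':
                    out.write("&lt;");
                    break;
                case '>':
                    out.write("&gt;");
                    break;
                case '"':
                    // Quotes only need escaping inside attribute values.
                    if (isAttVal) out.write("&quot;"); else out.write(c);
                    break;
                case '\'':
                    if (isAttVal) out.write("&apos;"); else out.write(c);
                    break;
                default:
                    // Assumption: substitute the replacement character for
                    // anything XML 1.0 cannot represent.
                    out.write(isWritable(c) ? c : (char) 0xFFFD);
            }
        }
    }

    // Hypothetical convenience wrapper for trying the handler without a marshaller.
    static String escapeToString(String s, boolean isAttVal) {
        StringWriter w = new StringWriter();
        try {
            new SketchEscapeHandler().escape(s.toCharArray(), 0, s.length(), isAttVal, w);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeToString("a<b & \"c\"", false)); // a&lt;b &amp; "c"
        System.out.println(escapeToString("a<b & \"c\"", true));  // a&lt;b &amp; &quot;c&quot;
    }
}
```

An instance of the real handler would then be registered with the setProperty call shown above.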
