
Character / char oddness

by Sven Haiges-3

Hi all,

I have to convert a small piece of Java to Groovy so I can include it in a command-line Groovy script. The part I am working on converts a String to XML-safe text: it escapes the five predefined XML entities, leaves characters in the ASCII range 32 to 126 untouched, and encodes everything else as a numeric character reference such as &#1234;.
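For reference, the behaviour described above can be sketched in plain Java roughly as follows. This is only a minimal illustration, not the original EntityCodec: the class name XmlEscapeSketch and its encode method are made up for this example, and it works per UTF-16 char, just like the loop in the Groovy port.

```java
// Hypothetical sketch (not the original EntityCodec): escapes the five XML
// entities, keeps printable ASCII (32..126) as-is, and writes every other
// char as a decimal numeric character reference.
class XmlEscapeSketch {

    public static String encode(String original) {
        if (original == null) {
            return null;
        }
        StringBuilder output = new StringBuilder();
        for (int i = 0; i < original.length(); i++) {
            char ch = original.charAt(i);
            switch (ch) {
                case '&':  output.append("&amp;");  break;
                case '"':  output.append("&quot;"); break;
                case '\'': output.append("&apos;"); break;
                case '<':  output.append("&lt;");   break;
                case '>':  output.append("&gt;");   break;
                default:
                    if (ch >= 32 && ch <= 126) {
                        // printable ASCII passes through unchanged
                        output.append(ch);
                    } else {
                        // everything else becomes &#NNN; using the char's numeric value
                        output.append("&#").append((int) ch).append(';');
                    }
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        // \u00BC is the one-quarter sign; the escape avoids source-encoding issues
        System.out.println(encode("a<b \u00BC")); // prints a&lt;b &#188;
    }
}
```

Writing the non-ASCII test character as a \u escape keeps the example independent of the encoding used to save or paste the source.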

You can paste the Groovy script at the end of this message straight into groovyConsole to give it a try.

The assertion fails once I try to encode ¼, which should become &#188; according to the old Java code. It is definitely not in the ASCII range 32 to 126, yet I can still see that the println 'in ascii' is called, which means the code block

            if (!isXmlEntity && ch >= 32 && ch <= 126)
            {
                output.append(ch)
                println "in ascii ${ch}"
                continue
            }

is being executed.

The question: why? This currently breaks my conversion... am I hitting a Groovy gotcha?
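One thing that might be worth ruling out first (an assumption, not a diagnosis) is whether the pasted literal still holds the single character U+00BC by the time the code runs. A small helper along these lines, shown in Java with the made-up name CodePointDump, lists the numeric value of every char in a string so the input can be checked independently of the codec:

```java
// Hypothetical diagnostic helper: lists the decimal value of each UTF-16
// char in a string, separated by spaces.
class CodePointDump {

    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) s.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The one-quarter sign written as an escape, so source encoding cannot alter it
        System.out.println(dump("\u00BC")); // prints 188
    }
}
```

If the same call on the pasted '¼' literal printed anything other than 188, the input would already differ from what the test expects before the range check is reached.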

Cheers
Sven



def test = [
    'this is a test' : 'this is a test',
    '<>\'"&' : '&lt;&gt;&apos;&quot;&amp;',
    '¼' : '&#188;',
    '©¼ÇÈÉÊËÐÑßàäç™' : '&#169;&#188;&#199;&#200;&#201;&#202;&#203;&#208;&#209;&#223;&#224;&#228;&#231;&#8482;' //output from original EntityCodec.XML
]

test.each { input, expected ->
    assert (expected == XMLCodec.encode(input))
}

class XMLCodec
{
    static encode = { original ->
   
        if (original == null)
            return null
           
        char[] originalChars = original.toCharArray()
        StringBuffer output = new StringBuffer()
       
        for (char ch: originalChars)
        {
            Character character = new Character(ch);
            def isXmlEntity = false
           
            if ( ch == '&' || ch == '"' || ch == '\'' || ch == '<' || ch == '>')
                isXmlEntity = true         
       
            if (!isXmlEntity && ch >= 32 && ch <= 126)
            {
                output.append(ch)
                println "in ascii ${ch}"
                continue
            }
       
            if (isXmlEntity)
                output.append('&')               
               
            switch(ch)
            {
                case '&': output.append('amp');break
                case '"': output.append('quot');break
                case '\'': output.append('apos');break
                case '<': output.append('lt');break
                case '>': output.append('gt');break 
                default:
                     println "in default"
                     output.append("&#")
                     output.append((int)ch)
                     output.append(';')                                                      
            }
           

            if (isXmlEntity)
                output.append(';')
          
        }

        def result = output.toString()
        println result
        return result
    }
}

--
Sven Haiges
sven.haiges@...

