Character / char oddness

Hi all,

I have to convert a small piece of Java to Groovy to include it in a command-line Groovy script. The part I am working on converts a String into XML, escaping the five XML entities, leaving the ASCII range >= 32 to <= 126 untouched and encoding the rest as numeric references (&#1234;). You can paste the code below right into groovyConsole to give it a try.

The assertion fails once I try to encode the ¼, which should become &#188; according to the old Java code. It is definitely not in the ASCII >= 32, <= 126 range, but I can still see that the println 'in ascii' is called, which means the code block

    if (!isXmlEntity && ch >= 32 && ch <= 126) {
        output.append(ch)
        println "in ascii ${ch}"
        continue
    }

is being executed. The question: why? This currently breaks my conversion... am I hitting a Groovy gotcha?

Cheers
Sven

    def test = [
        'this is a test' : 'this is a test',
        '<>\'"&' : '&lt;&gt;&apos;&quot;&amp;',
        '¼' : '&#188;',
        '©¼ÇÈÉÊËÐÑßàäç™' : '&#169;&#188;&#199;&#200;&#201;&#202;&#203;&#208;&#209;&#223;&#224;&#228;&#231;&#8482;' //output from original EntityCodec.XML
    ]

    test.each { input, expected ->
        assert (expected == XMLCodec.encode(input))
    }

    class XMLCodec {
        static encode = { original ->
            if (original == null) return null

            char[] originalChars = original.toCharArray()
            StringBuffer output = new StringBuffer()

            for (char ch: originalChars) {
                Character character = new Character(ch);
                def isXmlEntity = false

                if ( ch == '&' || ch == '"' || ch == '\'' || ch == '<' || ch == '>')
                    isXmlEntity = true

                if (!isXmlEntity && ch >= 32 && ch <= 126) {
                    output.append(ch)
                    println "in ascii ${ch}"
                    continue
                }

                if (isXmlEntity) output.append('&')

                switch(ch) {
                    case '&': output.append('amp');break
                    case '"': output.append('quot');break
                    case '\'': output.append('apos');break
                    case '<': output.append('lt');break
                    case '>': output.append('gt');break
                    default:
                        println "in default"
                        output.append("&#")
                        output.append((int)ch)
                        output.append(';')
                }

                if (isXmlEntity) output.append(';')
            }

            def result = output.toString();
            println result
            return result
        }
    }

--
Sven Haiges
sven.haiges@...
Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de
Subscribe to the Grails Podcast: http://feeds.grailspodcast.com/grailspodcast
http://www.grailspodcast.com

Re: Character / char oddness

Hi Sven

I cut-and-pasted your script into a new file (ANSI encoded), ran it under Groovy 1.5.7 / JDK 1.6.0_11, and it seemed to work fine:

    in ascii t
    in ascii h
    in ascii i
    in ascii s
    in ascii  
    in ascii i
    in ascii s
    in ascii  
    in ascii a
    in ascii  
    in ascii t
    in ascii e
    in ascii s
    in ascii t
    this is a test
    &lt;&gt;&apos;&quot;&amp;
    in default
    &#188;
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    in default
    &#169;&#188;&#199;&#200;&#201;&#202;&#203;&#208;&#209;&#223;&#224;&#228;&#231;&#8482;

Something about your source file encoding?

Jason

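One way to test the source-encoding theory is to dump the numeric value of every character the program actually receives and compare it with the same character written as a \u escape, which does not depend on how the file was saved. The following is only a minimal sketch in Java (the language the script was ported from); the class name CodePointDump and the method describe are made up for illustration and are not part of the script above.

```java
// Hypothetical helper, not part of the original script: shows which
// character values a string really contains after the source was read.
class CodePointDump {

    // Returns the chars of s as "U+XXXX" tokens separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (char ch : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) ch));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written as an escape, so the literal is immune to the file encoding.
        String escaped = "\u00BC"; // VULGAR FRACTION ONE QUARTER
        System.out.println(describe(escaped)); // prints "U+00BC"
    }
}
```

If a literal typed directly into the source file produces anything other than U+00BC here, the file was most likely decoded with a different charset than the one it was saved in.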
Re: Character / char oddness

Hi Jason,

I found the issue now: the default charset/encoding on Mac OS X is MacRoman. Although I saved the .groovy file as UTF-8, something seems to go wrong when it is read from the file system. Calling

    groovy --encoding utf-8 script.groovy

made it work. It's an annoying Mac OS specialty...

Thanks for looking into this!

Cheers
Sven

On Wed, Jul 1, 2009 at 6:31 PM, Jason Stell <jstell@...> wrote:
> Hi Sven
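
The general mechanism behind this kind of problem can be reproduced without a Mac. The sketch below is an illustration under stated assumptions, not a reconstruction of what happened on the machine in this thread: it uses ISO-8859-1 and US-ASCII as stand-ins because every JVM is guaranteed to ship them, whereas MacRoman may not be available. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: what happens to the 1/4 sign when text is
// written with one charset and read back with another.
class EncodingMismatchSketch {

    // Encodes text with writtenAs, then decodes the bytes with readAs.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    // Same range check as the "in ascii" branch of the script.
    static boolean isPrintableAscii(char ch) {
        return ch >= 32 && ch <= 126;
    }

    public static void main(String[] args) {
        String quarter = "\u00BC"; // escape keeps this file encoding-independent

        // UTF-8 bytes (0xC2 0xBC) read as a single-byte charset: two chars.
        String misread = reinterpret(quarter,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // prints 2

        // A charset that cannot represent the character substitutes '?',
        // which then passes the printable-ASCII check.
        String downgraded = reinterpret(quarter,
                StandardCharsets.US_ASCII, StandardCharsets.US_ASCII);
        System.out.println((int) downgraded.charAt(0));              // prints 63
        System.out.println(isPrintableAscii(downgraded.charAt(0)));  // prints true
    }
}
```

Either effect changes what the encoder sees before any of its own logic runs, which is why stating the encoding explicitly on the command line (or writing non-ASCII test data as \u escapes) removes the dependency on the platform default.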