[ https://issues.apache.org/jira/browse/LANG-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725504#action_12725504 ]
Henri Yandell commented on LANG-507:
------------------------------------
I noticed in Java that \uuuuuuuu0022 is legal.
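A small check of the observation above, as a sketch: the class and method names are made up for illustration, and 0041 ('A') stands in for 0022 so the escape cannot collide with string-literal quoting.

```java
public class MultipleUDemo {

    // The compiler accepts any number of 'u' markers before the four hex digits.
    static char repeatedMarker() {
        return '\uuuuuuuu0041';
    }

    // The usual single-marker form of the same escape.
    static char singleMarker() {
        return '\u0041';
    }

    public static void main(String[] args) {
        // Both literals denote the same character, so this prints "true".
        System.out.println(repeatedMarker() == singleMarker());
    }
}
```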
As an aside, I wonder if the \u+ notation is a misunderstanding of a regex describing that (one or more 'u' characters) :) Or vice versa.
I think this would be a very easy patch to text.translate.UnicodeEscaper if anyone wants to look at it. Perhaps a boolean parameter to the constructor, with the no-argument constructor defaulting to false.
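A rough, standalone sketch of what such a boolean constructor option could look like. This is not the Commons Lang code: the class name LenientUnicodeUnescaper and the parameter name allowPlus are hypothetical, and the sketch ignores escaped backslashes and surrogate pairs.

```java
public class LenientUnicodeUnescaper {

    // Hypothetical flag: when true, an optional '+' is accepted after the 'u' markers.
    private final boolean allowPlus;

    // The no-argument constructor keeps the strict behaviour.
    public LenientUnicodeUnescaper() {
        this(false);
    }

    public LenientUnicodeUnescaper(boolean allowPlus) {
        this.allowPlus = allowPlus;
    }

    public String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length() && input.charAt(i + 1) == 'u') {
                int j = i + 1;
                // Skip one or more 'u' markers, as Java source itself allows.
                while (j < input.length() && input.charAt(j) == 'u') {
                    j++;
                }
                // Optionally skip a single '+' before the hex digits.
                if (allowPlus && j < input.length() && input.charAt(j) == '+') {
                    j++;
                }
                int value = (j + 4 <= input.length()) ? parseHex4(input, j) : -1;
                if (value < 0) {
                    throw new IllegalArgumentException(
                            "Unable to parse unicode value at index " + i);
                }
                out.append((char) value);
                i = j + 4;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Reads exactly four hex digits starting at 'start'; returns -1 if any is not a hex digit.
    private static int parseHex4(String s, int start) {
        int value = 0;
        for (int k = start; k < start + 4; k++) {
            int digit = Character.digit(s.charAt(k), 16);
            if (digit < 0) {
                return -1;
            }
            value = value * 16 + digit;
        }
        return value;
    }
}
```

With the flag off, an input containing \u+0022 is rejected rather than mis-parsed as +002. With the flag on, the '+' is skipped and all four digits are read.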
> StringEscapeUtils.unescapeJava should support \u+ notation
> ----------------------------------------------------------
>
> Key: LANG-507
> URL: https://issues.apache.org/jira/browse/LANG-507
> Project: Commons Lang
> Issue Type: Improvement
> Affects Versions: 2.4
> Reporter: Gregor B. Rosenauer
> Priority: Trivial
> Fix For: 3.0
>
>
> Currently, when trying to unescape a String with Unicode escapes in the common notation, e.g., \u+0022, I get a NumberFormatException:
> org.apache.commons.lang.exception.NestableRuntimeException: Unable to parse unicode value: +002
> Note that the number is also parsed incorrectly, as it is shortened by one character (obviously, the parser gets confused by the '+' and only takes up to 4 characters, so it drops the last digit).
> I am aware that in Java, Unicode is escaped as "\u" followed by 4 hex digits that give the character's code in the Unicode map, but the \u+ notation is commonly used outside the Java world and it would be very handy if StringEscapeUtils supported that, at least as an option.
> Would you please consider adding this feature to 3.0?