Hangul/Jamo character, normalization and English collation

by isabelle.moulinier

Hello,
I assume it is expected that the collation keys (even using en_US as the locale) for the following terms are identical, since the sequence of Jamo characters and the Hangul character normalize to one another, except under FCD normalization:

P횝ALKY
P횝 ALKY
Would it be possible to generate a different collation key for each term in this example? I don't see any option with regard to normalization or to ignoring Hangul in the Collator. I am using ICU4J 3.2 at this point.

Thanks
Isabelle
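
The equivalence described in the question can be checked with the JDK's own `java.text.Normalizer`, without ICU4J. This is a minimal sketch, not code from the thread. It assumes the Jamo sequence in the second term is the canonical decomposition of the syllable U+D69D (U+1112 U+116C U+11B8). The class and constant names are illustrative.

```java
import java.text.Normalizer;

public class HangulEquivalence {
    // First term: precomposed Hangul syllable U+D69D between Latin letters.
    static final String PRECOMPOSED = "P\uD69DALKY";
    // Second term (assumed): the same syllable spelled as conjoining Jamo.
    static final String JAMO = "P\u1112\u116C\u11B8ALKY";

    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // The raw strings differ code unit by code unit.
        System.out.println(PRECOMPOSED.equals(JAMO));      // false
        // Composing the Jamo sequence yields the precomposed syllable.
        System.out.println(nfc(JAMO).equals(PRECOMPOSED)); // true
        // Decomposing the syllable yields the Jamo sequence.
        System.out.println(nfd(PRECOMPOSED).equals(JAMO)); // true
    }
}
```

Because the two strings are canonically equivalent, a collator that works on decomposed text sees the same input for both.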

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
icu-support mailing list - icu-support@...
To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-support

Re: Hangul/Jamo character, normalization and English collation

by Vladimir Weinstein

Can you write a tailoring that makes Jamos different from Hanguls? I haven't inspected the code, but I'm pretty sure that the default behavior is to always treat Hanguls as sequences of Jamos.

Regards,
v.
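
If a tailoring is not practical, one application-level alternative is to break ties outside the collator. The sketch below is not an ICU tailoring and is not taken from this thread. It wraps any `Comparator<String>` and falls back to raw UTF-16 code unit order when the primary comparison reports equality. The `CANONICAL` comparator is a hypothetical stand-in for a collator that treats canonically equivalent strings as equal. The Jamo code points are assumed to be the canonical decomposition of U+D69D.

```java
import java.text.Normalizer;
import java.util.Comparator;

public class TieBreakSketch {
    // Stand-in for a collator: compares NFD forms, so canonically
    // equivalent strings compare as equal.
    static final Comparator<String> CANONICAL = Comparator.comparing(
            (String s) -> Normalizer.normalize(s, Normalizer.Form.NFD));

    // Wraps a primary comparator; when it reports 0, the raw strings are
    // compared by UTF-16 code units (String.compareTo) to break the tie.
    static Comparator<String> withCodeUnitTieBreak(Comparator<String> primary) {
        return primary.thenComparing(Comparator.<String>naturalOrder());
    }

    public static void main(String[] args) {
        String precomposed = "P\uD69DALKY";
        String jamo = "P\u1112\u116C\u11B8ALKY";

        // The stand-in collator cannot tell the two terms apart.
        System.out.println(CANONICAL.compare(precomposed, jamo) == 0);
        // The wrapped comparator orders them deterministically.
        System.out.println(withCodeUnitTieBreak(CANONICAL).compare(precomposed, jamo) != 0);
    }
}
```

With sort keys instead of comparators, the analogous idea would be to append the raw string to the key. That is likewise an assumption about the application's design, not a Collator option.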

On Apr 24, 2007, at 9:55 AM, <isabelle.moulinier@...> <isabelle.moulinier@...> wrote:

Hello,
I assume it is expected that the collation keys (even using en_US as the locale) for the following terms are identical, since the sequence of Jamo characters and the Hangul character normalize to one another, except under FCD normalization:

P횝ALKY
P횝 ALKY
Would it be possible to generate a different collation key for each term in this example? I don't see any option with regard to normalization or to ignoring Hangul in the Collator. I am using ICU4J 3.2 at this point.

Thanks
Isabelle


Re: Hangul/Jamo character, normalization and English collation

by Mark Davis-2

The code always treats Hangul characters as a sequence of Jamos.

Isabelle, the normalization options only affect cases where reordering needs to take place, that is, where the text is not FCD (there is more information under http://google.com/search?q=fcd+normalization).

Mark
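
To illustrate what treating a Hangul character as a sequence of Jamos means, here is a sketch of the arithmetic decomposition defined by the Unicode Standard for precomposed syllables. It is not ICU's implementation. The class and method names are illustrative.

```java
public class HangulDecompose {
    // Constants from the Unicode Standard's Hangul syllable decomposition.
    static final int S_BASE = 0xAC00, L_BASE = 0x1100, V_BASE = 0x1161, T_BASE = 0x11A7;
    static final int L_COUNT = 19, V_COUNT = 21, T_COUNT = 28;
    static final int N_COUNT = V_COUNT * T_COUNT;  // 588
    static final int S_COUNT = L_COUNT * N_COUNT;  // 11172

    // Returns the conjoining Jamo for a precomposed syllable, or the
    // character unchanged if it is not a precomposed Hangul syllable.
    static String decompose(char c) {
        int sIndex = c - S_BASE;
        if (sIndex < 0 || sIndex >= S_COUNT) {
            return String.valueOf(c);
        }
        StringBuilder sb = new StringBuilder(3);
        sb.append((char) (L_BASE + sIndex / N_COUNT));             // leading consonant
        sb.append((char) (V_BASE + (sIndex % N_COUNT) / T_COUNT)); // vowel
        int tIndex = sIndex % T_COUNT;
        if (tIndex != 0) {
            sb.append((char) (T_BASE + tIndex));                   // optional trailing consonant
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+D69D decomposes to U+1112 U+116C U+11B8.
        for (char j : decompose('\uD69D').toCharArray()) {
            System.out.printf("U+%04X%n", (int) j);
        }
    }
}
```

For U+D69D the syllable index is 10909, which gives leading index 18, vowel index 11 and trailing index 17.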

On 5/3/07, weiv <weiv.icu@...> wrote:
Can you write a tailoring that makes Jamos different from Hanguls? I haven't inspected the code, but I'm pretty sure that the default behavior is to always treat Hanguls as sequences of Jamos.

Regards,
v.
