[Bug localedata/13096] New: fi_FI collation: [vwåäöþ] and [ÐÜ] are in wrong ranges


[Bug localedata/13096] New: fi_FI collation: [vwåäöþ] and [ÐÜ] are in wrong ranges

by Bugzilla from sourceware-bugzilla@sourceware.org

http://sourceware.org/bugzilla/show_bug.cgi?id=13096

             Bug #: 13096
           Summary: fi_FI collation: [vwåäöþ] and [ÐÜ] are in wrong ranges
           Product: glibc
           Version: 2.14
            Status: NEW
          Severity: minor
          Priority: P2
         Component: localedata
        AssignedTo: libc-locales@...
        ReportedBy: lauri.kentta@...
    Classification: Unclassified


Created attachment 5900
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5900
Proposed fix

In the Finnish locale (fi_FI), several lower-case letters (namely [vwåäöþ])
are collated among the upper-case letters. The converse is true for the
upper-case letters Ð and Ü. This causes unexpected results in grep, for example:

export LC_COLLATE=fi_FI.UTF-8
echo v | grep -E '[a-z]' # actual: empty, expected: "v"
echo v | grep -E '[A-Z]' # actual: "v", expected: empty
echo x | grep -E '[a-z]' # actual: "x", expected: "x"
echo x | grep -E '[A-Z]' # actual: empty, expected: empty

I'm aware that the locales don't guarantee much about character ranges, but
this behaviour is clearly illogical, serves no purpose and might break
somebody's scripts.
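
The same check can be repeated for any set of letters. The following is only a
sketch: range_report is a hypothetical helper name, not part of the attached
patch, and it assumes a POSIX shell, a grep that supports -E and -q, and that
the named locale is installed (otherwise the results are not meaningful).

```shell
#!/bin/sh
# Hypothetical helper: for each character given, report whether it matches
# the ranges [a-z] and [A-Z] under the locale passed as the first argument.
range_report() {
    loc=$1
    shift
    for ch in "$@"; do
        lower=no
        upper=no
        # LC_ALL overrides LC_COLLATE and LC_CTYPE for this grep call only.
        if printf '%s\n' "$ch" | LC_ALL="$loc" grep -Eq '[a-z]'; then
            lower=yes
        fi
        if printf '%s\n' "$ch" | LC_ALL="$loc" grep -Eq '[A-Z]'; then
            upper=yes
        fi
        printf '%s lower=%s upper=%s\n' "$ch" "$lower" "$upper"
    done
}
```

Running "range_report fi_FI.UTF-8 v w x" prints one line per letter, so a
letter that lands in the wrong range shows up as lower=no upper=yes.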

This was fixed in Debian years ago:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=441026

If I read their bug report correctly, this was correct in the past (glibc
2.3.6) and was probably broken by mistake.

Proposed fix attached.

--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug localedata/13096] fi_FI collation: [vwåäöþ] and [ÐÜ] are in wrong ranges

by Bugzilla from sourceware-bugzilla@sourceware.org

http://sourceware.org/bugzilla/show_bug.cgi?id=13096

Marko Myllynen <myllynen at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |myllynen at redhat dot com


[Bug localedata/13096] fi_FI collation: [vwåäöþ] and [ÐÜ] are in wrong ranges

by Bugzilla from sourceware-bugzilla@sourceware.org

http://sourceware.org/bugzilla/show_bug.cgi?id=13096

Ulrich Drepper <drepper.fsp at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |drepper.fsp at gmail dot
                   |                            |com
         Resolution|                            |FIXED

--- Comment #1 from Ulrich Drepper <drepper.fsp at gmail dot com> 2011-12-23 02:24:39 UTC ---
I added the patch.
