CLDR import for src/share/*def definitions

by Edwin Groothuis

I have been playing with the CLDR database to see if I can get the
monetary, time, messages and numerical definitions right. The CLDR
data is in UTF-8; I use iconv to convert it to other character sets.

So far most of it is fine, except (subset of issues):

- A couple of languages are not known (es_FR, es_IT)

- A couple of languages have a different abbreviation
        no_NO   -> nb_NO nn_NO
        *_YU    -> *_RS

- A couple of character sets are not known to iconv:
        (CP1131 ISCII-DECV)

- A couple of translations went wrong:
        Writing to fi_FI in ISO8859-1
        Could not convert currency_symbol from UTF-8 to ISO8859-1

- It is not clear what the difference between "Long month names (as
  in a date)" and "Long month names (without case ending)" is. (could
  be my language problem :-)

The biggest problem so far is not a technical one: Which data is more
authoritative, the data in the CLDR database or the data we have
collected over the years from various sources and people?

Another problem I'm facing is that there is little documentation
on the format of the *def/ files. It is mostly a UTSL ("use the
source, Luke") approach in lib/libc/locale, but the source doesn't
tell me whether I can safely replace (for example in uk_UA)
     # yesstr
    -<E2><D0><DA>
    +<E2><D0><DA>:<E2>:<C2><B0><BA>:<C2>:yes:y:YES:Y
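To make the proposed yesstr change easier to read, here is a minimal Python
sketch that decodes such a line into its colon-separated alternatives. It
assumes two things the thread does not state: that the <XX> notation stands
for single raw bytes, and that this uk_UA file is encoded in ISO8859-5. The
helper name decode_angle_hex is made up for this illustration.

```python
import re

def decode_angle_hex(text, encoding):
    """Split a yesstr-style line on ':' and decode each alternative.

    '<XX>' is treated as one raw byte (an assumption about the notation);
    everything else is taken as plain ASCII.
    """
    def to_bytes(field):
        out = bytearray()
        pos = 0
        for match in re.finditer(r"<([0-9A-Fa-f]{2})>", field):
            out += field[pos:match.start()].encode("ascii")
            out.append(int(match.group(1), 16))
            pos = match.end()
        out += field[pos:].encode("ascii")
        return bytes(out)

    return [to_bytes(field).decode(encoding) for field in text.split(":")]

# The encoding is an assumption; the thread only says "uk_UA".
old = decode_angle_hex("<E2><D0><DA>", "iso8859-5")
new = decode_angle_hex("<E2><D0><DA>:<E2>:<C2><B0><BA>:<C2>:yes:y:YES:Y",
                       "iso8859-5")
print(old)
print(new)
```

Under those assumptions the old value is a single word and the new value is
a list of accepted spellings, which is the difference the question is about.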

So euhm... Is there anybody who wants to give their opinion or
wisdom about things, please speak up, I need it :-)

Edwin

--
Edwin Groothuis Website: http://www.mavetju.org/
edwin@... Weblog:  http://www.mavetju.org/weblog/
_______________________________________________
freebsd-i18n@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-i18n
To unsubscribe, send any mail to "freebsd-i18n-unsubscribe@..."

Re: CLDR import for src/share/*def definitions

by Wolfgang Zenker-2

Hi,

* Edwin Groothuis <edwin@...> [090702 00:37]:
> I have been playing with the CLDR database to see if I can get the
> monetary, time, messages and numerical definitions right. The CLDR
> is in UTF-8, I use iconv to translate to other charactersets.

> So far most of it is fine, except (subset of issues):

> - A couple of languages are not known (es_FR, es_IT)

What do you mean by "not known"? Both locales specify a Spanish-language
locale, one for use in France and one in Italy. There might even be no
language constructs that differ from es_ES, just different ways to format
dates or something like that.

> - A couple of languages have a different abbrevation
>         no_NO   -> nb_NO nn_NO
>         *_YU    -> *_RS

You'd best ask a Norwegian about the locales to use for Norway. For the
second line I'd go with *_RS, as the former Yugoslavia has split into,
among others, the Republic of Serbia (which I assume the code RS is
supposed to mean).
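As a rough illustration of the two renames quoted above, the following Python
sketch maps an old locale name to its CLDR-style name or names. The table
contents come from the thread; the function name cldr_names, the pass-through
fallback, and the sample input sr_YU are assumptions. It only handles bare
language_TERRITORY names without an encoding suffix.

```python
# Renames mentioned in the thread.
LOCALE_RENAMES = {"no_NO": ["nb_NO", "nn_NO"]}
TERRITORY_RENAMES = {"YU": "RS"}

def cldr_names(locale):
    """Return the CLDR-style name(s) for an older locale name."""
    if locale in LOCALE_RENAMES:
        return list(LOCALE_RENAMES[locale])
    language, sep, territory = locale.partition("_")
    if sep and territory in TERRITORY_RENAMES:
        return [language + "_" + TERRITORY_RENAMES[territory]]
    # Assumption: names without a known rename are passed through unchanged.
    return [locale]

print(cldr_names("no_NO"))
print(cldr_names("sr_YU"))  # sr_YU is a hypothetical example input
```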

> - A couple of charactersets are not known to iconv:
>         (CP1131 ISCII-DECV)

Never heard of them.

> - A couple of translations went wrong:
>         Writing to fi_FI in ISO8859-1
>         Could not convert currency_symbol from UTF-8 to ISO8859-1

That's because Finland uses the Euro and the Euro sign does not exist
in ISO8859-1; you have to use ISO8859-15 if you want the € currency
symbol in an ISO8859-* charset.
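The same failure can be reproduced outside iconv. This small Python sketch
uses Python's built-in codecs rather than iconv, so it only illustrates the
character-set point and is not the conversion script under discussion; the
helper name try_convert is made up.

```python
def try_convert(text, encoding):
    """Return the encoded bytes, or None if the text does not fit the encoding."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

euro = "\u20ac"  # EURO SIGN
print(try_convert(euro, "iso8859-1"))   # the Euro sign has no ISO8859-1 code point
print(try_convert(euro, "iso8859-15"))  # ISO8859-15 does include it
```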

> - It is not clear what the difference between "Long month names (as
>   in a date)" and "Long month names (without case ending)" is. (could
>   be my language problem :-)

I don't know either; could you give an example where the two are different?
Preferably in a language where we find some speakers here :-)
I do speak English, German, Latin and some Arabic, if that is of any help.

> The biggest problem so far is not a technical: WHich data is more
> authoritative - The one in the CLDR database or the one we have
> collected over the years from various sources and people?

I _think_ the CLDR has been relying on people coming forward with
information the same way that we have, so I consider neither _the_
authoritative source. Best to just list conflicting entries and ask
around for locals on the list who could help resolve conflicts.
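Listing the conflicts could be as simple as the Python sketch below. It
assumes both data sets have already been parsed into per-locale dictionaries;
the function name and the sample values are invented for illustration and are
not taken from either source.

```python
def conflicting_entries(ours, cldr):
    """Return {key: (our_value, cldr_value)} for keys whose values differ."""
    shared = sorted(ours.keys() & cldr.keys())
    return {key: (ours[key], cldr[key]) for key in shared
            if ours[key] != cldr[key]}

# Hypothetical sample values, for illustration only.
ours = {"decimal_point": ",", "thousands_sep": "."}
cldr = {"decimal_point": ",", "thousands_sep": " "}

for key, (old, new) in conflicting_entries(ours, cldr).items():
    print(f"{key}: ours={old!r} cldr={new!r}")
```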

> Another problem I'm facing is that there is little documentation
> on what the format of the *def/ files is, it is mostly a UTSL
> approach in lib/libc/locale, but that doesn't show me neither if I
> can safely replace (for example in uk_UA)
>      # yesstr
>     -<E2><D0><DA>
>     +<E2><D0><DA>:<E2>:<C2><B0><BA>:<C2>:yes:y:YES:Y

Sorry, no clue, can't help you here.

> So euhm... Is there anybody who wants to give their opinion or
> wisdom about things, please speak up, I need it :-)

I don't know if it was of any help, but here you got my 2¢

Wolfgang

Re: CLDR import for src/share/*def definitions

by J. Vicente Carrasco -Bixen-

Wolfgang Zenker wrote:

> Hi,
>
> * Edwin Groothuis <edwin@...> [090702 00:37]:
>> I have been playing with the CLDR database to see if I can get the
>> monetary, time, messages and numerical definitions right. The CLDR
>> is in UTF-8, I use iconv to translate to other charactersets.
>
>> So far most of it is fine, except (subset of issues):
>
>> - A couple of languages are not known (es_FR, es_IT)
>
> what do you mean by "not known"? Both locales specify a spanish language
> locale, one for use in  France and one in Italy. There might even be no
> different language contructs from es_ES, just different ways to format
> dates ore something like that.
>

Hello:

Maybe I'm missing something, but as a native Spanish speaker I can't
understand why we (the Spanish-speaking community) would need locales such
as es_FR or es_IT, or even why they would be necessary for the French- and
Italian-speaking world. It is something like en_ES, no_IT or de_RU. Why the
heck is that? ;-)


Best regards.




--
===================================================
        J. Vicente Carrasco -- Bixen
   carvay at [tikismikis.org | es.FreeBSD.org]
Current Basque Beret: Spanish FDP Translationmeister
====================================================

Re: CLDR import for src/share/*def definitions

by Wolfgang Zenker-2

Hi,

* J. Vicente Carrasco -Bixen-  <carvay@...>[090706 20:42]:
> Wolfgang Zenker(e)k dio:
>> * Edwin Groothuis <edwin@...> [090702 00:37]:
>>> I have been playing with the CLDR database to see if I can get the
>>> monetary, time, messages and numerical definitions right. The CLDR
>>> is in UTF-8, I use iconv to translate to other charactersets.

>>> So far most of it is fine, except (subset of issues):

>>>- A couple of languages are not known (es_FR, es_IT)

>> what do you mean by "not known"? Both locales specify a spanish language
>> locale, one for use in  France and one in Italy. There might even be no
>> different language contructs from es_ES, just different ways to format
>> dates ore something like that.

> Maybe I'm missing something, but as a Spanish native speaker I can't
> understand why we (the Spanish-speaking community) could need locales as
> es_FR or es_IT or even why would be necessary for the French and Italian
> speaking world. Is something like en_ES, no_IT or de_RU. Why the heck is
> that? ;-)

As I understand it, those would be locales for use by Spanish-speaking
communities living in France and Italy, respectively. So someone who
uses one of these locales gets e.g. system messages in Spanish but
dates formatted according to the customs in France or something like
that.

Wolfgang

Re: CLDR import for src/share/*def definitions

by J. Vicente Carrasco -Bixen-

Wolfgang Zenker wrote:

> Hi,
>
> * J. Vicente Carrasco -Bixen-  <carvay@...>[090706 20:42]:
>> Wolfgang Zenker(e)k dio:
>>> * Edwin Groothuis <edwin@...> [090702 00:37]:
>>>> I have been playing with the CLDR database to see if I can get the
>>>> monetary, time, messages and numerical definitions right. The CLDR
>>>> is in UTF-8, I use iconv to translate to other charactersets.
>
>>>> So far most of it is fine, except (subset of issues):
>
>>>> - A couple of languages are not known (es_FR, es_IT)
>
>>> what do you mean by "not known"? Both locales specify a spanish language
>>> locale, one for use in  France and one in Italy. There might even be no
>>> different language contructs from es_ES, just different ways to format
>>> dates ore something like that.
>
>> Maybe I'm missing something, but as a Spanish native speaker I can't
>> understand why we (the Spanish-speaking community) could need locales as
>> es_FR or es_IT or even why would be necessary for the French and Italian
>> speaking world. Is something like en_ES, no_IT or de_RU. Why the heck is
>> that? ;-)
>
> as I understand it, that would be locales for use by spanish speaking
> communities living in France and Italy, respectively. So someone who
> uses one of these locales gets e.g. system messages in spanish but
> dates formatted according to the customs in France or something like
> that.
>


Spanish-speaking communities living in France and Italy... and using
FreeBSD. Suddenly my pet project of Basque localization (eu_ES and
maybe eu_FR) sounds more and more interesting ;-)




--
===================================================
        J. Vicente Carrasco -- Bixen
   carvay at [tikismikis.org | es.FreeBSD.org]
Current Basque Beret: Spanish FDP Translationmeister
====================================================

Re: CLDR import for src/share/*def definitions

by Wolfgang Zenker-2

Hi,

* J. Vicente Carrasco -Bixen-  <carvay@...> [090706 20:58]:
> Wolfgang Zenker(e)k dio:

>>* J. Vicente Carrasco -Bixen-  <carvay@...>[090706 20:42]:
>>>Wolfgang Zenker(e)k dio:
>>>>* Edwin Groothuis <edwin@...> [090702 00:37]:
>>>>>I have been playing with the CLDR database to see if I can get the
>>>>>monetary, time, messages and numerical definitions right. The CLDR
>>>>>is in UTF-8, I use iconv to translate to other charactersets.

>>>>>So far most of it is fine, except (subset of issues):

>>>>>- A couple of languages are not known (es_FR, es_IT)

>>>>what do you mean by "not known"? Both locales specify a spanish language
>>>>locale, one for use in  France and one in Italy. There might even be no
>>>>different language contructs from es_ES, just different ways to format
>>>>dates ore something like that.

>>>Maybe I'm missing something, but as a Spanish native speaker I can't
>>>understand why we (the Spanish-speaking community) could need locales as
>>>es_FR or es_IT or even why would be necessary for the French and Italian
>>>speaking world. Is something like en_ES, no_IT or de_RU. Why the heck is
>>>that? ;-)

>>as I understand it, that would be locales for use by spanish speaking
>>communities living in France and Italy, respectively. So someone who
>>uses one of these locales gets e.g. system messages in spanish but
>>dates formatted according to the customs in France or something like
>>that.

> Spanish speaking communities living in France and Italy... and using
> FreeBSD. Suddendly my pet project of Basque localization (eu_ES and
> maybe eu_FR) sounds more and more interesting ;-)

Sounds cool, go ahead!

Wolfgang

Re: CLDR import for src/share/*def definitions

by Christian Weisgerber

Edwin Groothuis <edwin@...> wrote:

> - It is not clear what the difference between "Long month names (as
>   in a date)" and "Long month names (without case ending)" is. (could
>   be my language problem :-)

Take Polish for example.  The month of July is "lipiec", but today's
date is "7 lipca"--literally "7 of July", with the month name in
genitive case.  You need different forms when referring to the plain
month and to the month as part of a date.
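A minimal Python sketch of what that means for the data: each month needs
two strings, and the formatting code picks one depending on context. Only
July is filled in, using the forms given above; the table layout and the
function names are assumptions made for this illustration, not the *def
file format.

```python
# Two forms per month: the plain name and the form used inside a date.
MONTHS_PL = {7: {"standalone": "lipiec", "in_date": "lipca"}}

def month_name(month, in_date):
    """Pick the genitive form inside a date, the plain form otherwise."""
    forms = MONTHS_PL[month]
    return forms["in_date"] if in_date else forms["standalone"]

def format_date(day, month):
    """Format a day and month the way a date would be written."""
    return f"{day} {month_name(month, in_date=True)}"

print(month_name(7, in_date=False))
print(format_date(7, 7))
```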

--
Christian "naddy" Weisgerber                          naddy@...
