l10n architecture proposal available for comments

by Axel Hecht

Hi,

I have posted a short intro to an l10n architecture proposal to my blog,
please give it a read at http://blog.mozilla.com/axel/2007/02/01/meet-pete/.

There's more info on http://wiki.mozilla.org/L20n, with examples on
http://wiki.mozilla.org/L20n:Examples.

Please send your feedback to me, to the newsgroups/mailing lists, or to
the wiki. If you choose the newsgroup/mailing list, please try to make
it end up on .i18n. Thanks.

Looking forward to the comments.

Axel
_______________________________________________
dev-i18n mailing list
dev-i18n@...
https://lists.mozilla.org/listinfo/dev-i18n

Re: l10n architecture proposal available for comments

by Gervase Markham

Axel Hecht wrote:
> I have posted a short intro to an l10n architecture proposal to my blog,
> please give it a read at
> http://blog.mozilla.com/axel/2007/02/01/meet-pete/.

Axel: having read your proposal and looked at the code examples, I don't
quite see what this is _for_. It looks like a way of dynamically
localising web pages client-side, thereby making the client download the
text of every localisation! Which doesn't seem very useful to me, or
relevant to the Mozilla project.

What have I missed?

Gerv

Re: l10n architecture proposal available for comments

by Axel Hecht

Gervase Markham wrote:

> Axel Hecht wrote:
>> I have posted a short intro to an l10n architecture proposal to my blog,
>> please give it a read at
>> http://blog.mozilla.com/axel/2007/02/01/meet-pete/.
>
> Axel: having read your proposal and looked at the code examples, I don't
> quite see what this is _for_. It looks like a way of dynamically
> localising web pages client-side, thereby making the client download the
> text of every localisation! Which doesn't seem very useful to me, or
> relevant to the Mozilla project.
>
> What have I missed?

You missed that I'm using web apps for demos, mostly because that way I
don't have to ship an extension, custom builds, or anything else for people
to look at how common problems can be solved. Neither the problems nor the
proposed solutions differ between a web app and an installed local
application. The target applications do include the Mozilla
applications, and a good deal of the requirements come from our
environment. But limiting the architecture to our platform would waste
momentum. Not that I intend to own 15 implementations of this, surely not.

For the architecture, it doesn't really matter whether you load the
locale files via http on demand, or use some php to process that server
side, or pull it via the chrome protocol from a statically installed
jar. Using web apps just chooses a platform for the prototypes that most
people are happy to read.

That said, I expect web apps to pick up this technology first, as they
have the least legacy infrastructure, or at least have enough new
projects taking off every week.

Moving all of the Mozilla apps over will require that we have solved the
problems in localizing first, and then find out how to map the
architecture into an implementation that works for us and plays nicely with
things like the XUL cache and all the other bits.

Reading over the bits again, the target is in the very first paragraph
on http://wiki.mozilla.org/L20n.

Axel

Re: l10n architecture proposal available for comments

by Rimas Kudelis-2

Axel Hecht wrote:

> Hi,
>
> I have posted a short intro to an l10n architecture proposal to my blog,
> please give it a read at
> http://blog.mozilla.com/axel/2007/02/01/meet-pete/.
>
> There's more info on http://wiki.mozilla.org/L20n, with examples on
> http://wiki.mozilla.org/L20n:Examples.
>
> Please redirect your feedback to either me, or the newsgroups/mailing
> lists or the wiki. If you choose the newsgroup/mailing list, try to make
> it end up on .i18n? Thanks.
>
> Looking forward to the comments.
>
> Axel

Do you intend to support Grammatical cases [1] in any way? If yes,
then how do you see it?..

[1] http://en.wikipedia.org/wiki/Grammatical_case

RQ

Re: l10n architecture proposal available for comments

by Axel Hecht

Rimas Kudelis wrote:

> Axel Hecht wrote:
>> Hi,
>>
>> I have posted a short intro to an l10n architecture proposal to my blog,
>> please give it a read at
>> http://blog.mozilla.com/axel/2007/02/01/meet-pete/.
>>
>> There's more info on http://wiki.mozilla.org/L20n, with examples on
>> http://wiki.mozilla.org/L20n:Examples.
>>
>> Please redirect your feedback to either me, or the newsgroups/mailing
>> lists or the wiki. If you choose the newsgroup/mailing list, try to make
>> it end up on .i18n? Thanks.
>>
>> Looking forward to the comments.
>>
>> Axel
>
> Do you intend to support Grammatical cases [1] in any way? If yes,
> then how do you see it?..
>
> [1] http://en.wikipedia.org/wiki/Grammatical_case
>
> RQ

Yes, it works pretty much like the gender example in
http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.

If you create a sample output similar to one of the js examples I did, I
can create a corresponding lol file to demo that.

Sadly, neither German nor English really show this off canonically, so I
didn't do an example for that yet.

Axel

Re: l10n architecture proposal available for comments

by Igor Tandetnik

Axel Hecht <l10n@...> wrote:
> Rimas Kudelis wrote:
>> Do you intend to support Grammatical cases [1] in any way? If yes,
>> then how do you see it?..
>
> Yes, it works pretty much like the gender example in
> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.

The example seems to show how to change the surrounding sentence to
match the gender of a replacement word. The replacement determines the
gender, and the sentence has to adapt.

Case works the other way round: the structure of the sentence determines
the case a replacement word must take, and the spelling of the word
changes depending on the case it needs to take to fit into a given sentence.

Often both grammatical categories (three actually - there's also number)
work together and require changing both the sentence and replacement
word for everything to agree.

Consider also this situation in Russian:

1 file = 1 файл
4 files = 4 файла
5 files = 5 файлов
21 file = 21 файл

The word "файл" (file) is in nominative singular in the first line,
genitive singular in the second line (yes, singular, even though there
are four of them), genitive plural in the third line, and nominative
singular again in the last line.

Igor Tandetnik



Re: l10n architecture proposal available for comments

by Axel Hecht

Igor Tandetnik wrote:

> Axel Hecht <l10n@...> wrote:
>> Rimas Kudelis wrote:
>>> Do you intend to support Grammatical cases [1] in any way? If yes,
>>> then how do you see it?..
>> Yes, it works pretty much like the gender example in
>> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.
>
> The example seems to show how to change the surrounding sentence to
> match the gender of a replacement word. The replacement determines the
> gender, and the sentence has to adapt.
>
> Case works the other way round: the structure of the sentence determines
> the case a replacement word must take, and the spelling of the word
> changes depending on the case it needs to take to fit into a given sentence.
>
> Often both grammatical categories (three actually - there's also number)
> work together and require changing both the sentence and replacement
> word for everything to agree.
>
> Consider also this situation in Russian:
>
> 1 file = 1 файл
> 4 files = 4 файла
> 5 files = 5 файлов
> 21 file = 21 файл
>
> The word "файл" (file) is in nominative singular in the first line,
> genitive singular in the second line (yes, singular, even though there
> are four of them), genitive plural in the third line, and nominative
> singular again in the last line.
>

That sounds like the standard plural forms as supported by gettext,
transforming the .2 notation in POs to real arrays and using an explicit
indexing.

The file would look similar to this:

<flockOfFiles[plural(N)]: ["${N}i файл", "${N}i файла", "${N}i файлов"]>

In more complex forms, you could make this a hash of arrays or an array
of hashes.
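
To make the explicit indexing concrete, here is a minimal Python sketch. It assumes the three-form plural rule that gettext catalogs commonly use for Russian; `plural`, `FLOCK_OF_FILES` and `flock_of_files` are hypothetical names for illustration only and are not part of L20n.

```python
def plural(n: int) -> int:
    """Return the plural-form index for Russian (three forms).

    Assumption: this is the rule commonly used in gettext catalogs for
    Russian; it is an illustration, not L20n's actual expression engine.
    """
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... (but not 11)
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ... (but not 12-14)
    return 2      # 0, 5-20, 25-30, ...


# Hypothetical stand-in for the flockOfFiles entity: one string per form.
FLOCK_OF_FILES = ["{n} файл", "{n} файла", "{n} файлов"]


def flock_of_files(n: int) -> str:
    """Pick the form by explicit index, then substitute the count."""
    return FLOCK_OF_FILES[plural(n)].format(n=n)
```

The point of the design is that the localizer supplies both the array of forms and the index rule, so the application code only passes in the number.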

The current examples don't demo that, because I haven't implemented the
expression part just yet. A hacked-up version of this, using JS's eval, is
at http://people.mozilla.com/~axel/scratchpad/; it uses a not-so-good
but adapted file format, yielding:
<shopping:
    ['Please select number of items',
     'No items ordered',
     'You have ordered one item.',
     'You have ordered $(itemCount)d items.'
     ][plural(itemCount)]
 >

Axel

Re: l10n architecture proposal available for comments

by Rimas Kudelis-2

Axel Hecht wrote:

> Rimas Kudelis wrote:
>> Do you intend to support Grammatical cases [1] in any way? If yes,
>> then how do you see it?..
>>
>> [1] http://en.wikipedia.org/wiki/Grammatical_case
>>
>> RQ
>
> Yes, it works pretty much like the gender example in
> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.
>
> If you create a sample output similar to one of the js examples I did, I
> can create a corresponding lol file to demo that.
>
> Sadly, neither German nor English really show this off canonically, so I
> didn't do an example for that yet.


OK, here's an example. From Microsoft Windows XP. Every user has a "My
Documents" folder there. Now in English, you can see "Jane Doe's
Documents" for user Jane Doe, "Jack Daniels' Documents" for user Jack
Daniels etc.

Meanwhile in Lithuanian, to be grammatically correct, it cannot be
"Rimas Kudelis' dokumentai" or "Rimas' Kudelis' dokumentai" or "Rimas
Kudelis dokumentai" or anything like that. It has to be "Rimo Kudelio
dokumentai".

And your gender example would sound like this:

Gerb. p. Kudeli,

šis tekstas lokalizuotas

Jūs užsisakėte vieną prekę.

Name: Kudelis
gender: male
Items: 1
Language: Lietuvių

Now, do you think such DYNAMIC casing would be possible?

RQ

Re: l10n architecture proposal available for comments

by Axel Hecht

Rimas Kudelis wrote:

> Axel Hecht wrote:
>> Rimas Kudelis wrote:
>>> Do you intend to support Grammatical cases [1] in any way? If yes,
>>> then how do you see it?..
>>>
>>> [1] http://en.wikipedia.org/wiki/Grammatical_case
>>>
>>> RQ
>> Yes, it works pretty much like the gender example in
>> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.
>>
>> If you create a sample output similar to one of the js examples I did, I
>> can create a corresponding lol file to demo that.
>>
>> Sadly, neither German nor English really show this off canonically, so I
>> didn't do an example for that yet.
>
>
> OK, here's an example. From Microsoft Windows XP. Every user has a "My
> Documents" folder there. Now in English, you can see "Jane Doe's
> Documents" for user Jane Doe, "Jack Daniels' Documents" for user Jack
> Daniels etc.
>
> Meanwhile in Lithuanian, to be grammatically correct, it cannot be
> "Rimas Kudelis' dokumentai" or "Rimas' Kudelis' dokumentai" or "Rimas
> Kudelis dokumentai" or anything like that. It has to be "Rimo Kudelio
> dokumentai".
>
> And your gender example would sound like this:
>
> Gerb. p. Kudeli,
>
> šis tekstas lokalizuotas
>
> Jūs užsisakėte vieną prekę.
>
> Name: Kudelis
> gender: male
> Items: 1
> Language: Lietuvių
>
> Now, do you think such DYNAMIC casing would be possible?
>
> RQ

Probably still somewhat simplified, but here it goes:

<owner: "Rimas Kudelis"
  othercase: "Rimo Kudelio"
  gender: "male">

<myDocuments: "${owner.othercase} dokumentai">

Of course, you could have myDocuments depend on the gender of owner, for
example.
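
As a rough illustration of how a reference like ${owner.othercase} could be expanded, here is a small Python sketch. The dictionary layout, the `PLACEHOLDER` pattern and the `resolve` function are assumptions made for this example; they are not the js-l20n code.

```python
import re

# Hypothetical in-memory form of the two entities above: each entity has a
# main value plus optional named attributes.
ENTITIES = {
    "owner": {
        "value": "Rimas Kudelis",
        "attrs": {"othercase": "Rimo Kudelio", "gender": "male"},
    },
    "myDocuments": {"value": "${owner.othercase} dokumentai", "attrs": {}},
}

# Matches ${id} or ${id.attr}.
PLACEHOLDER = re.compile(r"\$\{(\w+)(?:\.(\w+))?\}")


def resolve(entity_id: str) -> str:
    """Expand every ${id} / ${id.attr} reference in an entity's value."""
    def expand(match):
        ref_id, attr = match.group(1), match.group(2)
        if attr is None:
            # Plain ${id}: recurse into the referenced entity's value.
            return resolve(ref_id)
        # ${id.attr}: take the named attribute of the referenced entity.
        return ENTITIES[ref_id]["attrs"][attr]

    return PLACEHOLDER.sub(expand, ENTITIES[entity_id]["value"])
```

With the data above, resolve("myDocuments") gives "Rimo Kudelio dokumentai": the sentence picks the form it needs, and the owner entity supplies it.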

The tricky part comes when you actually realize that you want to get
"owner" from the user settings, at which point you probably have to bite
the bullet, as I doubt that a user likes the idea of entering his name
in all genders, nor do I expect that there's a decent machine logic to
guess the grammar for a name.

An additional caveat: you don't really know whether the name of the owner
is actually a Lithuanian name or not, which might impact it, too. Like,
what would happen to "Axel Hecht" in this case?

It simply shows that "Axel's Documents" is an i18n bug. :-)

Axel

Re: l10n architecture proposal available for comments

by Rimas Kudelis-2

Axel Hecht wrote:

> Rimas Kudelis wrote:
>> Axel Hecht wrote:
>>> Rimas Kudelis wrote:
>>>> Do you intend to support Grammatical cases [1] in any way? If yes,
>>>> then how do you see it?..
>>>>
>>>> [1] http://en.wikipedia.org/wiki/Grammatical_case
>>>>
>>>> RQ
>>> Yes, it works pretty much like the gender example in
>>> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.
>>>
>>> If you create a sample output similar to one of the js examples I did, I
>>> can create a corresponding lol file to demo that.
>>>
>>> Sadly, neither German nor English really show this off canonically, so I
>>> didn't do an example for that yet.
>>
>>
>> OK, here's an example. From Microsoft Windows XP. Every user has a "My
>> Documents" folder there. Now in English, you can see "Jane Doe's
>> Documents" for user Jane Doe, "Jack Daniels' Documents" for user Jack
>> Daniels etc.
>>
>> Meanwhile in Lithuanian, to be grammatically correct, it cannot be
>> "Rimas Kudelis' dokumentai" or "Rimas' Kudelis' dokumentai" or "Rimas
>> Kudelis dokumentai" or anything like that. It has to be "Rimo Kudelio
>> dokumentai".
>>
>> And your gender example would sound like this:
>>
>> Gerb. p. Kudeli,
>>
>> šis tekstas lokalizuotas
>>
>> Jūs užsisakėte vieną prekę.
>>
>> Name: Kudelis
>> gender: male
>> Items: 1
>> Language: Lietuvių
>>
>> Now, do you think such DYNAMIC casing would be possible?
>>
>> RQ
>
> Probably still somewhat simplified, but here it goes:
>
> <owner: "Rimas Kudelis"
>  othercase: "Rimo Kudelio"
>  gender: "male">
>
> <myDocuments: "${owner.othercase} dokumentai">
>
> Of course, you could have myDocuments depend on the gender of owner, for
> example.
>
> The tricky part comes when you actually realize that you want to get
> "owner" from the user settings, at which point you probably bite a
> bullet here, as I doubt that a user likes the idea of entering his name
> in all genders, nor do I expect that there's a decent machine logic to
> guess the grammar for a name.

You're right. But we're discussing L10n 2.0 here, so why not dream a
little? ;)

> Additional caveat, you don't really know whether the name of owner is
> actually a Lithuanian name or not, which might impact it, too.

I think every language has certain rules for how to form cases. Ideally,
our machine logic should at least work with the names of the context
language, i.e., Lithuanian rules should at least work with
Lithuanian names (and they could apply some general casing, or no
casing, for names that don't grammatically fit into Lithuanian).

However, I think this also requires that we can specify some exceptions...

> Like, what would happen to "Axel Hecht" in this case?

This could be "Axelio Hechto dokumentai", for example. OR "Axel Hecht
dokumentai".

> Simply shows that "Axel's Documents" is an i18n bug. :-)

eh?

RQ

Re: l10n architecture proposal available for comments

by Axel Hecht

Rimas Kudelis wrote:

> Axel Hecht wrote:
>> Rimas Kudelis wrote:
>>> Axel Hecht wrote:
>>>> Rimas Kudelis wrote:
>>>>> Do you intend to support Grammatical cases [1] in any way? If yes,
>>>>> then how do you see it?..
>>>>>
>>>>> [1] http://en.wikipedia.org/wiki/Grammatical_case
>>>>>
>>>>> RQ
>>>> Yes, it works pretty much like the gender example in
>>>> http://people.mozilla.com/~axel/l20n/js-l20n/sample-01.html.
>>>>
>>>> If you create a sample output similar to one of the js examples I did, I
>>>> can create a corresponding lol file to demo that.
>>>>
>>>> Sadly, neither German nor English really show this off canonically, so I
>>>> didn't do an example for that yet.
>>>
>>> OK, here's an example. From Microsoft Windows XP. Every user has a "My
>>> Documents" folder there. Now in English, you can see "Jane Doe's
>>> Documents" for user Jane Doe, "Jack Daniels' Documents" for user Jack
>>> Daniels etc.
>>>
>>> Meanwhile in Lithuanian, to be grammatically correct, it cannot be
>>> "Rimas Kudelis' dokumentai" or "Rimas' Kudelis' dokumentai" or "Rimas
>>> Kudelis dokumentai" or anything like that. It has to be "Rimo Kudelio
>>> dokumentai".
>>>
>>> And your gender example would sound like this:
>>>
>>> Gerb. p. Kudeli,
>>>
>>> šis tekstas lokalizuotas
>>>
>>> Jūs užsisakėte vieną prekę.
>>>
>>> Name: Kudelis
>>> gender: male
>>> Items: 1
>>> Language: Lietuvių
>>>
>>> Now, do you think such DYNAMIC casing would be possible?
>>>
>>> RQ
>> Probably still somewhat simplified, but here it goes:
>>
>> <owner: "Rimas Kudelis"
>>  othercase: "Rimo Kudelio"
>>  gender: "male">
>>
>> <myDocuments: "${owner.othercase} dokumentai">
>>
>> Of course, you could have myDocuments depend on the gender of owner, for
>> example.
>>
>> The tricky part comes when you actually realize that you want to get
>> "owner" from the user settings, at which point you probably bite a
>> bullet here, as I doubt that a user likes the idea of entering his name
>> in all genders, nor do I expect that there's a decent machine logic to
>> guess the grammar for a name.
>
> You're right. But we're discussing L10n 2.0 here, so why not dream a
> little? ;)
>
>> Additional caveat, you don't really know whether the name of owner is
>> actually a Lithuanian name or not, which might impact it, too.
>
> I think every language has certain rules for how to form cases. Ideally,
> our machine logic should at least work with the names of the context
> language, i.e., Lithuanian rules should at least work with
> Lithuanian names (and they could apply some general casing, or no
> casing, for names that don't grammatically fit into Lithuanian).
>
> However, I think this also requires that we can specify some exceptions...
>
>> Like, what would happen to "Axel Hecht" in this case?
>
> This could be "Axelio Hechto dokumentai", for example. OR "Axel Hecht
> dokumentai".
>
>> Simply shows that "Axel's Documents" is an i18n bug. :-)
>

If we actually specified something Turing-complete, we could create
macros to process language mangling for names, but really, that's tough.

But yes, there is going to be a general section on "classes" in l20n,
and we should talk about people there, too.

Note, I would expect the attribute name to be the Lithuanian name for
the grammar case, so if I had a German name in a database, filled in
with German grammar cases, I expect the attributes logic to fall back to
the main string, so I guess even my sample code would return

Axel Hecht dokumentai

for that case. But I haven't checked, obviously.
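
A minimal Python sketch of that fallback, assuming entities are stored as a main string plus a dictionary of attributes. The attribute names ("kilmininkas" for the Lithuanian genitive, "Genitiv" for the German one), the record layout and both function names are hypothetical and chosen only for this example.

```python
def get_form(entity: dict, attr: str) -> str:
    """Return the requested attribute, or the main string if it is missing."""
    return entity.get("attrs", {}).get(attr, entity["value"])


# Hypothetical records: one name carries Lithuanian case attributes,
# the other only German ones.
lithuanian_owner = {
    "value": "Rimas Kudelis",
    "attrs": {"kilmininkas": "Rimo Kudelio"},
}
german_owner = {
    "value": "Axel Hecht",
    "attrs": {"Genitiv": "Axel Hechts"},
}


def my_documents(owner: dict) -> str:
    """Lithuanian 'X's documents', asking for the Lithuanian genitive."""
    return f"{get_form(owner, 'kilmininkas')} dokumentai"
```

Here my_documents(lithuanian_owner) yields "Rimo Kudelio dokumentai", while my_documents(german_owner) falls back to the main string and yields "Axel Hecht dokumentai".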

Axel

Re: l10n architecture proposal available for comments

by Rimas Kudelis-2

Axel Hecht wrote:
> If we actually specified something Turing-complete, we could create
>  macros to process language mangling for names, but really, that's
> tough.

I know, there are complex languages around...

> But yes, there is going to be a general section on "classes" in
> l20n, and we should talk about people there, too.
>
> Note, I would expect the attribute name to be the Lithuanian name
> for the grammar case, so if I had a German name in a database,
> filled in with German grammar cases, I expect the attributes logic
> to fall back to the main string, so I guess even my sample code
> would return
>
> Axel Hecht dokumentai
>
> for that case. But I haven't checked, obviously.

What if names of cases match for different languages? I would suggest
using an ISO prefix for the language/locale in the attribute names then.
Also, can attributes have non-ASCII letters in their names?..

Rimas


Re: l10n architecture proposal available for comments

by Axel Hecht

Rimas Kudelis wrote:

> Axel Hecht wrote:
>> If we actually specified something Turing-complete, we could create
>>  macros to process language mangling for names, but really, that's
>> tough.
>
> I know, there are complex languages around...
>
>> But yes, there is going to be a general section on "classes" in
>> l20n, and we should talk about people there, too.
>>
>> Note, I would expect the attribute name to be the Lithuanian name
>> for the grammar case, so if I had a German name in a database,
>> filled in with German grammar cases, I expect the attributes logic
>> to fall back to the main string, so I guess even my sample code
>> would return
>>
>> Axel Hecht dokumentai
>>
>> for that case. But I haven't checked, obviously.
>
> What if names of cases match for different languages? I would suggest
> using an ISO prefix for the language/locale in the attribute names then.
> Also, can attributes have non-ASCII letters in their names?..

I think we should resolve conflicts when we get there, and then find out
what that really means.

Regarding the grammar of IDs, currently I'm using \w+, but that's
probably wrong. It's likely something similar to
unicode_letter (unicode_letter | unicode_number | '_' )+

I'd rather not allow '-' or other math ops, and definitely not '.', as
that's for concatenating IDs to idrefs.
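
For illustration, the ID grammar sketched above can be approximated with Python's re module. This is an assumption-laden sketch: `ID_PATTERN` and `is_valid_id` are hypothetical names, and Python's \w is only an approximation of unicode_letter | unicode_number | '_'. It follows the grammar as written, so the trailing + requires at least two characters.

```python
import re

# First character: a word character that is neither a digit nor '_',
# which approximates unicode_letter. Remaining characters: \w, which for
# str patterns covers Unicode letters, digits and '_'.
ID_PATTERN = re.compile(r"[^\W\d_]\w+")


def is_valid_id(candidate: str) -> bool:
    """Check a candidate ID against the approximate grammar above."""
    return ID_PATTERN.fullmatch(candidate) is not None
```

Under this pattern, "owner" and "šauksmininkas" are accepted, while "owner.othercase", "my-id" and "2files" are rejected, which keeps '.' free for building idrefs.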

Axel