Re: what's the language of a document ?


RE: what's the language of a document ?

by CE Whitehead


Hi!  I am sorry; I reread Richard Ishida's post, finally went to the URL he posted, and read about the meta element and the HTTP headers in HTML5
(in http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#attr-meta-http-equiv). I do not have a disagreement with Richard Ishida's post so much as a question!
 
Currently, and prior to HTML5, the meta element, when it is identified as http-equiv, is equivalent to the HTTP header and is used the same way, but not otherwise.
It seems, however, that in HTML5 the meta element specification of content-language is being done away with in favor of the html lang= attribute. Is that right? The draft says:
"Conformance checkers will include a warning if this pragma is used. Authors are encouraged to use the lang attribute instead."
And further,
"This pragma is not exactly equivalent to the HTTP Content-Language header, for instance it only supports one language. [HTTP]"

 
If so, the second text is the one I am objecting to, and perhaps both; for example, I'd prefer that "Conformance Checkers" simply warn people that the meta element is best used to specify the audience language, not the text-processing language.
Sorry that I had not read the draft and that I did not make myself clear originally!
From: Martin J. Dürst <duerst@...>
Date: Fri, 30 Oct 2009 11:38:29 +0900
Message-ID: <4AEA51A5.3080801@...>
To: CE Whitehead <cewcathar@...>
CC: ishida@..., ian@..., simonp@..., divya.manian@..., martin.kliehm@..., cowan@..., public-html@..., www-international@...
On 2009/10/30 3:47, CE Whitehead wrote:
> I personally tend to agree with Roy Fielding, John Cowan, and Tex Texin actually, and not with Martin and Richard Ishida, because I regularly create documents in two languages (French-English; French-Old French). Following Richard Ishida's recommendations in "Specifying Languages in XHTML and HTML Content," I list all the languages in the meta content tag (when I have access to it; because my documents are generally served from a locale I don't control, I don't have access to the http headers).  I still set the html language to one or the other when possible and then, if I get time, specify additional information in relevant elements.
I'm sorry, but can you please explain where Richard and I differ from
Roy/John/Tex?
Sorry!
the issue for me was Ian Hickson's comment (http://lists.w3.org/Archives/Public/www-international/2009OctDec/0023.html) that:
"I've updated the spec to say that when the higher-level protocol reports
multiple languages, they are all ignored in favour of the default
(unknown)."
I did reread Richard Ishida's post, and actually he does not seem to say that we need to do away with all
differences between the http header, the meta content tag, and the html tag, so I must have misread him the first time through.
In fact, there's nothing he said that I disagree with, so sorry.
>It could be that we have very minor differences of how we
>have expressed ourselves, but I think we all agree that HTML5 has to be
>changed to treat the Content-Language: HTTP response header and the
>corresponding <meta> "pragma" the same way.
Agreed
>> I think there will always be cases where people will not tag a document correctly; if a tag is needed it makes no sense to eliminate it because someone cannot yet use it properly.
>I have to say that I slightly prefer ignoring multiple values in
>Content-Language: or the corresponding "pragma" to taking the first
>value for the default language, but that's a minor issue.
Fine, if it's a minor issue . . .  and it's all up to the applications in the end anyway how they will handle things!
> And I think that Tex makes a point too--someone might specify a document language as fr-FR and fr-LU but not fr-CA and it makes no sense to default to unknown.
> . . .
> As for the "fr-FR and fr-LU but not fr-CA" example, using "fr" as a
> default may seem obvious to some, but then that would include "fr-CA",
> which the author actually didn't include. So just using "fr" would
> actually be wrong.
Agreed, Canadian French is unique.
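
The "fr-FR and fr-LU but not fr-CA" point can be illustrated with a minimal sketch, assuming prefix matching in the style of RFC 4647 basic filtering. The helper names `matches` and `common_prefix` are hypothetical and come from neither the thread nor the spec; the sketch only shows that collapsing the declared tags to their shared prefix "fr" also lets in "fr-CA".

```python
def matches(language_range: str, tag: str) -> bool:
    """Prefix matching in the style of RFC 4647 basic filtering:
    a range matches a tag equal to it, or one that starts with it plus "-"."""
    r = language_range.lower()
    t = tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def common_prefix(tags: list[str]) -> str:
    """Longest run of leading subtags shared by every tag (hypothetical helper)."""
    split = [t.lower().split("-") for t in tags]
    shared = []
    for parts in zip(*split):
        if len(set(parts)) != 1:
            break
        shared.append(parts[0])
    return "-".join(shared)


declared = ["fr-FR", "fr-LU"]          # what the author listed
fallback = common_prefix(declared)     # collapses to the bare "fr" range

# The collapsed range is wider than the declaration: it also matches fr-CA,
# which the author did not include.
print(fallback, matches(fallback, "fr-CA"))
print(any(matches(d, "fr-CA") for d in declared))
```

Under these assumptions the original declarations do not match "fr-CA", while the collapsed "fr" does, which is the over-generalisation Martin describes.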
--C. E. Whitehead
cewcathar@...
> Regards,    Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   duerst@...
Received on Friday, 30 October 2009 02:39:

RE: what's the language of a document ?

by tex-4

Re: [3] Establish the precedence between http vs meta.

I wish we could eliminate this nonsense altogether.
The description of the content of a document should be self-contained within the document and not in the protocol.
The protocol should only ever reflect what is in the document to enable routing and filters etc.
But documents should be self-declared.

[1] Explain clearly that declarations in the http header and the meta element refer to the document as an object, rather than the text in a specific element (this is what makes the distinction between single and multiple values sensible).

This is contrived.
There is no reason an element cannot contain sub-elements that are in different languages, so why force a single language description.

There is value to letting a processing agent know which languages are included so it can use appropriate rendering rules and have the right resources loaded (eg fonts) as opposed to having it run into a new language and react dynamically.

It is fine to declare that one language is a primary to establish the overall treatment of the text, or to be the default for text without a language declaration, but there is no reason to pretend elements are monolingual when they are not.

[4] Establish the rule that multiple values in the place that has precedence equates to lang="".

Why would you remove information that has been provided?

-----Original Message-----
From: www-international-request@... [mailto:www-international-request@...] On Behalf Of Richard Ishida
Sent: Thursday, October 29, 2009 11:11 AM
To: 'Ian Hickson'
Cc: 'Simon Pieters'; 'Divya Manian'; 'Martin Kliehm'; 'John Cowan'; public-html@...; www-international@...; '"Martin J. Dürst"'
Subject: RE: what's the language of a document ?

 

Personally, I agree with Martin here.  I have spent a long time trying to
simplify explanations so that people can understand how to manage the
various different ways of declaring language in HTML (http vs meta vs lang;
html vs xhtml vs xml), and it really concerns me that I will now have to say
"But in html5 things are slightly different again".    It's already hard
enough to get people to declare language, and I think that the changes that
come with the current text in html5 will only make things worse by causing
further confusion. On the other hand, I think there may be a way to satisfy
everyone.

We discussed this during the Internationalization WG telecon last night, and
I was actioned to put the following to you and the HTML group on behalf of
the i18n WG.

Our proposal is as follows and is based on the text of the following
sections:
http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#document-wide-default-language
http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes

[1] Explain clearly that declarations in the http header and the meta
element refer to the document as an object, rather than the text in a
specific element (this is what makes the distinction between single and
multiple values sensible).

[2] Continue to recommend that the document-wide default language be defined
by a lang attribute on the html tag, but say that if the lang attribute is
missing and there is a language defined in the http or meta, then those
language declarations can be used to guess the language of the text, if they
contain a single value.

[3] Establish the precedence between http vs meta.

[4] Establish the rule that multiple values in the place that has precedence
equates to lang="".

This is very close to what we already have, but doesn't try to make the meta
declaration a different thing than the http declaration, or change it so
that multiple values are no longer valid.  At the same time, it allows
either the http or the meta to provide language information for
text-processing, if the declaration is useable.
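
Points [2] to [4] of the proposal can be sketched roughly as follows. This is only an illustration under stated assumptions: the proposal asks that a precedence between http and meta be established without saying which wins, so the order used here (meta first, then http) is an assumption, and the function names are hypothetical.

```python
from typing import Optional


def parse_content_language(value: Optional[str]) -> list[str]:
    """Split a Content-Language value such as "fr, en" into its tags."""
    if not value:
        return []
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def default_text_language(html_lang: Optional[str],
                          http_header: Optional[str],
                          meta_pragma: Optional[str]) -> str:
    """Return the language to use for text processing; "" means unknown."""
    # [2] A lang attribute on the html element, when present, is used as is.
    if html_lang is not None:
        return html_lang
    # [3] ASSUMPTION: meta is consulted before http. The proposal only says
    # that some precedence must be established, not which one.
    for source in (meta_pragma, http_header):
        tags = parse_content_language(source)
        if tags:
            # [2] A single value can be used to guess the text language.
            # [4] Multiple values in the place that has precedence
            #     equate to lang="" (unknown).
            return tags[0] if len(tags) == 1 else ""
    return ""


print(default_text_language(None, "fr", None))       # single http value
print(default_text_language(None, "fr", "fr, en"))   # multi-valued meta wins
print(default_text_language("en", "fr", "fr, en"))   # html lang wins
```

Note that in this sketch the multi-valued declaration still exists as metadata about the document; it is only ignored for the purpose of guessing the text-processing language.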

 

We also feel that the spec seems to restrict the use of the term
'document-wide default language' to refer only to a language declared using
the meta, and this is rather odd.  We feel that in fact the lang attribute
on the html element also establishes a document-wide default language. (See
the text: "Until the pragma is successfully processed, there is no
document-wide default language.")

RI

PS: I could suggest some changes to the wording, if that helps.

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

> -----Original Message-----
> From: www-international-request@... [mailto:www-international-request@...] On Behalf Of "Martin J. Dürst"
> Sent: 27 October 2009 11:09
> To: Ian Hickson
> Cc: Simon Pieters; Divya Manian; Martin Kliehm; John Cowan; <public-html@...>; www-international@...
> Subject: Re: what's the language of a document ?
>
> On 2009/10/27 19:37, Ian Hickson wrote:
> > On Tue, 27 Oct 2009, Simon Pieters wrote:
> >> This doesn't match what's specced for <meta http-equiv=content-language
> >> content=foo,bar>.
> >
> > That's intentional, and is based on data about how people actually use
> > that pragma.
>
> There's always a way to justify inconsistent choices (be it browser
> implementations, 'data' about how people (who?) use some feature (at
> what point in time?),...). But it would be way better to be consistent.
>
> And there is always a way to justify making choices that everybody
> except those knowing all the details of the spec don't understand. But
> it would be way better to make choices that are easy to understand (e.g.
> http-equiv actually meaning what it says, namely "equivalent to the
> corresponding HTTP header").
>
> There are lots of cases where over time, people have come to a better
> understanding of how things work. For stuff that authors/producers
> aren't supposed to produce, I don't mind too much that HTML5 is
> hopelessly complex and inconsistent. I can live without remembering it
> all, and can tell others to avoid it. However, for stuff like the above,
> which may be used even by very consciously clean developers, creating
> inconsistencies such as the above is a heavy negative legacy.
>
> Regards,   Martin.
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@...

 

 

 


RE: what's the language of a document ?

by CE Whitehead


Hi.

 

From: textexin@...
To: ishida@...; ian@...
CC: simonp@...; divya.manian@...; martin.kliehm@...; cowan@...; public-html@...; www-international@...; duerst@...
Date: Sat, 31 Oct 2009 18:05:54 -0700
Subject: RE: what's the language of a document ?

 

 

>> [4] Establish the rule that multiple values in the place that has precedence equates to lang="".

 

> Why would you remove information that has been provided?

 

O.k., I hope that multiple values equate to the first value or ideally to all (in the http and meta content-language headers only for the latter).  I think this is the point everyone has said is minor!

Best,
C. E. Whitehead
cewcathar@...

 


Re: what's the language of a document ?

by M.T. Carrasco Benitez

This subject is really periodic like the flu -:) and I hope HTML5 does not make it worse.

The subject should be "languages" in the plural, as a document could be multilingual; the processing language must be singular; and both could be undefined.

It has been repeatedly discussed:

http://lists.w3.org/Archives/Public/www-international/2008JulSe/0138.html
http://lists.w3.org/Archives/Public/www-international/2008JulSep/0142.html
http://www.w3.org/TR/1998/NOTE-html-lan-19980313

Regards
Tomas






RE: what's the language of a document ?

by Ian Hickson


I've tried to update the spec to what was discussed with I18N at TPAC, in
particular regarding the way Content-Language is processed.

I ended up not making lang="" required or trigger a warning when it's
omitted, because it's quite plausible that a document will not have a
language at all, and because in many cases in practice language-detection
heuristics are actually more reliable than the lang="" attribute anyway.
However, if this isn't satisfactory, I would recommend bringing it up on
the public-html list for further discussion.

In response to further comments:

On Thu, 29 Oct 2009, Richard Ishida wrote:

>
> Our proposal is as follows and is based on the text of the following
> sections:
> http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#document-wide-default-language
> http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes
>
> [1] Explain clearly that declarations in the http header and the meta
> element refer to the document as an object, rather than the text in a
> specific element (this is what makes the distinction between single and
> multiple values sensible).

Does the renaming of the term "document-wide default language" to
"pragma-set default language" address this sufficiently?


> [3] Establish the precedence between http vs meta.

I think this should now be clear.


> [4] Establish the rule that multiple values in the place that has
> precedence equates to lang="".

Done.


On Sat, 31 Oct 2009, Tex Texin wrote:
>
> Re: [3] Establish the precedence between http vs meta.  
>  
> I wish we could eliminate this nonsense altogether.
> The description of the content of a document should be self-contained within
> the document and not in the protocol.
> The protocol should only ever reflect what is in the document to enable
> routing and filters etc.
> But documents should be self-declared.

Content-Language is indeed unnecessary given lang="", but I would
recommend bringing this up with the HTTP group if the proposal is to
remove the header altogether.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: what's the language of a document ?

by Silvia Pfeiffer

On Sat, Feb 6, 2010 at 6:55 AM, Ian Hickson <ian@...> wrote:

>
> On Sat, 31 Oct 2009, Tex Texin wrote:
>>
>> Re: [3] Establish the precedence between http vs meta.
>>
>> I wish we could eliminate this nonsense altogether.
>> The description of the content of a document should be self-contained within
>> the document and not in the protocol.
>> The protocol should only ever reflect what is in the document to enable
>> routing and filters etc.
>> But documents should be self-declared.
>
> Content-Language is indeed unnecessary given lang="", but I would
> recommend bringing this up with the HTTP group if the proposal is to
> remove the header altogether.

This would work for several types of resources, e.g. html resources
and xml-based resources.

But there are many more mime types that get served over http which do
not declare their language inside the document and where an external
hint like this to the receiver will be helpful. I wouldn't act this
hastily with removing a HTTP header.

Regards,
Silvia.


RE: what's the language of a document ?

by CE Whitehead

Hi.
 
 

From: Ian Hickson <ian@...>
Date: Fri, 5 Feb 2010 19:55:45 +0000 (UTC)


>> [4] Establish the rule that multiple values in the place that has
>> precedence equates to lang="".
>Done
 

I assume that this rule is only for interpreting the language of lower level elements with no language declared--right or no?
(See: http://www.w3.org/International/wiki/Htmlissue88 for where I get this idea )
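
One way to read this question is sketched below, assuming the language of an element is taken from the nearest lang attribute on the element or its ancestors, with the pragma-set default used only when none is found. The `Element` class is a hypothetical stand-in for a DOM node, not a real parser API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Tiny stand-in for a DOM element (hypothetical, for illustration only)."""
    name: str
    lang: Optional[str] = None            # None means no lang attribute at all
    parent: Optional["Element"] = None


def language_of(element: Element, pragma_default: str = "") -> str:
    """Walk up the tree to the nearest lang attribute.

    The pragma-set default (possibly "" for unknown, e.g. after a
    multi-valued declaration) is consulted only when no element on the
    path to the root declares a language.
    """
    node: Optional[Element] = element
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return pragma_default


html = Element("html")
body = Element("body", parent=html)
quote = Element("blockquote", lang="fro", parent=body)   # Old French passage
para = Element("p", parent=quote)

print(language_of(para))   # inherits "fro" from the blockquote
print(language_of(body))   # nothing declared: falls back to the default
```

Under this reading, a multi-valued Content-Language that equates to lang="" affects only text with no language declared on itself or on any ancestor.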


 
 
> Content-Language is indeed unnecessary given lang="", but I would
> recommend bringing this up with the HTTP group if the proposal is to
> remove the header altogether.

 
Hmm; going back to Tex Texin's email from October 2009:

 
"From: Tex Texin <textexin@...>
Date: Sat, 31 Oct 2009 18:05:54 -0700


> Re: [3] Establish the precedence between http vs meta.

> I wish we could eliminate this nonsense altogether.
> The description of the content of a document should be self-contained within
> the document and not in the protocol.
> The protocol should only ever reflect what is in the document to enable
> routing and filters etc.
> But documents should be self-declared."

 
I agree that the protocol is for routing and filters but I am not sure what Tex is saying here;
isn't this header needed so that if I request www.google.ca or www.msn.com and my language preference is set to French, then my page will be served in French?  (Maybe there is something I don't understand and maybe it's not needed here.)
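
For reference, in HTTP the reader's language preference travels in the Accept-Language request header, while Content-Language is a response header describing the intended audience of what was served. A minimal sketch of that negotiation follows; the function names, the simplified q-value parsing, and the example header values are all assumptions for illustration, not any server's actual behaviour.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse e.g. "fr-CA, fr;q=0.8, en;q=0.5" into (range, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(";")
        lang_range = parts[0].strip()
        q = 1.0                                   # q defaults to 1 when absent
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0                       # unusable q: treat as refused
        prefs.append((lang_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_language(accept_language: str, available: list[str], default: str) -> str:
    """Pick the first available tag matched by the most preferred range."""
    for lang_range, q in parse_accept_language(accept_language):
        if q <= 0:
            continue
        r = lang_range.lower()
        for tag in available:
            t = tag.lower()
            if r == "*" or t == r or t.startswith(r + "-"):
                return tag
    return default


# Request side: the browser states a preference for French.
chosen = choose_language("fr-CA, fr;q=0.8, en;q=0.5", ["en", "fr"], "en")
# Response side: the server labels what it actually sent.
response_headers = {"Content-Language": chosen}
print(response_headers)
```

In this sketch, dropping Content-Language would not stop the server from choosing French; it would only remove the label on the response.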


 
In any case, if you remove these headers, how do you plan to handle documents with multiple target languages?
(For example, a page with Old French or Middle or other French texts with summaries or discussions of each in English;
in this case the target audience is someone who simultaneously reads Old or Middle French and modern English.
Other documents are in two languages on a single page and target speakers of both;
for example, the many pages with the translation into a second language placed side by side with the original
on the same page; and there may be some legal documents with texts in one language and discussions in another.)
 
Finally,
 
From: Silvia Pfeiffer <silviapfeiffer1@...>
Date: Sat, 6 Feb 2010 12:39:47 +1100

> Subject: Re: what's the language of a document ?
>
> On Sat, Feb 6, 2010 at 6:55 AM, Ian Hickson <ian@...> wrote:
> >
> > On Sat, 31 Oct 2009, Tex Texin wrote:
> >>
> >> Re: [3] Establish the precedence between http vs meta.
> >>
> >> I wish we could eliminate this nonsense altogether.
> >> The description of the content of a document should be self-contained within
> >> the document and not in the protocol.
> >> The protocol should only ever reflect what is in the document to enable
> >> routing and filters etc.
> >> But documents should be self-declared.
> >
> > Content-Language is indeed unnecessary given lang="", but I would
> > recommend bringing this up with the HTTP group if the proposal is to
> > remove the header altogether.
>
> This would work for several types of resources, e.g. html resources
> and xml-based resources.
>
> But there are many more mime types that get served over http which do
> not declare their language inside the document and where an external
> hint like this to the receiver will be helpful. I wouldn't act this
> hastily with removing a HTTP header.
>
> Regards,
> Silvia.
>
I agree with Silvia.
 
Best,
 
C. E. Whitehead
cewcathar@...

RE: what's the language of a document ?

by Andrew Cunningham


On Sat, February 6, 2010 06:55, Ian Hickson wrote:

>
> I've tried to update the spec to what was discussed with I18N at TPAC, in
> particular regarding the way Content-Language is processed.
>
> I ended up not making lang="" required or trigger a warning when it's
> omitted, because it's quite plausible that a document will not have a
> language at all, and because in many cases in practice language-detection
> heuristics are actually more reliable than the lang="" attribute anyway.
> However, if this isn't satisfactory, I would recommend bringing it up on
> the public-html list for further discussion.
>

Sounds overly optimistic. In practice language-detection only supports a
small number of languages with any reliability. And I seriously doubt that
web browser developers would want to include language detection,
considering the overhead that an extensible language detection system
would require, and the amount of ongoing work needed to implement new
languages in the detection support.

I think I could throw together 100 pages, each in a different language,
use language detection libraries on them, and get a 0% detection rate. ;)

Obviously I could also select a range of languages and get close to 100%
detection rate.

Also I'd suggest there are instances where lang is very useful. In
particular CJK data, where web browsers tend to select fonts based on
language declaration, in absence of appropriate styling.
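
A rough sketch of what language-driven font selection could look like, assuming a lookup that tries the full tag and then drops trailing subtags. The table and the font names in it are illustrative assumptions only and do not describe any particular browser's defaults.

```python
# ASSUMPTION: illustrative mapping only; real browsers use their own
# platform-specific tables and preferences.
CJK_FONTS = {
    "ja": "Noto Sans CJK JP",
    "ko": "Noto Sans CJK KR",
    "zh-hans": "Noto Sans CJK SC",
    "zh-hant": "Noto Sans CJK TC",
}
DEFAULT_FONT = "sans-serif"


def pick_font(lang: str) -> str:
    """Try the full tag, then drop trailing subtags until a mapping matches."""
    subtags = lang.lower().split("-") if lang else []
    while subtags:
        font = CJK_FONTS.get("-".join(subtags))
        if font:
            return font
        subtags.pop()
    # No declaration (or no match): the same Han characters get a generic
    # font, which may show glyph shapes from the wrong tradition.
    return DEFAULT_FONT


print(pick_font("ja-JP"))
print(pick_font("zh-Hant-TW"))
print(pick_font(""))
```

The point of the sketch is that shared Han code points give no hint about which glyph style is wanted, so without a declaration the choice falls through to a default.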

The CSS3 people are currently discussing CSS support for more advanced
OpenType support within CSS3 Fonts module. If this eventuates, then
language tagging could be used to trigger language rendering available in
an opentype font.

lang="" could be required or not required. but language detection is a
poor reason for deciding.

Andrew

--
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andrewc@...



Re: what's the language of a document ?

by Aryeh Gregor-2

On Sat, Feb 6, 2010 at 8:55 PM, Andrew Cunningham <andrewc@...> wrote:

> sounds overly optimistic. In practice language-detection only supports a
> small number of languages with any reliability. . . .
>
> Also I'd suggest there are instances where lang is very useful. In
> particular CJK data, where web browsers tend to select fonts based on
> language declaration, in absence of appropriate styling.
>
> The CSS3 people are currently discussing CSS support for more advanced
> OpenType support within CSS3 Fonts module. If this eventuates, then
> language tagging could be used to trigger language rendering available in
> an opentype font.
>
> lang="" could be required or not required. but language detection is a
> poor reason for deciding.

I think it's fair to say that right now, <html lang> is not so
uniformly useful that authors need to be warned if they omit it.  If
the *average* page (some random hit from Google, say) doesn't have any
use for it, then I don't think it should raise a warning -- warnings
should be useful to most authors, not only a small minority.
Otherwise you're making warnings as a whole less useful.


Re: what's the language of a document ?

by Andrew Cunningham

Hi Aryeh,

On Mon, February 8, 2010 05:01, Aryeh Gregor wrote:
> On Sat, Feb 6, 2010 at 8:55 PM, Andrew Cunningham <andrewc@...>
> wrote:

>> lang="" could be required or not required. but language detection is a
>> poor reason for deciding.
>
> I think it's fair to say that right now, <html lang> is not so
> uniformly useful that authors need to be warned if they omit it.  If
> the *average* page (some random hit from Google, say) doesn't have any
> use for it, then I don't think it should raise a warning -- warnings
> should be useful to most authors, not only a small minority.
> Otherwise you're making warnings as a whole less useful.
>

I did say "could be required or not required". I have no preference either
way ;) The point of the post was

1) in certain situations language tagging can be critical, and if css3
fonts module goes the right way, the number of languages it is necessary
for will increase.

2) the language detection argument is an extremely poor reason for
deciding one way or other.

I suppose a few things are happening here. You propose that most authors
don't use it and don't need it in most cases; as a consequence, warnings
aren't needed.

I suppose that is a philosophical position.

I believe that if not generally, then at least in specific circumstances,
it is necessary. I don't find your argument convincing; if you used that
argument on language tagging, you could use a similar methodology on all
other HTML tags. Although, after some thought, I would probably agree that
a warning isn't necessary.

But I think I'm coming to the understanding that the HTML specs, the CSS
specs and validators rarely tell you all you need to know to create a web
page properly.

Let's see if I can explain that in a practical sense.  I'm doing more and
more work with Burmese and S'gaw Karen at the moment. And probably the crux
of the web development job is knowing

1) which HTML elements never to use
2) which HTML elements need a complete overwrite of the default
presentation
3) limitations in web browsers' font rendering and OpenType support

--
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andrewc@...



Re: what's the language of a document ?

by Henri Sivonen

On Feb 7, 2010, at 20:01, Aryeh Gregor wrote:

> I think it's fair to say that right now, <html lang> is not so
> uniformly useful that authors need to be warned if they omit it.  If
> the *average* page (some random hit from Google, say) doesn't have any
> use for it, then I don't think it should raise a warning -- warnings
> should be useful to most authors, not only a small minority.
> Otherwise you're making warnings as a whole less useful.

Moreover, making validators emit a message (of any kind) about the absence of a language declaration is likely to lead to authoring tools putting in a placeholder in order to silence validators. As a result, at least "en" and "en-US" can often be taken to mean "placeholder".

--
Henri Sivonen
hsivonen@...
http://hsivonen.iki.fi/




Re: what's the language of a document ?

by John Cowan

Henri Sivonen scripsit:

> Moreover, making validators emit a message (of any kind) about the
> absence of a language declaration is likely to lead to authoring tools
> putting in a placeholder in order to silence validators. As a result,
> at least "en" and "en-US" can often be taken to mean "placeholder".

Quite so.  Google, at least, explicitly disregards "en"-based language
tags when determining the language of a web page for search purposes.
Other language tags are accepted as evidence, but are not treated as
determinative.
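
The idea of treating a declaration as evidence rather than as the answer can be sketched as below. This is not Google's algorithm, which the thread does not describe; the function, the confidence value, and the 0.9 threshold are all hypothetical.

```python
from typing import Optional


def primary_subtag(tag: str) -> str:
    """Return the first subtag, e.g. "en" for "en-US"."""
    return tag.lower().split("-")[0]


def guess_language(declared: Optional[str],
                   detected: Optional[str],
                   detector_confidence: float,
                   threshold: float = 0.9) -> str:
    """Combine a declared tag with a detector's guess; "" means unknown.

    ASSUMPTION: the weighting here is invented for illustration.
    """
    # "en"-based declarations may be tool-inserted placeholders: ignore them.
    if declared and primary_subtag(declared) == "en":
        declared = None
    # A confident detector overrides the declaration, so the declaration
    # is evidence but not determinative.
    if detected and detector_confidence >= threshold:
        return detected
    # Otherwise the declaration, if any, decides; else the weak guess.
    return declared or detected or ""


print(guess_language("en-US", "fi", 0.5))   # placeholder ignored
print(guess_language("my", "en", 0.4))      # weak detector, declaration kept
print(guess_language("my", "ksw", 0.95))    # confident detector wins
```

For languages a detector handles poorly, such as those Andrew mentions, the declared tag is what carries the decision in this sketch.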

--
John Cowan    http://ccil.org/~cowan    cowan@...
Economists were put on this planet to make astrologers look good.
        --Leo McGarry


RE: what's the language of a document ?

by Richard Ishida

Hi Ian,

Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
been submitted and was moved forward a notch at the HTML telecon last week).
I'm just asking so that I know whether you took those suggestions into
account when I read what you ended up writing - I noticed a couple of
proposed changes there that you didn't do.

I was about to write that I'm as happy with 'pragma-set default language' as
with 'Content-Language pragma language', but on reflection, your proposal
does still sound like the pragma may have set the default, whereas it
wouldn't do so if there were an attribute on the html tag.  (I know that CL
pragma language is rather ugly though.)

Cheers,
RI


============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/




> -----Original Message-----
> From: public-html-request@... [mailto:public-html-request@...] On
> Behalf Of Ian Hickson
> Sent: 05 February 2010 19:56
> To: www-international@...
> Cc: <public-html@...>
> Subject: RE: what's the language of a document ?
>
>
> I've tried to update the spec to what was discussed with I18N at TPAC, in
> particular regarding the way Content-Language is processed.
>
> I ended up not making lang="" required or trigger a warning when it's
> omitted, because it's quite plausible that a document will not have a
> language at all, and because in many cases in practice language-detection
> heuristics are actually more reliable than the lang="" attribute anyway.
> However, if this isn't satisfactory, I would recommend bringing it up on
> the public-html list for further discussion.
>
> In response to further comments:
>
> On Thu, 29 Oct 2009, Richard Ishida wrote:
> >
> > Our proposal is as follows and is based on the text of the following
> > sections:
> > http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#document-wide-default-language
> > http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes
> >
> > [1] Explain clearly that declarations in the http header and the meta
> > element refer to the document as an object, rather than the text in a
> > specific element (this is what makes the distinction between single and
> > multiple values sensible).
>
> Does the renaming of the term "document-wide default language" to
> "pragma-set default language" address this sufficiently?
>
>
> > [3] Establish the precedence between http vs meta.
>
> I think this should now be clear.
>
>
> > [4] Establish the rule that multiple values in the place that has
> > precedence equates to lang="".
>
> Done.
>
>
> On Sat, 31 Oct 2009, Tex Texin wrote:
> >
> > Re: [3] Establish the precedence between http vs meta.
> >
> > I wish we could eliminate this nonsense altogether.
> > The description of the content of a document should be self-contained within
> > the document and not in the protocol.
> > The protocol should only ever reflect what is in the document to enable
> > routing and filters etc.
> > But documents should be self-declared.
>
> Content-Language is indeed unnecessary given lang="", but I would
> recommend bringing this up with the HTTP group if the proposal is to
> remove the header altogether.
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'




RE: what's the language of a document ?

by Ian Hickson

On Mon, 8 Feb 2010, Richard Ishida wrote:
>
> Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
> been submitted and was moved forward a notch at the HTML telecon last week).

I did not. (FWIW, I was told by the HTMLWG chairs that the telecons were
just status update meetings, so as I understand it nothing can actually
move at all during those meetings.)


> I'm just asking so that I know whether you took those suggestions into
> account when I read what you ended up writing - I noticed a couple of
> proposed changes there that you didn't do.

That's possible; I was going from my notes at the F2F. Please feel free to
file bugs for any remaining issues if you haven't already, and I'll make
sure to go through them carefully.


> I was about to write that I'm as happy with 'pragma-set default
> language' as with 'Content-Language pragma language', but on reflection,
> your proposal does still sound like the pragma may have set the default,
> whereas it wouldn't do so if there were an attribute on the html tag.  
> (I know that CL pragma language is rather ugly though.)

I don't understand... doesn't the pragma in fact set the default?

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


ISSUE-88 / Re: what's the language of a document ?

by Maciej Stachowiak


Richard & Ian,

On Feb 8, 2010, at 12:18 PM, Richard Ishida wrote:

> Hi Ian,
>
> Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
> been submitted and was moved forward a notch at the HTML telecon last week).
> I'm just asking so that I know whether you took those suggestions into
> account when I read what you ended up writing - I noticed a couple of
> proposed changes there that you didn't do.
>
> I was about to write that I'm as happy with 'pragma-set default language' as
> with 'Content-Language pragma language', but on reflection, your proposal
> does still sound like the pragma may have set the default, whereas it
> wouldn't do so if there were an attribute on the html tag.  (I know that CL
> pragma language is rather ugly though.)

Since Ian seems willing to put in the requested changes, could the two of you please check which items from the Change Proposal are not covered, and which are essential?

Thanks
Maciej



RE: ISSUE-88 / Re: what's the language of a document ?

by Richard Ishida

Ian,

Are you ok to apply the points in
http://www.w3.org/International/wiki/Htmlissue88 to the spec?
       
Cheers,
RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/




> -----Original Message-----
> From: Maciej Stachowiak [mailto:mjs@...]
> Sent: 09 February 2010 01:35
> To: Richard Ishida
> Cc: 'Ian Hickson'; www-international@...; public-html@...
> Subject: ISSUE-88 / Re: what's the language of a document ?
>
>
> Richard & Ian,
>
> On Feb 8, 2010, at 12:18 PM, Richard Ishida wrote:
>
> > Hi Ian,
> >
> > Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
> > been submitted and was moved forward a notch at the HTML telecon last week).
> > I'm just asking so that I know whether you took those suggestions into
> > account when I read what you ended up writing - I noticed a couple of
> > proposed changes there that you didn't do.
> >
> > I was about to write that I'm as happy with 'pragma-set default language' as
> > with 'Content-Language pragma language', but on reflection, your proposal
> > does still sound like the pragma may have set the default, whereas it
> > wouldn't do so if there were an attribute on the html tag.  (I know that CL
> > pragma language is rather ugly though.)
>
> Since Ian seems willing to put in the requested changes, could the two of you
> please check which items from the Change Proposal are not covered, and
> which are essential?
>
> Thanks
> Maciej




RE: what's the language of a document ?

by Richard Ishida

> > I was about to write that I'm as happy with 'pragma-set default
> > language' as with 'Content-Language pragma language', but on reflection,
> > your proposal does still sound like the pragma may have set the default,
> > whereas it wouldn't do so if there were an attribute on the html tag.
> > (I know that CL pragma language is rather ugly though.)
>
> I don't understand... doesn't the pragma in fact set the default?

If by 'default language' we mean 'the language of the page as a whole', I
guess it can do, unless html lang="..." is being used (in which case the
lang attribute sets the default). It may be that I'm arguing too fine a
point here.
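
To make the precedence concrete, here is a minimal Python sketch of how "the language of the page as a whole" might be resolved. The function and parameter names are hypothetical, and the position of the HTTP header as the last fallback is an assumption that this message does not state.

```python
from typing import Optional


def default_language(html_lang: Optional[str],
                     pragma_lang: Optional[str],
                     http_header_lang: Optional[str] = None) -> str:
    """Return the language of the page as a whole; "" means unknown."""
    # A lang attribute on the html element sets the default whenever it is
    # present, even if its value is the empty string (explicitly unknown).
    if html_lang is not None:
        return html_lang
    # Without that attribute, the Content-Language pragma sets the default.
    if pragma_lang is not None:
        return pragma_lang
    # Assumption: the HTTP Content-Language header is only a final fallback.
    if http_header_lang is not None:
        return http_header_lang
    return ""
```

With these rules, `default_language("fr", "en")` yields the attribute value, and the pragma only matters when the html element carries no lang attribute.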

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/




> -----Original Message-----
> From: Ian Hickson [mailto:ian@...]
> Sent: 08 February 2010 23:12
> To: Richard Ishida
> Cc: www-international@...; public-html@...
> Subject: RE: what's the language of a document ?
>
> On Mon, 8 Feb 2010, Richard Ishida wrote:
> >
> > Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
> > been submitted and was moved forward a notch at the HTML telecon last week).
>
> I did not. (FWIW, I was told by the HTMLWG chairs that the telecons were
> just status update meetings, so as I understand it nothing can actually
> move at all during those meetings.)
>
>
> > I'm just asking so that I know whether you took those suggestions into
> > account when I read what you ended up writing - I noticed a couple of
> > proposed changes there that you didn't do.
>
> That's possible; I was going from my notes at the F2F. Please feel free to
> file bugs for any remaining issues if you haven't already, and I'll make
> sure to go through them carefully.
>
>
> > I was about to write that I'm as happy with 'pragma-set default
> > language' as with 'Content-Language pragma language', but on reflection,
> > your proposal does still sound like the pragma may have set the default,
> > whereas it wouldn't do so if there were an attribute on the html tag.
> > (I know that CL pragma language is rather ugly though.)
>
> I don't understand... doesn't the pragma in fact set the default?
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



RE: what's the language of a document ?

by tex-4

I don't think it is a fine point, Richard; keeping clear on the precedence is
important.
thanks
tex

-----Original Message-----
From: www-international-request@...
[mailto:www-international-request@...] On Behalf Of Richard Ishida
Sent: Tuesday, February 09, 2010 2:59 AM
To: 'Ian Hickson'
Cc: www-international@...; public-html@...
Subject: RE: what's the language of a document ?

> > I was about to write that I'm as happy with 'pragma-set default
> > language' as with 'Content-Language pragma language', but on reflection,
> > your proposal does still sound like the pragma may have set the default,
> > whereas it wouldn't do so if there were an attribute on the html tag.
> > (I know that CL pragma language is rather ugly though.)
>
> I don't understand... doesn't the pragma in fact set the default?

If by 'default language' we mean 'the language of the page as a whole', I
guess it can do, unless html lang="..." is being used (in which case the
lang attribute sets the default). It may be that I'm arguing too fine a
point here.

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/




> -----Original Message-----
> From: Ian Hickson [mailto:ian@...]
> Sent: 08 February 2010 23:12
> To: Richard Ishida
> Cc: www-international@...; public-html@...
> Subject: RE: what's the language of a document ?
>
> On Mon, 8 Feb 2010, Richard Ishida wrote:
> >
> > Did you see http://www.w3.org/International/wiki/Htmlissue88 ? (Which has
> > been submitted and was moved forward a notch at the HTML telecon last week).
>
> I did not. (FWIW, I was told by the HTMLWG chairs that the telecons were
> just status update meetings, so as I understand it nothing can actually
> move at all during those meetings.)
>
>
> > I'm just asking so that I know whether you took those suggestions into
> > account when I read what you ended up writing - I noticed a couple of
> > proposed changes there that you didn't do.
>
> That's possible; I was going from my notes at the F2F. Please feel free to
> file bugs for any remaining issues if you haven't already, and I'll make
> sure to go through them carefully.
>
>
> > I was about to write that I'm as happy with 'pragma-set default
> > language' as with 'Content-Language pragma language', but on reflection,
> > your proposal does still sound like the pragma may have set the default,
> > whereas it wouldn't do so if there were an attribute on the html tag.
> > (I know that CL pragma language is rather ugly though.)
>
> I don't understand... doesn't the pragma in fact set the default?
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'




RE: ISSUE-88 / Re: what's the language of a document ?

by Ian Hickson

On Tue, 9 Feb 2010, Richard Ishida wrote:
>
> Are you ok to apply the points in
> http://www.w3.org/International/wiki/Htmlissue88 to the spec?

From that document:

| [1] Replace the term 'document-wide default language' with the term
| 'Content-Language pragma language'.

The spec currently uses the term "pragma-set default language".


| [2] [...] clarify why the HTTP and pragma declarations are different
| when it comes to values, and how they should be used

The confusion is intended to be clarified by simply discouraging authors
from using the pragma at all.

The proposed text:

| Note: Declarations in the HTTP header and the Content Language pragma
| are metadata, referring to the document as a whole and expressing the
| expected language or languages of the audience of the document.
| A language attribute on an element describes the actual language used in
| the range of content bounded by that element (and so values are limited
| to a single language at a time).

...seems to just muddy the waters further. Per HTTP, the Content-Language
HTTP header is supposed to say what languages the document is intended
for, and doesn't say anything about the contents of the document. The
pragma, on the other hand, just sets the default language of the page. The
pragma really has more in common with the attribute than the header, in
terms of actual practical effect.
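
As an illustration of the two in-document declarations being compared here, the following Python sketch pulls the lang attribute of the html element and the Content-Language pragma out of a page using the standard library's `html.parser`. The class name, function name, and sample markup are hypothetical, and the sketch only reads the values; it does not decide which one wins.

```python
from html.parser import HTMLParser


class LanguageDeclarations(HTMLParser):
    """Collect <html lang="..."> and the first Content-Language pragma."""

    def __init__(self):
        super().__init__()
        self.html_lang = None  # value of the lang attribute on <html>, if any
        self.pragma = None     # content of <meta http-equiv="Content-Language">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "html" and "lang" in attrs:
            self.html_lang = attrs["lang"]
        elif tag == "meta":
            http_equiv = (attrs.get("http-equiv") or "").lower()
            if http_equiv == "content-language" and self.pragma is None:
                self.pragma = attrs.get("content") or ""


def find_declarations(markup):
    """Return (html_lang, pragma) as found in the markup; None if absent."""
    parser = LanguageDeclarations()
    parser.feed(markup)
    parser.close()
    return parser.html_lang, parser.pragma


SAMPLE = """<!DOCTYPE html>
<html lang="fr">
<head><meta http-equiv="Content-Language" content="fr, en"></head>
<body><p>Bonjour</p></body>
</html>"""

html_lang, pragma = find_declarations(SAMPLE)
```

The HTTP header is not visible to this code at all, which mirrors the point that the header travels in the protocol while the pragma and the attribute live in the document.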

I'm certainly open to adding more disambiguating text, but I think it
would be helpful to have some pointers to e-mails showing the confusion so
that a more directed disambiguation could be crafted.


| [3] [allow the pragma to have more than one value, because] There is
| consensus that the current syntax should not be changed, and that it
| should be possible to continue to specify multiple languages in the
| pragma.

I disagree that there's consensus here. I don't understand the value of
allowing authors to specify values that are going to be ignored by
processors.


| [4] Remove 'primary' from:
|
| "The lang attribute (in no namespace) specifies the primary language for
| the element's contents and for any of the element's attributes that
| contain text. Its value must be a valid BCP 47 language code, or the
| empty string. [BCP47]"
|
| Rationale:
|
| Only one language can be declared at a time.

Only one language can be _declared_ at a time, but that doesn't mean only
one language is actually contained in the element.


| [5] [...] If the pragma attribute contains a comma-separated list of
| languages, it cannot be determined with any degree of certainty which of
| the languages matches the content of the text.

This was handled by changing the UA requirements of the pragma.
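
A minimal Python sketch of one way a user agent could treat a multi-valued pragma follows. It assumes the rule mentioned earlier in this thread, that multiple values equate to lang="" (unknown language); the function name is hypothetical and the exact spec wording may differ.

```python
def pragma_set_default_language(content: str) -> str:
    """Map a Content-Language pragma value to one language tag, or ""."""
    value = content.strip()
    # A comma means several languages were listed. Since it cannot be
    # determined which one matches the text, treat the language as unknown.
    if "," in value:
        return ""
    return value
```

Under this sketch, a pragma value of `fr, en` gives no default language, while a single value such as `fr` is used as-is.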


I recommend going through the normal process for these, by the way (using
bugs and so forth) rather than jumping straight to the Change Proposal
stage. It will help ensure that we keep issues focused.

Cheers,
--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: ISSUE-88 / Re: what's the language of a document ?

by Maciej Stachowiak


Richard, thoughts on this response? Do you think further changes are  
needed on any of these points?

(To my casual reading, it seems like point 3 was the most clearly
rejected and is the most directly related to
<http://www.w3.org/Bugs/Public/show_bug.cgi?id=8088>, the bug that was
originally filed, rejected and escalated, resulting in this tracker issue.)

Regards,
Maciej

On Feb 21, 2010, at 6:11 PM, Ian Hickson wrote:

> On Tue, 9 Feb 2010, Richard Ishida wrote:
>>
>> Are you ok to apply the points in
>> http://www.w3.org/International/wiki/Htmlissue88 to the spec?
>
> From that document:
>
> | [1] Replace the term 'document-wide default language' with the term
> | 'Content-Language pragma language'.
>
> The spec currently uses the term "pragma-set default language".
>
>
> | [2] [...] clarify why the HTTP and pragma declarations are different
> | when it comes to values, and how they should be used
>
> The confusion is intended to be clarified by simply discouraging authors
> from using the pragma at all.
>
> The proposed text:
>
> | Note: Declarations in the HTTP header and the Content Language pragma
> | are metadata, referring to the document as a whole and expressing the
> | expected language or languages of the audience of the document.
> | A language attribute on an element describes the actual language used in
> | the range of content bounded by that element (and so values are limited
> | to a single language at a time).
>
> ...seems to just muddy the waters further. Per HTTP, the Content-Language
> HTTP header is supposed to say what languages the document is intended
> for, and doesn't say anything about the contents of the document. The
> pragma, on the other hand, just sets the default language of the page. The
> pragma really has more in common with the attribute than the header, in
> terms of actual practical effect.
>
> I'm certainly open to adding more disambiguating text, but I think it
> would be helpful to have some pointers to e-mails showing the confusion so
> that a more directed disambiguation could be crafted.
>
>
> | [3] [allow the pragma to have more than one value, because] There is
> | consensus that the current syntax should not be changed, and that it
> | should be possible to continue to specify multiple languages in the
> | pragma.
>
> I disagree that there's consensus here. I don't understand the value of
> allowing authors to specify values that are going to be ignored by
> processors.
>
>
> | [4] Remove 'primary' from:
> |
> | "The lang attribute (in no namespace) specifies the primary language for
> | the element's contents and for any of the element's attributes that
> | contain text. Its value must be a valid BCP 47 language code, or the
> | empty string. [BCP47]"
> |
> | Rationale:
> |
> | Only one language can be declared at a time.
>
> Only one language can be _declared_ at a time, but that doesn't mean only
> one language is actually contained in the element.
>
>
> | [5] [...] If the pragma attribute contains a comma-separated list of
> | languages, it cannot be determined with any degree of certainty which of
> | the languages matches the content of the text.
>
> This was handled by changing the UA requirements of the pragma.
>
>
> I recommend going through the normal process for these, by the way (using
> bugs and so forth) rather than jumping straight to the Change Proposal
> stage. It will help ensure that we keep issues focused.
>
> Cheers,
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>

