CR and LF in chunk extension values

Hi,

A chunk extension value is defined as either token or quoted-string. Under RFC 2616, a quoted-string allows CRs and LFs both for folding and in escaped form. We have since outlawed the escaped form, and in headers (but not in chunk extension values) we now also outlaw producing them for folding. Accepting and processing folding correctly still appears to be a SHOULD-level requirement; I am not sure about the escaped form.

It appears that implementations usually just read a line and ignore anything after the first ";" character at the beginning of a chunk. Perhaps the specification should use a CRLF-free quoted-string instead for this; if not, the considerations for obs-fold should apply to chunk extension values as well, or obs-fold should not be used for chunk extension values (which would require a separate quoted-string production as well).

regards,
--
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
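
To illustrate the "read a line and ignore anything after the first semicolon" behaviour described above, here is a minimal Python sketch. The function name is hypothetical and the code is not taken from any real implementation; it only assumes the caller has already split the input on CRLF.

```python
def parse_chunk_size_line(line: bytes) -> int:
    """Return the chunk size from one chunk header line (CRLF already removed).

    Everything from the first ";" onward (the chunk extensions) is
    discarded without being parsed, as many implementations appear to do.
    """
    size_field, _, _extensions = line.partition(b";")
    size_field = size_field.strip(b" \t")
    if not size_field or any(c not in b"0123456789abcdefABCDEF" for c in size_field):
        raise ValueError(f"invalid chunk size: {size_field!r}")
    return int(size_field.decode("ascii"), 16)


# A reader like this never looks inside the quoted value, so a CRLF inside
# a quoted-string would already have ended the "line" it was handed.
print(parse_chunk_size_line(b'1a;name="value"'))
```

A line-oriented reader of this kind is why a CR or LF inside a chunk extension value is a problem: the header line has been cut at the first CRLF before any quoted-string parsing could take place.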
|
|
Re: CR and LF in chunk extension values

Now #173:
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/173

We probably need to have a more general discussion of chunk-extensions as well...

On 18/06/2009, at 4:07 AM, Bjoern Hoehrmann wrote:
> A chunk extension value is defined as either token or quoted-string.
> [...]

--
Mark Nottingham http://www.mnot.net/
|
|
Re: CR and LF in chunk extension values

Bjoern Hoehrmann wrote:
> Hi,
>
> A chunk extension value is defined as either token or quoted-string. A
> quoted-string allows CRs and LFs for folding and in escaped form under
> RFC 2616; we have since outlawed the escaped form, and in headers, but
> not chunk extension values, we now outlaw producing them for folding as
> well. Accepting and processing the latter correctly still appears to be
> a SHOULD level requirement; I am not sure about the former.

Hmm. I had no idea line folding was allowed inside a quoted-string, and I expect I'm not the only one. That's quite a surprise.

-- Jamie
|
|
#173: CR and LF in chunk extension values

This was discussed in Stockholm, and there was agreement in the room that the proper way to address this is to disallow CR and LF in *any* quoted-string.

Comments?

On 25/06/2009, at 3:53 PM, Mark Nottingham wrote:
> Now #173:
> http://trac.tools.ietf.org/wg/httpbis/trac/ticket/173
>
> We probably need to have a more general discussion of
> chunk-extensions as well...
> [...]

--
Mark Nottingham http://www.mnot.net/
|
|
Re: #173: CR and LF in chunk extension values

On Tue 2009-08-11 at 05:31 +1000, Mark Nottingham wrote:
> This was discussed in Stockholm, and there was agreement in the room
> that the proper way to address this is to disallow CR and LF in *any*
> quoted-string.
>
> Comments?

Escaped newlines or \0 characters in the form of quoted-pair are very likely to cause many parsers to fail no matter where they are seen. I have always understood this as a mechanism intended for quoting special characters like " ( and ), and not including CTLs.

Regarding chunked encoding, allowing any newlines there is a very, very bad idea. Folding is not supported there, and no one expects to see newlines in the middle of a chunk header, quoted or not.

I would propose changing quoted-pair to restrict the allowable set to non-CTLs, not only excluding CR and LF, to match most expectations of what values may be seen:

    quoted-pair = "\" <any CHAR except CTLs>

instead of

    quoted-pair = "\" CHAR

Regards
Henrik
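
The difference between the two quoted-pair rules in the message above can be sketched with two regular expressions. This is only an illustration: the pattern names are made up, and it assumes the RFC 2616 meanings of CHAR (any US-ASCII octet, %x00-7F) and CTL (%x00-1F and %x7F).

```python
import re

# quoted-pair = "\" CHAR  (RFC 2616: CHAR is any US-ASCII octet, 0-127)
QUOTED_PAIR_2616 = re.compile(rb"\\[\x00-\x7f]")

# quoted-pair = "\" <any CHAR except CTLs>  (the proposal above)
QUOTED_PAIR_NO_CTL = re.compile(rb"\\[\x20-\x7e]")

for candidate in (b'\\"', b"\\(", b"\\\r", b"\\\x00"):
    old = QUOTED_PAIR_2616.fullmatch(candidate) is not None
    new = QUOTED_PAIR_NO_CTL.fullmatch(candidate) is not None
    print(candidate, "2616:", old, "proposed:", new)
```

Under the proposed rule an escaped CR or NUL simply stops being a valid quoted-pair, while the escapes people actually use (quote, parentheses, backslash) are unaffected.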
|
|
Re: #173: CR and LF in chunk extension values

Right now, it's defined as:

> A string of text is parsed as a single word if it is quoted using
> double-quote marks.
>
>   quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
>   qdtext        = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
>                 ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
>   obs-text      = %x80-FF
>
> The backslash character ("\") MAY be used as a single-character
> quoting mechanism only within quoted-string and comment constructs.
>
>   quoted-text   = %x01-09 /
>                   %x0B-0C /
>                   %x0E-FF ; Characters excluding NUL, CR and LF
>   quoted-pair   = "\" quoted-text

So it seems like we need to:

1) Consider removing OWS from qdtext, replacing it with space and tab only. While we could use BWS here, receivers are required to accept it, which I don't think is the desired effect. And,

2) Consider removing obs-text from qdtext, as it's a hole that a truck can drive through. Otherwise, modify it to explicitly disallow CTLs. And,

3) Restrict the allowable set of characters in quoted-text to disallow CTLs. VCHAR?

On 11/08/2009, at 8:50 AM, Henrik Nordstrom wrote:
> I would propose changing quoted-pair to restrict the allowable set to
> non-CTLs to match most expectations on what values may be seen, not
> only excluding CR or LF.
> [...]

--
Mark Nottingham http://www.mnot.net/
|
|
Re: #173: CR and LF in chunk extension values

That leaves us at:

1) Replace OWS in qdtext with space and tab, and
2) Remove obs-text from qdtext, and
3) Restrict quoted-text to VCHAR.

Milestone assigned for -08; barring any other discussion, we'll see what the editors come up with in that revision.

On 12/08/2009, at 4:43 PM, Mark Nottingham wrote:
> Right now, it's defined as:
> [...]

--
Mark Nottingham http://www.mnot.net/
|
|
Re: #173: CR and LF in chunk extension values

Mark Nottingham wrote:
> That leaves us at:
>
> 1) Replace OWS in qdtext with space and tab, and
> 2) Remove obs-text from qdtext, and
> 3) Restrict quoted-text to VCHAR.
>
> Milestone assigned for -08; barring any other discussion, we'll see what
> the editors come up with in that revision.
> ...

1)

    qdtext   = WSP / %x21 / %x23-5B / %x5D-7E / obs-text
             ; WSP / <VCHAR except DQUOTE and "\"> / obs-text
    obs-text = %x80-FF

2)

What's the problem with obs-text? It doesn't contain controls...

3)

It seems to me that the purpose of quoted-text is to allow any character in qdtext, plus DQUOTE and "\", which would make it

    quoted-text = qdtext / DQUOTE / "\"

While we're at it, we probably should rename it to quoted-char, and also add a short statement of what the semantics of a quoted-pair are.

BR, Julian
|
|
Re: #173: CR and LF in chunk extension values

On 24/08/2009, at 11:18 PM, Julian Reschke wrote:
> 1)
>
>   qdtext   = WSP / %x21 / %x23-5B / %x5D-7E / obs-text
>            ; WSP / <VCHAR except DQUOTE and "\"> / obs-text
>   obs-text = %x80-FF

Looks good.

> 2)
>
> What's the problem with obs-text? It doesn't contain controls...

Mea culpa; misread that. Never mind #2.

> 3)
>
> It seems to me that the purpose of quoted-text is to allow any
> character in qdtext, plus DQUOTE and "\", which would make it
>
>   quoted-text = qdtext / DQUOTE / "\"
>
> While we're at it, we probably should rename it to quoted-char, and
> also add a short statement what the semantics of a quoted-pair is.

I had to read that a few times, but I think I agree. However, "quoted-char" may be confusing, as it's very similar to "quoted-pair".

--
Mark Nottingham http://www.mnot.net/
|
|
Re: #173: CR and LF in chunk extension values

Mark Nottingham wrote:
> ...
>> 3)
>>
>> It seems to me that the purpose of quoted-text is to allow any
>> character in qdtext, plus DQUOTE and "\", which would make it
>>
>>   quoted-text = qdtext / DQUOTE / "\"
>>
>> While we're at it, we probably should rename it to quoted-char, and
>> also add a short statement what the semantics of a quoted-pair is.
>
> I had to read that a few times, but I think I agree. However,
> "quoted-char" may be confusing, as it's very similar to "quoted-pair".

And yes,

    qdtext / DQUOTE / "\"

is the same as

    WSP / VCHAR / obs-text

...but I think the former is clearer in that it adds DQUOTE and "\".

But: quoted-pair is also used in comments. Are we ok with restricting the set here as well? And, if yes, shouldn't we then also adjust the allowed set for non-quoted characters in comments?

Currently it reads (<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html#rfc.section.3.2>):

    comment = "(" *( ctext / quoted-pair / comment ) ")"
    ctext   = OWS / %x21-27 / %x2A-5B / %x5D-7E / obs-text
            ; OWS / <VCHAR except "(", ")", and "\"> / obs-text

To make it consistent with quoted-string it would need to change to:

    ctext   = BWS / %x21-27 / %x2A-5B / %x5D-7E / obs-text
            ; BWS / <VCHAR except "(", ")", and "\"> / obs-text

Feedback appreciated, Julian
|
|
Re: #173: CR and LF in chunk extension values

Should probably change topic here, but it's still relevant so keeping the issue topic. Most of this is taking a more generic view of quoted-pair, not isolated to chunk extension values.

On Tue 2009-08-25 at 09:11 +0200, Julian Reschke wrote:
> quoted-pair is also used in comments. Are we ok with restricting the set
> here as well? And, if yes, shouldn't we then also adjust the allowed set
> for non-quoted characters in comments?

What? Restricting how? I thought we were talking about restricting the use of CTLs?

Now some further rambling on the use of quoted-pair and the difficulties this causes for parsers:

qdtext is for text within a quoted-string, and MUST NOT include '"' or '\'. Those two must be produced as quoted-pair to be used within a quoted-string.

    qdtext = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
           ; OWS / <VCHAR except DQUOTE and "\"> / obs-text

ctext is the same but for comment, and MUST NOT include '(', ')' or '\'. Those three must be produced as quoted-pair to be used within a comment.

    ctext = OWS / %x21-27 / %x2A-5B / %x5D-7E / obs-text
          ; OWS / <VCHAR except "(", ")", and "\"> / obs-text

Neither qdtext nor ctext allows for CTLs, except for HT or obsoleted CRLF folding (from OWS).

The specification (2616) is very strict on where quoted-pair is allowed to be used, but at the same time it is very subtle about where those areas are, creating a large grey area where parsing is somewhat non-obvious.

It's the same question as has been raised earlier regarding comments. A construct looking like a comment is only a comment if the header in question is defined to allow comments; if not, it's literally part of the header value.

Quoted-string is also only quoted-string if the header in question is defined to accept quoted-string; if not, it may be a literal part of the header value even if it looks like a quoted-string (for a header defined as taking *TEXT as value; 2616 has no such headers, however).

RFC2616 BNF and relevant comments:

    generic-message = start-line
                      *(message-header CRLF)
                      CRLF
                      [ message-body ]
    message-header  = field-name ":" [ field-value ]
    field-name      = token
    field-value     = *( field-content | LWS )
    field-content   = <the OCTETs making up the field-value
                      and consisting of either *TEXT or combinations
                      of token, separators, and quoted-string>

    TEXT            = <any OCTET except CTLs,
                      but including LWS>

    A CRLF is allowed in the definition of TEXT only as part of a header
    field continuation.

    Comments can be included in some HTTP header fields by surrounding
    the comment text with parentheses. Comments are only allowed in
    fields containing "comment" as part of their field value definition.
    In all other fields, parentheses are considered part of the field
    value.

    comment         = "(" *( ctext | quoted-pair | comment ) ")"
    ctext           = <any TEXT excluding "(" and ")">

The allowable characters in *TEXT overlap completely with those of token, separators and quoted-string, except that *TEXT does not allow CTLs other than LWS (HT), and within *TEXT the '\' character has no special meaning.

This means that to properly parse '\' quoted constructs one must know in detail every header processed, in order to know whether the '\' is quoting the next character or is just a literal '\'.

Because of this it's important that the overall message parsing is the same regardless of whether quoted-pair is processed or not, only producing slightly different results in the raw header value. Or, put in other words, it needs to be possible to completely defer quoting and comment processing until the header value as such is examined in detail, with general message parsing using *TEXT for all header values. And for chunk headers, *TEXT minus folding for the general message format, only needing to dive into quoting etc. when eventually processing the chunk extension values (if at all).

Regarding the allowable characters, there is imho absolutely no need to allow control characters anywhere in HTTP headers or chunk headers, quoted or not, and it's additionally very, very likely that many parsers will fail on such constructs, making them quite non-interoperable.

Additionally, if the allowed set of quoted characters is restricted to exclude \x00, NL and CR, as already done in HTTPbis, then it becomes very questionable from a technical point of view (ignoring parsing) to allow the use of other CTLs in quoted form. The use of CTLs in header values is very limited to begin with, basically only needed to support transmission of (non-UTF8) multibyte character sets or binary non-text data, in which case having those three excluded is already a significant issue for such use.

So imho quoted-pair should be

    quoted-text = %x09 / %x20-%x7E / obs-text
                ; WSP / VCHAR / obs-text
    quoted-pair = "\" qchar

to match the use of *TEXT in 2616, making comments and quoted strings all fit within *TEXT, as those constructs are only used in detailed forms which should be a subset of the more generic *TEXT.

This reasoning is also consistent with the current field-content definition using VTEXT etc.:

    field-value   = *( field-content / OWS )
    field-content = *( WSP / VCHAR / obs-text )

This field-content definition DOES NOT allow for CTLs other than HT. Allowing quoted-pair to include CTLs other than HT is incompatible with the above (from latest p1) definition of field-content.

If you look closely you'll notice the quoted-text and field-content definitions above are equal. Perhaps a common term should be defined for that, similar to the *TEXT element used in 2616. There are probably more places where using said term would make sense. And sorry, no, I do not have a good suggested BNF name for this construct. TEXT would be confusing with 2616, and text in lower case is too generic to be used in describing text. general-text?

Regards
Henrik
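
The point about deferring quote processing can be sketched as a first, header-agnostic validation pass over the raw value. The code below is a rough illustration only: the function name is hypothetical, and it assumes the field-content rule quoted above (WSP / VCHAR / obs-text), with folding handled elsewhere.

```python
import re

# field-content = *( WSP / VCHAR / obs-text ), as quoted from the p1 draft above.
FIELD_CONTENT = re.compile(rb"[\t\x20-\x7e\x80-\xff]*")


def is_field_content(value: bytes) -> bool:
    """Generic first pass: no knowledge of quoting or comments is needed."""
    return FIELD_CONTENT.fullmatch(value) is not None


# The backslash is just another octet at this stage; whether it quotes the
# next character is decided later, per header field.
print(is_field_content(b'attachment; filename="a\\"b"'))

# An escaped control character can never pass the generic check, which is
# the incompatibility pointed out above.
print(is_field_content(b'"a\\\x01b"'))
```

If quoted-pair were allowed to carry CTLs, a value could be valid under the detailed per-header grammar yet rejected by this generic pass, so the two layers would disagree.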
|
|
Re: #173: CR and LF in chunk extension values

Henrik Nordstrom wrote:
> tis 2009-08-25 klockan 09:11 +0200 skrev Julian Reschke:
>
>> quoted-pair is also used in comments. Are we ok with restricting the set
>> here as well? And, if yes, shouldn't we then also adjust the allowed set
>> for non-quoted characters in comments?
>
> What? Restricting how? I thought we were talking about restricting the
> use of CTLs?

Yes. I wanted to confirm that we do that for quoted-strings *and* comments. Do we?

> [...]
> Neither of qdtext or ctext allows for CTLs, except for HT or obsoleted
> CRLF folding (from OWS).

Yes. But quoted-string and comment allow quoted-pair, which currently does allow CTLs.

> [...]
> Which means that to properly parse '\' quoted constructs one must know
> in detail every header processed in order to know if the '\' is quoting
> the next character or if it's just a literal '\'.

Yes.

> [...]
> Regarding the allowable characters there imho is absolutely no need to
> allow for control characters anywhere in HTTP headers or chunk headers,
> quoted or not, and it's additionally very very likely many parsers will
> fail on such constructs making them quite non-interoperable.

Agreed.

> And additionally if restricting the allowed set of quoted characters to
> exclude \x00, NL and CR as already done in HTTPbis then it becomes very
> questionable from a technical point of view (ignoring parsing) to allow
> the use of other CTLs in quoted form. [...]

Yes.

> So imho quoted-pair should be
>
>   quoted-text = %x09 / %x20-%x7E / obs-text
>               ; WSP / VCHAR / obs-text
>   quoted-pair = "\" qchar
>
> to match the use of *TEXT in 2616, making comments and quoted strings
> all fit within *TEXT as those constructs is only used in detailed forms
> which should be a subset of the more generic *TEXT.

"qchar" being...?

> [...]
> If you look closely you'll notice the quoted-text and field-contents
> definitions above are equal. Perhaps a common term should be defined for
> that similar to the *TEXT element used in 2616. There is probably more
> places where using said term would make sense. And sorry, no I do not
> have a good suggested BNF name for this construct.. TEXT would be
> confusing with 2616 and text in lower case too generic to be used in
> describing text. general-text?
> ...

"characters"?

Anyway, my takeaway from your analysis is: "yes, CTLs need to be disallowed both in comments and quoted-text", right?

BR, Julian
|
|
Re: #173: CR and LF in chunk extension values

Julian Reschke wrote:
> ...

OK, so my understanding is that we disallow all control characters except HTAB in comment and quoted-string, escaped or not.

Proposed patch: <http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/173/173.diff>.

Relevant changes in Part 1:

-- snip --
   A string of text is parsed as a single word if it is quoted using
   double-quote marks.

     quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
     qdtext        = WSP / %x21 / %x23-5B / %x5D-7E / obs-text
                   ; WSP / <VCHAR except DQUOTE and "\"> / obs-text
     obs-text      = %x80-FF

   The backslash character ("\") can be used as a single-character
   quoting mechanism only within quoted-string and comment constructs:

     quoted-pair   = "\" ( WSP / VCHAR / obs-text )

   Note that quoted-pair includes those characters otherwise disallowed
   in quoted-string or comment (Section 3.2).

   ...

   Comments can be included in some HTTP header fields by surrounding
   the comment text with parentheses. Comments are only allowed in
   fields containing "comment" as part of their field value definition.

     comment       = "(" *( ctext / quoted-pair / comment ) ")"
     ctext         = WSP / %x21-27 / %x2A-5B / %x5D-7E / obs-text
                   ; WSP / <VCHAR except "(", ")", and "\"> / obs-text

   ...

   Rules about implicit linear whitespace between certain grammar
   productions have been removed; now it's only allowed when
   specifically pointed out in the ABNF. Control characters other than
   HTAB are no longer allowed in comment and quoted-string text (escaped
   or not). Non-ASCII content in header fields and reason phrase has
   been obsoleted and made opaque (the TEXT rule was removed)
   (Section 1.2.2)
-- snip --

Feedback appreciated, Julian
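
For readers who want to see the patched grammar in action, the quoted-string rules from the snippet above translate fairly directly into a regular expression. This is a minimal sketch under that assumption; the names `QUOTED_STRING` and `unquote` are made up for illustration and do not come from the patch.

```python
import re

# qdtext      = WSP / %x21 / %x23-5B / %x5D-7E / obs-text
QDTEXT = rb"[\t \x21\x23-\x5b\x5d-\x7e\x80-\xff]"
# quoted-pair = "\" ( WSP / VCHAR / obs-text )
QUOTED_PAIR = rb"\\[\t \x21-\x7e\x80-\xff]"
# quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
QUOTED_STRING = re.compile(rb'"((?:' + QDTEXT + rb"|" + QUOTED_PAIR + rb')*)"')


def unquote(quoted: bytes) -> bytes:
    """Validate a quoted-string and return its content with escapes removed."""
    match = QUOTED_STRING.fullmatch(quoted)
    if match is None:
        raise ValueError("not a valid quoted-string")
    # Each quoted-pair "\" X stands for the single octet X.
    return re.sub(rb"\\(.)", rb"\1", match.group(1), flags=re.DOTALL)


print(unquote(b'"say \\"hi\\""'))
```

With these rules a bare or escaped CR, LF or NUL makes the whole string invalid, while HTAB and all visible characters remain usable, escaped or not.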
|
|
Re: #173: CR and LF in chunk extension values

On Tue 2009-08-25 at 14:47 +0200, Julian Reschke wrote:
> > So imho quoted-pair should be
> >
> >   quoted-text = %x09 / %x20-%x7E / obs-text
> >               ; WSP / VCHAR / obs-text
> >   quoted-pair = "\" qchar
> >
> > to match the use of *TEXT in 2616, making comments and quoted strings
> > all fit within *TEXT as those constructs is only used in detailed forms
> > which should be a subset of the more generic *TEXT.
>
> "qchar" being...?

A typo. It should read:

    quoted-pair = "\" quoted-text

> > If you look closely you'll notice the quoted-text and field-contents
> > definitions above are equal. Perhaps a common term should be defined for
> > that similar to the *TEXT element used in 2616. There is probably more
> > places where using said term would make sense. And sorry, no I do not
> > have a good suggested BNF name for this construct.. TEXT would be
> > confusing with 2616 and text in lower case too generic to be used in
> > describing text. general-text?
> > ...
>
> "characters"?

Are WSP and obs-text characters? Other than that, no opinion either way.

> Anyway, my take away from your analysis is: "yes, CTLs need to be
> disallowed both in comments and quoted-text", right?

Yes. CTLs should be disallowed in quoted-pair except for those included in WSP (HT).

Regards
Henrik
|
|
Re: #173: CR and LF in chunk extension values

On Tue 2009-08-25 at 15:29 +0200, Julian Reschke wrote:
> OK, so my understanding is that we disallow all control characters
> except HTAB in comment and quoted-string, escaped or not.

Yes.

> Proposed patch:
> <http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/173/173.diff>.

> specifically pointed out in the ABNF. Control characters other than
> HTAB are no longer allowed in comment and quoted-string text (escaped
> or not).

Note: CRLF in the form of obs-fold is still allowed in both, just as it has always been. It's just quoting using '\' which has been restricted.

> Feedback appreciated,

Looks good to me.

Regards
Henrik
|
|
Re: #173: CR and LF in chunk extension values

Henrik Nordstrom wrote:
> Note: CRLF in the form of obs-fold is still allowed in both, just as it
> has always been. It's just quoting using '\' which has been restricted.
>
>> Feedback appreciated,
>
> Looks good to me.
> ...

OK, I have applied the change with <http://trac.tools.ietf.org/wg/httpbis/trac/changeset/686>.

BR, Julian
|
|
Re: #173: CR and LF in chunk extension values

On Thu 2009-08-27 at 11:53 +0200, Julian Reschke wrote:
> OK, I have applied the change with
> <http://trac.tools.ietf.org/wg/httpbis/trac/changeset/686>.

Looking again... and no, it's not entirely fine.

ctext and qdtext should not be changed from OWS to WSP. The change is only in quoted-text. We cannot disallow folding here.

Sorry for not seeing this earlier.

Regards
Henrik
|
|
Re: #173: CR and LF in chunk extension values

Henrik Nordstrom wrote:
> Looking again.. and no it's not entirely fine.
>
> ctext and qdtext should not be changed from OWS to WSP. The change is
> only in quoted-text. We can not disallow folding here.
>
> Sorry for not seeing this earlier.
> ...

It happens; thanks for checking anyway (and this was exactly the reason I wanted people to verify this change :-).

For now, I undid the change with <http://trac.tools.ietf.org/wg/httpbis/trac/changeset/687>.

It appears that we *do* have consensus for disallowing controls in quoted-pairs, thus for:

    quoted-pair = "\" ( WSP / VCHAR / obs-text )

However, if that's all that we do, we won't have addressed issue #173 after all.

Proposal:

- add a new issue for disallowing CTLs in quoted-pair
- address #173 by tuning the definition of chunk-ext-val

BR, Julian
|
|
Re: #173: CR and LF in chunk extension values

On Thu 2009-08-27 at 14:03 +0200, Julian Reschke wrote:
> It appears that we *do* have consensus for disallowing controls in
> quoted-pairs, thus for:
>
>   quoted-pair = "\" ( WSP / VCHAR / obs-text )

Yes.

> However, if that's all that we do we won't have addresses issue #173
> after all.

Indeed.

> Proposal:
>
> - add a new issue for disallowing CTLs in quoted-pair

Yes.

> - address #173 by tuning the definition of chunk-ext-val

Which means defining a new variant of quoted-string which does not allow folding, for use in chunk-ext-val:

    chunk-ext-val    = token / quoted-string-nf
    quoted-string-nf = DQUOTE *( qdtext-nf / quoted-pair ) DQUOTE
    qdtext-nf        = WSP / %x21 / %x23-5B / %x5D-7E / obs-text
                     ; WSP / <VCHAR except DQUOTE and "\"> / obs-text

assuming quoted-pair is fixed as discussed.

Perhaps it should also be noted in text that folding is explicitly forbidden in chunk headers.

Comments are thankfully not allowed in chunk extensions, from what I can tell.

Regards
Henrik
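
A validator for the proposed chunk-ext-val could look roughly like the sketch below. It assumes quoted-pair has already been restricted to WSP / VCHAR / obs-text as discussed, and it assumes the usual token character set (letters, digits and !#$%&'*+-.^_`|~), which is not spelled out in this thread. The function name is hypothetical.

```python
import re

# token: one or more token characters (assumed set, see the note above).
TOKEN = rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# qdtext-nf   = WSP / %x21 / %x23-5B / %x5D-7E / obs-text   (no folding)
QDTEXT_NF = rb"[\t \x21\x23-\x5b\x5d-\x7e\x80-\xff]"
# quoted-pair = "\" ( WSP / VCHAR / obs-text )
QUOTED_PAIR = rb"\\[\t \x21-\x7e\x80-\xff]"
# chunk-ext-val = token / quoted-string-nf
CHUNK_EXT_VAL = re.compile(
    rb"(?:" + TOKEN + rb'|"(?:' + QDTEXT_NF + rb"|" + QUOTED_PAIR + rb')*")'
)


def is_chunk_ext_val(value: bytes) -> bool:
    """True if value matches token or the folding-free quoted-string."""
    return CHUNK_EXT_VAL.fullmatch(value) is not None


print(is_chunk_ext_val(b"gzip"))
print(is_chunk_ext_val(b'"a quoted value"'))
print(is_chunk_ext_val(b'"folded\r\n value"'))
```

Because neither qdtext-nf nor quoted-pair can produce CR or LF, a chunk header that matches this grammar always fits on a single line, which is what line-oriented chunk parsers already assume.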
|
|
Re: #173: CR and LF in chunk extension values

On Aug 27, 2009, at 5:03 AM, Julian Reschke wrote:
> It appears that we *do* have consensus for disallowing controls in
> quoted-pairs, thus for:
>
>   quoted-pair = "\" ( WSP / VCHAR / obs-text )
>
> However, if that's all that we do we won't have addresses issue
> #173 after all.
>
> Proposal:
>
> - add a new issue for disallowing CTLs in quoted-pair

I suggest we make the issue "Disallow quoted-pair productions that are never used in practice nor needed for parsing", with the fix being

    quoted-pair = "\" ( "\" / DQUOTE / "(" / ")" )

....Roy
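
The narrower rule suggested above allows only the four escapes that a parser actually needs. As a small, hedged illustration (the pattern name is made up), it can be written as:

```python
import re

# quoted-pair = "\" ( "\" / DQUOTE / "(" / ")" )
QUOTED_PAIR_MINIMAL = re.compile(rb'\\[\\"()]')

for candidate in (b"\\\\", b'\\"', b"\\(", b"\\)", b"\\a", b"\\\t"):
    print(candidate, QUOTED_PAIR_MINIMAL.fullmatch(candidate) is not None)
```

Compared with the WSP / VCHAR / obs-text version, this rejects escapes such as a backslash followed by a letter or a tab, which is a stricter change for senders that escape characters unnecessarily.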