> I recently gave the mime-sniff draft a somewhat closer look,
> including these two paragraphs, which looked familiar:
>
> [[
> This document describes a mime sniffing algorithm that carefully
> balances the compatibility needs of browser vendors with the security
> constraints. The algorithm has been constructed with reference to
> mime sniffing algorithms present in popular Web browsers, an
> extensive database of Web content, and metrics collected from
> implementations deployed to a sizable number of Web users.
>
> Warning! It is imperative that the algorithm in this document be
> followed exactly. When a user agent uses different heuristics for
> content type detection than the server expects, security problems can
> occur. For example, if a server believes that the client will treat
> a contributed file as an image (and thus treat it as benign), but a
> Web browser believes the content to be HTML (and thus execute any
> scripts contained therein), the end user can be exposed to malicious
> content, making the user vulnerable to cookie theft attacks and other
> cross-site scripting attacks.
> ]]
> -- http://ietfreport.isoc.org/idref/draft-abarth-mime-sniff/
>
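The mismatch that the quoted warning describes can be made concrete with a toy sketch. This is not the algorithm from draft-abarth-mime-sniff; the byte signatures, the 512-byte window, and the function names server_check and client_sniff are all hypothetical, chosen only to show a server and a client disagreeing about the same bytes.

```python
# Toy illustration only: NOT the algorithm from draft-abarth-mime-sniff.
# All names, signatures, and the 512-byte window are assumptions.

GIF_MAGIC = (b"GIF87a", b"GIF89a")
HTML_HINTS = (b"<html", b"<script", b"<body")


def server_check(upload: bytes) -> str:
    """A hypothetical server that trusts the leading magic number."""
    if upload.startswith(GIF_MAGIC):
        return "image/gif"
    return "application/octet-stream"


def client_sniff(body: bytes, window: int = 512) -> str:
    """A hypothetical client that looks for HTML-like tags first."""
    head = body[:window].lower()
    if any(hint in head for hint in HTML_HINTS):
        return "text/html"
    if body.startswith(GIF_MAGIC):
        return "image/gif"
    return "application/octet-stream"


# A contributed file that starts like a GIF but carries a script.
upload = b"GIF89a<script>alert(document.cookie)</script>"

print(server_check(upload))  # image/gif
print(client_sniff(upload))  # text/html
```

The server accepts the upload as a benign image, while the client's different heuristic treats it as HTML and would run the script. That disagreement is the exposure the warning is about.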
> I had an uneasiness about them that I wasn't sure how to articulate,
> but then I just read this:
>
> -------- Forwarded Message --------
> http://lists.w3.org/Archives/Public/public-html/2009May/0524.html
>> From: Sam Ruby <rubys@...>
>> To: Anne van Kesteren <annevk@...>
>> Cc: Maciej Stachowiak <mjs@...>, Roy T. Fielding
>> <fielding@...>, Larry Masinter <masinter@...>, HTML WG
>> <public-html@...>
>> Subject: Re: HTML interpreter vs. HTML user agent
>> Date: Thu, 28 May 2009 09:41:36 -0400
> [...]
>> The actual observed behavior of user agents designed to (primarily)
>> process content of a certain media type (either in general, or in the
>> specific context) is to make every effort to parse the content
>> according
>> to those rules, and only if such rules fail to produce meaningful
>> results will they investigate alternatives.
>>
>> Browsers will first attempt to process content as HTML.
>> FeedReaders will first attempt to process content as a feed.
>> Media players will first attempt to process content as media.
>>
>> Browsers, when chasing an image tag, will make different assumptions
>> than when presented with a raw uri from the chrome.
>>
>> All are equally "right" or "wrong".
>>
>> None of this is meant to imply that the behavior that is being settled
>> upon by browser manufacturers isn't worth specifying or standardizing.
>>
>> - Sam Ruby
>
> Is there any reason to believe that the next sort of content
> to hit the web won't disrupt things much like Java .jar files
> and RSS/Atom feeds and mp3/wma media?
>
> I think it's worthwhile to update our finding on authoritative
> metadata* to acknowledge draft-abarth-mime-sniff and the practice
> it represents... but I'm struggling to figure out exactly
> what to say.
>
> * http://www.w3.org/2001/tag/doc/mime-respect-20060412
>
> It's pretty clear to me that people will take the shortest path
> to their target, and that usually doesn't involve editing
> the .htaccess file when they test their RSS file with their
> RSS readers. It's not until the RSS reader gets integrated
> into the web browser that the HTTP client's presumption
> that it's getting a feed goes away (and even then,
> not completely).
>
>
> --
> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> gpg D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
>
>