But what happens if the last character written by document.write() is a CR?
The HTML parsing spec says that CR followed by LF is ignored but CR
followed by anything else is converted to LF. So if the last character
is CR, then the tokenizer can't process all characters up to the
insertion point because it needs to look ahead at the next character, right?
Firefox, Chrome and Safari all seem to do the right thing: wait for the
next character before tokenizing the CR. And I think this means that
the description of document.write() needs to be changed. (Opera, on the
other hand, just gets this wrong and emits a CR character.)
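
To make the lookahead problem concrete, here is a rough sketch in Python of a streaming newline normalizer that holds back a trailing CR until the next chunk arrives. The class name NewlineNormalizer and its feed()/close() methods are made up for illustration; this is not taken from the spec or from any browser's code, and it only models the CR/LF rule as described above.

```python
class NewlineNormalizer:
    """Hypothetical model of newline normalization across write() chunks."""

    def __init__(self):
        # True when the previous chunk ended in CR and we are still
        # waiting to see whether the next character is LF.
        self.pending_cr = False

    def feed(self, chunk: str) -> str:
        out = []
        for ch in chunk:
            if self.pending_cr:
                self.pending_cr = False
                # The held-back CR always becomes a single LF.
                out.append("\n")
                if ch == "\n":
                    # CR LF pair: the LF is absorbed into that one LF.
                    continue
            if ch == "\r":
                # Cannot decide yet; wait for the next character.
                self.pending_cr = True
            else:
                out.append(ch)
        return "".join(out)

    def close(self) -> str:
        # End of input: a dangling CR can only be a lone newline.
        if self.pending_cr:
            self.pending_cr = False
            return "\n"
        return ""


n = NewlineNormalizer()
first = n.feed("a\r")    # CR is held back, only "a" is released
second = n.feed("\nb")   # CR LF collapses to one LF, then "b"
```

The point is that feed("a\r") cannot release the CR, so "all characters up to the insertion point" have not really been processed when the call returns.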
Similarly, what should the tokenizer do if document.write() emits half
of a UTF-16 surrogate pair as the last character?