Re: sed strips CRs
On Mon, Feb 13, 2012 at 9:12 AM, Eric Blake wrote:
>
> Personally, I think it is a bug that upstream sed is using 't' in
> fopen() in the first place. Linux does NOT have an 'rt' mode for a
> reason: 't' is non-standard. On cygwin, the preference used in
> coreutils is that you get text mode by using 'r' and binary mode by
> using 'rb', on the mount points where text mode matters; you should
> almost never use 'rt' which forces text mode even on binary mounts.
> That is, sed should be just fine using 'r' instead of 'rt', and it would
> fix the perceived broken behavior on cygwin binary mounts.
>
> But fixing this should be done upstream, and not in cygwin.

I've stayed away from voicing personal feelings. While modifying
upstream certainly would resolve the issue of CRLF being read in
"text" mode, I, on the other hand, believe that Cygwin should open the
file descriptor in binary mode regardless. Note, though, that the
difference between normal processing mode in sed and sed -b is one of
line mode versus buffered mode, because you can't treat a binary data
file as text lines. Modifying upstream would destroy those systems that
require 'rt' to operate in text mode, and I don't mean Windows; I don't
know if any do.

--
Earnie
-- https://sites.google.com/site/earnieboyd
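
For readers following the 'r' versus 'rb' versus 'rt' distinction above, here is a minimal C sketch of the coreutils-style convention Eric describes. The helper names `input_mode` and `open_input` are hypothetical and are not taken from sed or coreutils; the sketch only shows the mode choice, assuming a caller that knows whether it wants binary input.

```c
#include <stdio.h>

/* Hypothetical helper, not sed's actual code: pick the fopen() mode as
   described in the quoted message.  "r" leaves the text/binary decision
   to the platform (on Cygwin, to the mount point); "rb" requests binary
   explicitly.  The non-standard "rt" is never used. */
const char *input_mode(int binary)
{
    return binary ? "rb" : "r";
}

/* Open an input file for reading; returns NULL on failure, like fopen(). */
FILE *open_input(const char *path, int binary)
{
    return fopen(path, input_mode(binary));
}
```

A caller would pass a non-zero `binary` argument for something like sed's -b option and zero otherwise. On Linux both modes behave the same, so the difference only shows up on platforms that distinguish text from binary streams.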

Re: sed strips CRs
On 02/13/2012 03:12 PM, Eric Blake wrote:
> But fixing this should be done upstream, and not in cygwin.

As long as it's consistent with coreutils I'll certainly do the change.

Paolo

Re: sed strips CRs
[Sent again. I missed all the CC's in my previous reply. Sorry!]

On Feb 13 15:37, Paolo Bonzini wrote:
> On 02/13/2012 03:12 PM, Eric Blake wrote:
> >But fixing this should be done upstream, and not in cygwin.
>
> As long as it's consistent with coreutils I'll certainly do the change.
>
> Paolo

Thanks! Would you mind to CC the cygwin list when the next upstream
sed release is available?

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

Re: sed strips CRs
On 02/13/2012 03:56 PM, Corinna Vinschen wrote:
> > As long as it's consistent with coreutils I'll certainly do the change.
>
> Thanks! Would you mind to CC the cygwin list when the next upstream
> sed release is available?

Sure, it should be real soon now since a new release has been long overdue.

By the way, I'm still opening the script file with "rt". I cannot
think of any case when you would want to keep CRs there.

Paolo

Re: sed strips CRs
On Feb 13 16:22, Paolo Bonzini wrote:
> On 02/13/2012 03:56 PM, Corinna Vinschen wrote:
> >> As long as it's consistent with coreutils I'll certainly do the change.
> >
> >Thanks! Would you mind to CC the cygwin list when the next upstream
> >sed release is available?
>
> Sure, it should be real soon now since a new release has been long overdue.
>
> By the way, I'm still opening the script file with "rt". I cannot
> think of any case when you would want to keep CRs there.

Indeed, that sounds like the right thing to do.

Thank you,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

Re: sed strips CRs
On Mon, Feb 13, 2012 at 10:22 AM, Paolo Bonzini <bonzini@...> wrote:
> On 02/13/2012 03:56 PM, Corinna Vinschen wrote:
>>
>> > As long as it's consistent with coreutils I'll certainly do the change.
>>
>> Thanks! Would you mind to CC the cygwin list when the next upstream
>> sed release is available?
>
> Sure, it should be real soon now since a new release has been long overdue.
>
> By the way, I'm still opening the script file with "rt". I cannot think of
> any case when you would want to keep CRs there.

The case of

sed -e 's/something/nothing/g' myfile > myfile2

as it works in Cygwin today would mean that in the case of the OP's
drive settings myfile2 would not contain the CR. Treating CR as white
space is the more proper thing to do, IMO.

--
Earnie
-- https://sites.google.com/site/earnieboyd

Re: sed strips CRs
On 02/13/2012 04:43 PM, Earnie Boyd wrote:
>> By the way, I'm still opening the script file with "rt". I cannot think of
>> any case when you would want to keep CRs there.
> The case of
>
> sed -e 's/something/nothing/g' myfile > myfile2
>
> as it works in Cygwin today would mean that in the case of the OP's
> drive settings myfile2 would not contain the CR. Treating CR as white
> space is the more proper thing to do, IMO.

myfile is not the script file. The script file is the one that you
pass to -f.

Using "rt" was introduced in both cases for Cygwin, so regressions on
other systems shouldn't be a problem.

Paolo

Re: sed strips CRs
Paolo Bonzini scripsit:
> By the way, I'm still opening the script file with "rt". I cannot think
> of any case when you would want to keep CRs there.

You wouldn't, but the point is that "rt" isn't defined on Posix systems.
If it happens to be the same as "r", good, but that isn't guaranteed.

And the only time "rt" does anything different from "r" on a Win32 system
is when you have:

1) linked your executable with the system-supplied 'binmode.obj' file

2) set the global variable _fmode to O_BINARY

3) invoked _set_fmode(O_BINARY)

all of which make "r" synonymous with "rb". Programs which don't do any
of these should use "r" rather than "rt", as it is guaranteed to do the
right thing for text on both Win32 and Posix systems.

--
John Cowan    cowan@...    http://www.ccil.org/~cowan
You annoy me, Rattray! You disgust me! You irritate me unspeakably!
Thank Heaven, I am a man of equable temper, or I should scarcely be able
to contain myself before your mocking visage. --Stalky imitating Macrea
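
As an illustration of option 3 in John's list, the following C sketch switches the default stream mode to binary. It assumes the Microsoft C runtime, where `_set_fmode` is declared in `<stdlib.h>` and the constant is spelled `_O_BINARY` in `<fcntl.h>`; the wrapper name `default_to_binary` is hypothetical. On other platforms the wrapper compiles to a no-op.

```c
#ifdef _WIN32
#include <stdlib.h>  /* _set_fmode() */
#include <fcntl.h>   /* _O_BINARY */
#endif

/* Hypothetical wrapper: make a plain "r" or "w" default to binary mode
   for later fopen() calls.  Returns 0 on success.  Outside the Microsoft
   C runtime this does nothing, since POSIX stdio makes no distinction
   between text and binary streams. */
int default_to_binary(void)
{
#ifdef _WIN32
    return (int)_set_fmode(_O_BINARY);
#else
    return 0;
#endif
}
```

After such a call, "r" behaves like "rb" on Win32, which is exactly the situation in which an explicit "rt" would start to differ from "r".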

Re: sed strips CRs
On 02/13/2012 08:42 PM, John Cowan wrote:
>> By the way, I'm still opening the script file with "rt". I cannot think
>> of any case when you would want to keep CRs there.
> You wouldn't, but the point is that "rt" isn't defined on Posix systems.
> If it happens to be the same as "r", good, but that isn't guaranteed.

Yes, I added a configure-time check too. I assume that if "rt" works,
it can be used instead of "r".

> And the only time "rt" does anything different from "r" on a Win32 system
> is when you have:
>
> 1) linked your executable with the system-supplied 'binmode.obj' file
>
> 2) set the global variable _fmode to O_BINARY
>
> 3) invoked _set_fmode(O_BINARY)
>
> all of which make "r" synonymous with "rb". Programs which don't do any
> of these should use "r" rather than "rt", as it is guaranteed to do the
> right thing for text on both Win32 and Posix systems.

No, "rt" also does something different than "r" on Cygwin with
binary-mounts.

If you meant that "rt" should be restricted to cygwin, that's also fine
by me but in general I prefer feature tests to OS tests.

Paolo
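
The kind of feature test Paolo mentions could look roughly like the C sketch below. This is not the check that was added to sed's configure script, which the thread does not show; the function name `fopen_mode_works` and the scratch-file approach are assumptions made for illustration. Because the C standard does not define "rt", a probe like this only reports what the build machine's library happens to do.

```c
#include <stdio.h>

/* Hypothetical probe: return 1 if fopen(path, mode) succeeds and a line
   written beforehand can be read back, 0 otherwise.  `path` names a
   scratch file that is created and then removed. */
int fopen_mode_works(const char *path, const char *mode)
{
    char buf[8];
    int ok = 0;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return 0;
    fputs("x\n", fp);
    if (fclose(fp) != 0) {
        remove(path);
        return 0;
    }

    /* Reopen with the mode under test and check the first character. */
    fp = fopen(path, mode);
    if (fp != NULL) {
        ok = (fgets(buf, sizeof buf, fp) != NULL && buf[0] == 'x');
        fclose(fp);
    }
    remove(path);
    return ok;
}
```

A build script could call `fopen_mode_works("conftest.tmp", "rt")` and fall back to "r" when the result is 0. The file name `conftest.tmp` is only an example.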

Re: sed strips CRs
On Mon, Feb 13, 2012 at 2:48 PM, Paolo Bonzini wrote:
>
> If you meant that "rt" should be restricted to cygwin, that's also fine by
> me but in general I prefer feature tests to OS tests.
>

Then it becomes Cygwin's problem. I'm going to quote from
http://msdn.microsoft.com/en-us/library/yeby3zcb.aspx

<quote>
t
Open in text (translated) mode. In this mode, CTRL+Z is interpreted as
an EOF character on input. In files that are opened for reading/writing
by using "a+", fopen checks for a CTRL+Z at the end of the file and
removes it, if possible. This is done because using fseek and ftell to
move within a file that ends with CTRL+Z may cause fseek to behave
incorrectly near the end of the file.

In text mode, carriage return–linefeed combinations are translated into
single linefeeds on input, and linefeed characters are translated to
carriage return–linefeed combinations on output. When a Unicode
stream-I/O function operates in text mode (the default), the source or
destination stream is assumed to be a sequence of multibyte characters.
Therefore, the Unicode stream-input functions convert multibyte
characters to wide characters (as if by a call to the mbtowc function).
For the same reason, the Unicode stream-output functions convert wide
characters to multibyte characters (as if by a call to the wctomb
function).
</quote>

So does Cygwin really want to specify "rt"? I would rather sed specify
"rb" and treat the CR as white space. I know that treating CR as white
space works well.

--
Earnie
-- https://sites.google.com/site/earnieboyd
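
One possible reading of "specify rb and treat the CR as white space" is sketched below in C: the file is opened in binary mode so no translation happens in the C library, and the caller drops the line terminator itself, whether it is LF or CRLF. The function name `chomp_crlf` is hypothetical, and this is not how sed handles its input; it is only meant to make the suggestion concrete.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove a trailing line terminator ("\n", "\r\n",
   or stray "\r" characters) from a line read from a stream opened with
   "rb".  Modifies the buffer in place and returns the new length. */
size_t chomp_crlf(char *line)
{
    size_t n = strlen(line);

    while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r'))
        line[--n] = '\0';
    return n;
}
```

With this approach a line such as "s/x/y/" followed by CRLF and the same line followed by a bare LF both reduce to the same six characters, regardless of how the mount point or the C runtime would have treated text mode.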