Windows drive letter check fails on lower case cwd

View: New views
5 Messages — Rating Filter:   Alert me  

Windows drive letter check fails on lower case cwd

by Bert Huijben (TCG) :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

[CC to subversion development list]

        Hi,

A few weeks ago a Subversion user reported that some of the subversion path tests failed for him. In particular our call to

apr_filepath_merge(&buffer, NULL, "C:hi", APR_FILEPATH_NOTRELATIVE, pool);

returned an error instead of an absolute path version of "hi" on drive C: for him.

List thread: http://svn.haxx.se/dev/archive-2009-10/0682.shtml


After some investigation I found that I can replicate this failure when I change the current directory to a path with a lower case drive letter.

E.g. by typing

C:\> cd c:\windows

I switch to "c:\windows" (and not C:\Windows as commonly expected).

In this specific case resolving the "C:hi" path style fails, because the check on line 543 of file_io/win32/filepath.c fails:

            if (addroot[0] == baseroot[0] && baseroot[1] == ':') {

As it assumes that the driveletters of the input and output paths always match.

After applying this simple patch:

Index: file_io/win32/filepath.c
===================================================================
--- file_io/win32/filepath.c    (revision 832725)
+++ file_io/win32/filepath.c    (working copy)
@@ -540,7 +540,8 @@
              * use the basepath _if_ it matches this drive letter!
              * Otherwise we must discard the basepath.
              */
-            if (addroot[0] == baseroot[0] && baseroot[1] == ':') {
+            if (apr_toupper(addroot[0]) == apr_toupper(baseroot[0])
+                && baseroot[1] == ':') {
 #endif
                 /* Base the result path on the basepath
                  */

The call will return me "c:/windows/hi" and no error.

(For Subversion the answers "c:/windows/hi" and "C:/windows/hi" are both valid as we normalize the resulting drive letter to upper case directly after reading it. Normalizing to upper case might be preferred for apr itself)

Proposed log message:
[[
Use case insensitive drive letter comparison when making "C:hi"-like paths
absolute on Windows
]]


I'm not sure where/how I can add a test for this specific test in the apr test suite as I can't find a relevant test to add it too. But I did add a test in the Subversion project repository in r40370, which directly shows this behavior. (Currently the test is marked as XFail).


In the next Subversion release, this call to apr_filepath_merge() will be far more common than in all our previous releases as we switch to absolute paths for most processing, so it would be nice if we can get this issue fixed upstream well before we get to releasing Subversion 1.7 (and its planned betas).


Subversion tells its users it is compatible with apr 0.9 and 1.X, so I would also like to propose it for backport to older release's...

Thanks,

        Bert Huijben


Parent Message unknown RE: Windows drive letter check fails on lower case cwd

by Bert Huijben-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



> -----Original Message-----
> From: Philip Martin [mailto:philip@...]
> Sent: zaterdag 7 november 2009 9:41
> To: Bert Huijben
> Cc: dev@...; dev@...
> Subject: Re: Windows drive letter check fails on lower case cwd
>
> Bert Huijben <rhuijben@...> writes:
>
> > Index: file_io/win32/filepath.c
> > ===================================================================
> > --- file_io/win32/filepath.c    (revision 832725)
> > +++ file_io/win32/filepath.c    (working copy)
> > @@ -540,7 +540,8 @@
> >               * use the basepath _if_ it matches this drive letter!
> >               * Otherwise we must discard the basepath.
> >               */
> > -            if (addroot[0] == baseroot[0] && baseroot[1] == ':') {
> > +            if (apr_toupper(addroot[0]) == apr_toupper(baseroot[0])
> > +                && baseroot[1] == ':') {
> >  #endif
> >                  /* Base the result path on the basepath
> >                   */
> >
> > The call will return me "c:/windows/hi" and no error.
> >
> > (For Subversion the answers "c:/windows/hi" and "C:/windows/hi" are
> both valid as we normalize the resulting drive letter to upper case
> directly after reading it. Normalizing to upper case might be preferred
> for apr itself)
> >
> > Proposed log message:
> > [[
> > Use case insensitive drive letter comparison when making "C:hi"-like
> paths
> > absolute on Windows
> > ]]
>
> How does Win32 handle Turkish?  On Linux apr_toupper is locale aware
> so in a Turkish locale (tr_TR) apr_toupper(i) != apr_toupper(I).  Do
> "i:/" and "I:/" refer to the same drive in Turkish?  Does apr_toupper
> have the same behaviour as drive letters?

The drive letters don't have locales; the rest of the paths have. There are
only 26 driveletters with the US-ASCII characters A-Z. (Internally always
represented by the upper case letters, but the current path can use a lower
case path, as that is only managed in userspace).

I think a wrapper which maps 'a'-'z' explicitly might be a bit better, to
resolve this specific issue..

Something like: (untested)

char drive_to_upper(char letter)
{
  if (letter >= 'a' && letter <= 'z')
    return letter - 'a' + 'A';
  else
    return letter;
}

But the case this patch resolves an issue that has been reported only once
over the last 9 years (at least for Subversion)... and the only place where
this apr_toupper()would fail is when this specific case happens.. combined
with this Turkish locale and the drive I:, which is very uncommon too...

Good theoretical bug ;-)

        Bert


Parent Message unknown Re: Windows drive letter check fails on lower case cwd

by Daniel Shahaf-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Philip Martin wrote on Sat, 7 Nov 2009 at 21:28 -0000:
> "Bert Huijben" <bert@...> writes:
>
> > Good theoretical bug ;-)
>
> It was a real Subversion bug. One of -rCOMMITTED or -rcommitted didn't
> work on Linux in tr_TR when we used a locale aware casecmp.

Subversion is probably not alone in assuming that 'I' = uppercase('i').
I'd be interested in hearing from someone who *uses* tr_TR how they
solve it in the general case (i.e., without patching each program
individually).

(Or perhaps we should have *three* separate character pairs:
 [Ii]-with-dot, [Ii]-without-dot, [Ii]-of-unspecified-dotfulness,
where the third is the ASCII one.)

Re: Windows drive letter check fails on lower case cwd

by William A. Rowe Jr. :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Bert Huijben wrote:
>
> The drive letters don't have locales; the rest of the paths have. There are
> only 26 driveletters with the US-ASCII characters A-Z. (Internally always
> represented by the upper case letters, but the current path can use a lower
> case path, as that is only managed in userspace).

That's an odd statement; all paths are Unicode ;-)  We don't strcasecmp paths.

> I think a wrapper which maps 'a'-'z' explicitly might be a bit better, to
> resolve this specific issue..
>
> Something like: (untested)
>
> char drive_to_upper(char letter)
> {
>   if (letter >= 'a' && letter <= 'z')
>     return letter - 'a' + 'A';
>   else
>     return letter;
> }

A possibility, it seems reasonable.

> But the case this patch resolves an issue that has been reported only once
> over the last 9 years (at least for Subversion)... and the only place where
> this apr_toupper()would fail is when this specific case happens.. combined
> with this Turkish locale and the drive I:, which is very uncommon too...
>
> Good theoretical bug ;-)

:)

RE: Windows drive letter check fails on lower case cwd

by Bert Huijben-5 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> -----Original Message-----
> From: William A. Rowe, Jr. [mailto:wrowe@...]
> Sent: maandag 9 november 2009 4:25
> To: Bert Huijben
> Cc: 'Philip Martin'; 'Bert Huijben'; dev@...;
> dev@...
> Subject: Re: Windows drive letter check fails on lower case cwd
>
> Bert Huijben wrote:
> >
> > The drive letters don't have locales; the rest of the paths have.
> There are
> > only 26 driveletters with the US-ASCII characters A-Z. (Internally
> always
> > represented by the upper case letters, but the current path can use a
> lower
> > case path, as that is only managed in userspace).
>
> That's an odd statement; all paths are Unicode ;-)  We don't strcasecmp
> paths.

s/path/drive in its path/ for that last path.. thanks :)

Paths are (of course) unicode, but use a some culture sensitive compare to handle the case insensitivity. But there is no documented way to find which locale it uses for each (part of a) drive.
(That information is stored in the system portion of NTFS at format time and can vary over directories via junctions, etc.)

Thanks for looking into this.

        Bert