relative uri

View: New views
2 Messages — Rating Filter:   Alert me  

relative uri

by François Delyon :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi all,

I have some remarks about the implementation of xmlBuildRelativeURI  
in uri.c (libxml 2.7.3)

1-critical bug
line 2371
        if ((ref->path[pos] == '.') && (ref->path[pos+1] == '/'))
             pos += 2;
        if ((*bptr == '.') && (bptr[1] == '/'))
             bptr += 2;
        else if ((*bptr == '/') && (ref->path[pos] != '/'))
            bptr++;
        while ((bptr[pos] == ref->path[pos]) && (bptr[pos] != 0))
            pos++;

        if (bptr[pos] == ref->path[pos]) {
            val = xmlStrdup(BAD_CAST "");
            goto done; /* (I can't imagine why anyone would do this) */
        }

the use of bptr[pos]  is irrelevant. I suspect that the first four  
lines have been added quickly to solve some bad case.
My opinion is that the four first lines are not very important (the  
same kind of problem may occur later in the path) and that  
xmlBuildRelativeURI should normalize the paths before processing in  
order to get human and efficient  relative uris.

2- bad answer.
if the basepath is "/a/b" and the path is "/a/c:d",  
xmlBuildRelativeURI return "c:d" which is not a relative uri. The  
relative uri is "./c:d"
This remark is perhaps not very useful for libxml2 since  
xmlBuildRelativeURI is called in very few places.

3- cosmetic problem
line 2392
        else if ((ref->path[ix] == 0) && (ix > 1) && (ref->path[ix - 1] ==  
'/'))
            ix -= 2;
basepath "/a/b" path "/a/c" gives "../a/c" where I expect "c"
and
basepath "/a/b" path "/a/" gives "../a/" where I expect "."

4-question
  xmlBuildRelativeURI  return things like xmlURIEscapeStr(uptr,  
BAD_CAST "/;&=+$,").
Must I understand that the path in a xmlURIPtr is not escaped?
I think that I missed something.

------

I am not familiar with the development of libxml, but I can suggest  
the following code to build a relative path.

François.




  /*
assume that basepath and path are normalized
return 0 only if malloc fails
*/
char* RelativePath(const char* basepath,const char* path){
     int pos = 0;
     int nbslash = 0;
     int len;
        const char *bptr, *uptr;char *vptr,*val=0;
        if(path==0 || *path==0)
                return RelativePath(basepath, "/");
     if (basepath == NULL || *basepath==0) {
                if (path != NULL) {
                        uptr = path;
                        if (*uptr == '/')
                        uptr++;
                        return  strdup(uptr);
                }else
                        return strdup("");
     }
     bptr = basepath;
     if (path == NULL || *path==0) {
                for (pos = 0; bptr[pos] != 0; pos++) {
                        if (bptr[pos] == '/')
                        nbslash++;
                }
                val = (char *) malloc (1 + 3 * nbslash);
                if (val == NULL)
                        return val;
                vptr = val;
                for (; nbslash>0; nbslash--) {
                        *vptr++ = '.';
                        *vptr++ = '.';
                        *vptr++ = '/';
                }
                *vptr=0;
                return val;
     }
     /*
      * Next we compare the two strings and find where they first differ
      */
        while ((bptr[pos] == path[pos]) && (bptr[pos] != 0))
            pos++;

        if (bptr[pos] == path[pos])
            return strdup("");

        if ((path[pos] == '/') && (pos > 0))
            pos--;
        for (; pos > 0; pos--) {
                if (path[pos] == '/'){
                        pos++;
                        break;
                }
        }
        uptr = &path[pos];
        /*
         * In base, count the number of '/' from the differing point
         */
        for (; bptr[pos] != 0; pos++) {
        if (bptr[pos] == '/')
                nbslash++;
        }
        len = strlen(uptr) + 1;

     if (nbslash == 0) {
                if (uptr != NULL && *uptr!=0){
                        /* "a:b" is not a relative uri, use "./a:b" instead */
                        for(bptr=uptr;*bptr;bptr++){
                                if(*bptr=='/')break;
                                if(*bptr==':'){
                                        len=strlen(uptr);
                                        val=(char*)malloc(len+3);
                                        if(!val)return 0;
                                        val[0]='.';val[1]='/';
                                        memcpy(val+2,uptr,len+1);
                                        return val;
                                }
                        }
                        return strdup (uptr);
                }else
                /* base="/a/b" path="/a/" ->"." */
                        return strdup (".");
     }

     /*
      * Allocate just enough space for the returned string -
      * length of the remainder of the URI, plus enough space
      * for the "../" groups, plus one for the terminator
      */
     val = (char *) malloc (len + 3 * nbslash);
     if (val == NULL)
                return val;
     vptr = val;
     for (; nbslash>0; nbslash--) {
                *vptr++ = '.';
                *vptr++ = '.';
                *vptr++ = '/';
     }
     /*
      * Finish up with the end of the URI
      */
     if (uptr != NULL) memcpy (vptr, uptr, len);
        vptr[len - 1] = 0;
     return val;
}

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@...
http://mail.gnome.org/mailman/listinfo/xml

Re: relative uri

by Daniel Veillard :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Oct 20, 2009 at 04:51:58PM +0200, François Delyon wrote:

> Hi all,
>
> I have some remarks about the implementation of xmlBuildRelativeURI  in
> uri.c (libxml 2.7.3)
>
> 1-critical bug
> line 2371
> if ((ref->path[pos] == '.') && (ref->path[pos+1] == '/'))
>             pos += 2;
> if ((*bptr == '.') && (bptr[1] == '/'))
>             bptr += 2;
> else if ((*bptr == '/') && (ref->path[pos] != '/'))
>    bptr++;
> while ((bptr[pos] == ref->path[pos]) && (bptr[pos] != 0))
>    pos++;
>
> if (bptr[pos] == ref->path[pos]) {
>    val = xmlStrdup(BAD_CAST "");
>    goto done; /* (I can't imagine why anyone would do this) */
> }
>
> the use of bptr[pos]  is irrelevant. I suspect that the first four lines
> have been added quickly to solve some bad case.

 Why ? Example, reason to your jugement ?

> My opinion is that the four first lines are not very important (the same
> kind of problem may occur later in the path) and that  

 that's basically looks like URI normalization see
   http://tools.ietf.org/html/rfc3986
section 5.2.4, though that code was done with preceeding RFC 2396 in
mind.

> xmlBuildRelativeURI should normalize the paths before processing in  
> order to get human and efficient  relative uris.

  yes that could be improved that way, using xmlNormalizeURIPath()

> 2- bad answer.
> if the basepath is "/a/b" and the path is "/a/c:d", xmlBuildRelativeURI
> return "c:d" which is not a relative uri. The relative uri is "./c:d"
> This remark is perhaps not very useful for libxml2 since  
> xmlBuildRelativeURI is called in very few places.

  Hum, right that's a special case if : is in the first segment of the
remaining path, but it sounds more like an escaping problem to me, the :
should be escaped to make sure it's not misinterpreted at parsing time
  "c%3Ad" rather than "./c:d"
one way is to extend the xmlURIEscapeStr at the end to also convert :

> 3- cosmetic problem
> line 2392
> else if ((ref->path[ix] == 0) && (ix > 1) && (ref->path[ix - 1] ==  
> '/'))
>    ix -= 2;
> basepath "/a/b" path "/a/c" gives "../a/c" where I expect "c"
> and
> basepath "/a/b" path "/a/" gives "../a/" where I expect "."

  hum, more fixes and tests would be nice, yes

> 4-question
>  xmlBuildRelativeURI  return things like xmlURIEscapeStr(uptr, BAD_CAST
> "/;&=+$,").
> Must I understand that the path in a xmlURIPtr is not escaped?
> I think that I missed something.

  the strings in xmlURI, i.e. after parsing are no more escaped.
but since we return a serialized URI then escaping need to be done
there.

> ------
>
> I am not familiar with the development of libxml, but I can suggest the
> following code to build a relative path.

  Can you please apply some of the above comments and provide a
contextual diff, (using diff -c or dff -u on the files before or after
the change, with git, a git diff on the command line will show you the
diff from the modified files) . Then provide the patch as an attachment
to make sure it won't me modified by mail transmission,

 thanks,

Daniel

--
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@...  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@...
http://mail.gnome.org/mailman/listinfo/xml