Question about KUrl::equals/cmp...

View: New views
7 Messages — Rating Filter:   Alert me  

Question about KUrl::equals/cmp...

by Bugzilla from adawit@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Is there a reason why KUrl::equals with its second parameter set to
KUrl::CompareWithoutTrailingSlash or KUrl::cmp's second parameter set to true
returns "http://www.kde.org/" and "http://www.kde.org" are not equal to each
other ?

Re: Question about KUrl::equals/cmp...

by Bugzilla from faure@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 04 November 2009, Dawit A. wrote:
> Is there a reason why KUrl::equals with its second parameter set to
> KUrl::CompareWithoutTrailingSlash or KUrl::cmp's second parameter set to
>  true returns "http://www.kde.org/" and "http://www.kde.org" are not equal
>  to each other ?
 
Ah, I think I know. Tricky: with http these urls are equal, while with e.g.
FTP they are not, since ftp://user@host redirects to ftp://user@host/home/user
(on a linux server), while ftp://user@host/ is the root dir.
Basically we treat "no path" as "please go to the default starting directory".
But I'm ok with an exception for HTTP, where indeed people expect these
two urls to be the same.
Please add a unit test if you fix it.

--
David Faure, faure@..., http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. Konqueror (http://www.konqueror.org).

Re: Question about KUrl::equals/cmp...

by Bugzilla from adawit@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 04 November 2009 10:34:25 David Faure wrote:

> On Wednesday 04 November 2009, Dawit A. wrote:
> > Is there a reason why KUrl::equals with its second parameter set to
> > KUrl::CompareWithoutTrailingSlash or KUrl::cmp's second parameter set to
> >  true returns "http://www.kde.org/" and "http://www.kde.org" are not
> > equal to each other ?
>
> Ah, I think I know. Tricky: with http these urls are equal, while with e.g.
> FTP they are not, since ftp://user@host redirects to
>  ftp://user@host/home/user (on a linux server), while ftp://user@host/ is
>  the root dir. Basically we treat "no path" as "please go to the default
>  starting directory". But I'm ok with an exception for HTTP, where indeed
>  people  expect these two urls to be the same.

Well I complely understand the different requirements of different protocols,
but why would one specificy the KUrl::CompareWithoutTrailingSlash flag to make
comparisons, like the ftp case you mentioned above, when the trailing slash is
important ? In other words, why use the KUrl::CompareWithoutTrailingSlash flag
in the first place in such circumstances ?

It does not seem the correct thing to do to make exceptions in the API for a
protocol specific need it already accomodates through existing options unless
of course I am completely missing something. Perhaps not breaking existing
applications that rely on the current behavior ?

Re: Question about KUrl::equals/cmp...

by Bugzilla from faure@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wednesday 04 November 2009, Dawit A. wrote:

> On Wednesday 04 November 2009 10:34:25 David Faure wrote:
> > On Wednesday 04 November 2009, Dawit A. wrote:
> > > Is there a reason why KUrl::equals with its second parameter set to
> > > KUrl::CompareWithoutTrailingSlash or KUrl::cmp's second parameter set
> > > to true returns "http://www.kde.org/" and "http://www.kde.org" are not
> > > equal to each other ?
> >
> > Ah, I think I know. Tricky: with http these urls are equal, while with
> > e.g. FTP they are not, since ftp://user@host redirects to
> >  ftp://user@host/home/user (on a linux server), while ftp://user@host/ is
> >  the root dir. Basically we treat "no path" as "please go to the default
> >  starting directory". But I'm ok with an exception for HTTP, where indeed
> >  people  expect these two urls to be the same.
>
> Well I complely understand the different requirements of different
>  protocols, but why would one specificy the
>  KUrl::CompareWithoutTrailingSlash flag to make comparisons, like the ftp
>  case you mentioned above, when the trailing slash is important ? In other
>  words, why use the KUrl::CompareWithoutTrailingSlash flag in the first
>  place in such circumstances ?

KDirLister doesn't know about the "no path special case". It simply wants to
know that, when it's told about /foo/bar/, and it knows about /foo/bar, they
are the same item. So KDirLister does all directory-url comparisons with
"CompareWithoutTrailingSlash". Well, technically it uses more often
adjustPath(removeTrailingSlash), but same idea: that one also doesn't remove
the single slash in case the path is just a '/'.

I gave FTP as an example above, now let me give "file" as an example.
Would you say that the local path "/" is equal to the local path ""? I would
say no. The first one is the root dir, the second one is an empty path (which
in the case of local files, doesn't represent any file or directory).

I think they are equivalent only for protocols that don't know anything about
directories, like http :-). (But in kde4 we moved away from making KUrl depend
on KProtocolInfo, so no way to write generic solutions like the previous
sentence.)

> It does not seem the correct thing to do to make exceptions in the API for
>  a protocol specific need it already accomodates through existing options
>  unless of course I am completely missing something. Perhaps not breaking
>  existing applications that rely on the current behavior ?

Well, it would not seem correct to have to make exceptions in KDirLister and
other places which compare paths often :-)

I'm a bit surprised that this comes up only now. What do you need this
comparison for, and how is the same problem handled in the current khtml/konq
code? (explicit check for empty path (on http) and making it "/", iirc,
somewhere?)

--
David Faure, faure@..., http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. Konqueror (http://www.konqueror.org).

Re: Question about KUrl::equals/cmp...

by Bugzilla from adawit@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thursday 05 November 2009 05:17:01 David Faure wrote:

> On Wednesday 04 November 2009, Dawit A. wrote:
> > On Wednesday 04 November 2009 10:34:25 David Faure wrote:
> > > On Wednesday 04 November 2009, Dawit A. wrote:
> > > > Is there a reason why KUrl::equals with its second parameter set to
> > > > KUrl::CompareWithoutTrailingSlash or KUrl::cmp's second parameter set
> > > > to true returns "http://www.kde.org/" and "http://www.kde.org" are
> > > > not equal to each other ?
> > >
> > > Ah, I think I know. Tricky: with http these urls are equal, while with
> > > e.g. FTP they are not, since ftp://user@host redirects to
> > >  ftp://user@host/home/user (on a linux server), while ftp://user@host/
> > > is the root dir. Basically we treat "no path" as "please go to the
> > > default starting directory". But I'm ok with an exception for HTTP,
> > > where indeed people  expect these two urls to be the same.
> >
> > Well I complely understand the different requirements of different
> >  protocols, but why would one specificy the
> >  KUrl::CompareWithoutTrailingSlash flag to make comparisons, like the ftp
> >  case you mentioned above, when the trailing slash is important ? In
> > other words, why use the KUrl::CompareWithoutTrailingSlash flag in the
> > first place in such circumstances ?
>
> KDirLister doesn't know about the "no path special case". It simply wants
>  to know that, when it's told about /foo/bar/, and it knows about /foo/bar,
>  they are the same item. So KDirLister does all directory-url comparisons
>  with "CompareWithoutTrailingSlash". Well, technically it uses more often
>  adjustPath(removeTrailingSlash), but same idea: that one also doesn't
>  remove the single slash in case the path is just a '/'.
>
> I gave FTP as an example above, now let me give "file" as an example.
> Would you say that the local path "/" is equal to the local path ""? I
>  would say no. The first one is the root dir, the second one is an empty
>  path (which in the case of local files, doesn't represent any file or
>  directory).
>
> I think they are equivalent only for protocols that don't know anything
>  about directories, like http :-). (But in kde4 we moved away from making
>  KUrl depend on KProtocolInfo, so no way to write generic solutions like
>  the previous sentence.)
>
> > It does not seem the correct thing to do to make exceptions in the API
> > for a protocol specific need it already accomodates through existing
> > options unless of course I am completely missing something. Perhaps not
> > breaking existing applications that rely on the current behavior ?
>
> Well, it would not seem correct to have to make exceptions in KDirLister
>  and other places which compare paths often :-)

Well that is the dilemma. Here is the problem in KUrl as things are right
now...

#1. If you use KUrl::equals and KUrl::cmp, the comparion will fail if the urls
differ from each other with only a single "/" even if you specify the
"KUrl::CompareWithoutTrailingSlash" option. In the least that should be
documented so that people using those function know what to expect.

#2. #1 is made even worse by the fact that convenience function urlcmp takes
the very same option and returns the very result you excpect which makes the
entire option inconsistent. Very very confusing unless you read the code and
discover the problem.
 
> I'm a bit surprised that this comes up only now. What do you need this
> comparison for, and how is the same problem handled in the current
>  khtml/konq code? (explicit check for empty path (on http) and making it
>  "/", iirc, somewhere?)

I was working on a patch for konqueror which I posted in kfm-devel. I worked
around the issue by using urlcmp.

See http://lists.kde.org/?l=kfm-devel&m=125734929920038&w=2...

I am sure I am not the only one that thought KUrl::CompareWithoutTrailingSlash
means exactly that without exception and that could easily introduce
unintended bugs if people make the assumption and did not happen to test for
the different cases like I did. For example, in kdelibs this is used in khtml
and kpac. Not sure those who wrote the code were aware of this exception...

Re: Question about KUrl::equals/cmp...

by Maksim Orlovich-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> I'm a bit surprised that this comes up only now. What do you need this
> comparison for, and how is the same problem handled in the current
> khtml/konq
> code? (explicit check for empty path (on http) and making it "/", iirc,
> somewhere?)

As Dawit mentioned, KHTML uses urlcmp. The use there is basically to
determine whether a jumped to URL is an anchor jump in the same page or
request to load an another one. Actually, I found the behavior wrt to the
/ different between Firefox and IE.

Though, I do recall seeing a bug report about a website where
http://foo/bar and http://foo/bar/ were different, and we screwed up
because we assumed they were the same. As such, I think the
KUrl::CompareWithoutTrailingSlash in Dawit's patch is incorrect.

... And that would be bug 166389 and
http://lists.kde.org/?l=kde-commits&m=122124631601890&w=2

http://lists.kde.org/?l=kde-commits&m=121498845711981&w=2 also seems
relevant to this discussion.

-Maks



Re: Question about KUrl::equals/cmp...

by Bugzilla from adawit@kde.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thursday 05 November 2009 13:39:47 Maksim Orlovich wrote:

> > I'm a bit surprised that this comes up only now. What do you need this
> > comparison for, and how is the same problem handled in the current
> > khtml/konq
> > code? (explicit check for empty path (on http) and making it "/", iirc,
> > somewhere?)
>
> As Dawit mentioned, KHTML uses urlcmp. The use there is basically to
> determine whether a jumped to URL is an anchor jump in the same page or
> request to load an another one. Actually, I found the behavior wrt to the
> / different between Firefox and IE.
>
> Though, I do recall seeing a bug report about a website where
> http://foo/bar and http://foo/bar/ were different, and we screwed up
> because we assumed they were the same. As such, I think the
> KUrl::CompareWithoutTrailingSlash in Dawit's patch is incorrect.
>
> ... And that would be bug 166389 and
> http://lists.kde.org/?l=kde-commits&m=122124631601890&w=2
>
> http://lists.kde.org/?l=kde-commits&m=121498845711981&w=2 also seems
> relevant to this discussion.

ahhhh... very interesting. I will remove that option from the patch then. It
is okay to have duplicate items in history list to make sure we avoid
regressions of bug 166389.

The issue with KUrl however remains because the option name is confusing to
say the least and that same option produces different results in urlcmp.

Dawit a.