AND search finished, request for review

View: New views
9 Messages — Rating Filter:   Alert me  

AND search finished, request for review

by Georg Rehfeld :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Bharat, Mindless, all,

finally I was able to complete the AND search capability for G2. Phew!

Please find the patch as text, compressed archive and zipped at:

   http://georg-rehfeld.de/test/and_search_patch.txt
   http://georg-rehfeld.de/test/and_search_patch.tgz
   http://georg-rehfeld.de/test/and_search_patch.zip

I ask for a review of the patch soon. I kept this change working since a
year, resolving all conflicts coming in in the meantime. This was due to
lack of time to finish the missing tests. You might imagine my pain not
beeing allowed to commit until the unit tests were finished, while the
feature itself worked well.

The unit tests are now in!

Please let me commit asap to get rid of these 'outgoing' changes!

+ What the AND search does

- it splits search input on white space except for quoted phrases, which
   are preserved including white space
- all entered strings or quoted phrases must be present in the searched
   field
- the search strings/phrases may be present anywhere in the field
- overlapping of strings is allowed and respected/highlighted in the
   search result pages

+ Samples

Assume an entry like the following in some searchable field:

   'The hottest girl in the german town Ascheffel shot a fox.'

If you wanna find items of a 'hot girl' (:-)) the existing G2 search
would not find this item. It only looks for the phrase 'hot girl', which
isn't present in the field.

The AND search finds that item. And it will highlight the found strings
correctly ('hot' in 2 places, 'girl' in 1 place).

Search for 'hot shot' and see the emphasis of the 'hot' inside the
'shot' in the search results.

Search for 'scheffel asche' and see the overlapping support in the word
'Ascheffel': the overlapping string 'sche' is again emphasized.

+ Implemetation hints

I replaced the javascript fix of bug 1422405 by a corresponding server
side solution with a new smarty modifier 'highlight'. The rest of my
changes are just straight forward.

Greetings, Georg/xlerb/deerwood
  ___   ___
| + | |__    Georg Rehfeld      Woltmanstr. 12     20097 Hamburg
|_|_\ |___   rehfeld@...       +49 (40) 23 53 27 10

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: AND search finished, request for review

by Georg Rehfeld :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Good morning,

I wrote:
 > Hi Bharat, Mindless, all,
 >
 > finally I was able to complete the AND search capability for G2. Phew!
 >
 > Please find the patch as text ... at:
 >
 >    http://georg-rehfeld.de/test/and_search_patch.txt

Nobody answered, I assume the patch is commonly accepted and of minor
interest. But the enhanced search is important for me, so I will commit
the changes in 2 days, unless someone complains.

 From the fact, that several people have downloaded the mentioned patch
from the (non public) location, I know, some have seen it and had a
chance to opt that out.

> Hi Bharat, Mindless, all,
>
> finally I was able to complete the AND search capability for G2. Phew!
>
> Please find the patch as text, compressed archive and zipped at:
>
>    http://georg-rehfeld.de/test/and_search_patch.txt
>    http://georg-rehfeld.de/test/and_search_patch.tgz
>    http://georg-rehfeld.de/test/and_search_patch.zip
>
> I ask for a review of the patch soon. I kept this change working since a
> year, resolving all conflicts coming in in the meantime. This was due to
> lack of time to finish the missing tests. You might imagine my pain not
> beeing allowed to commit until the unit tests were finished, while the
> feature itself worked well.
>
> The unit tests are now in!
>
> Please let me commit asap to get rid of these 'outgoing' changes!
>
> + What the AND search does
>
> - it splits search input on white space except for quoted phrases, which
>    are preserved including white space
> - all entered strings or quoted phrases must be present in the searched
>    field
> - the search strings/phrases may be present anywhere in the field
> - overlapping of strings is allowed and respected/highlighted in the
>    search result pages
>
> + Samples
>
> Assume an entry like the following in some searchable field:
>
>    'The hottest girl in the german town Ascheffel shot a fox.'
>
> If you wanna find items of a 'hot girl' (:-)) the existing G2 search
> would not find this item. It only looks for the phrase 'hot girl', which
> isn't present in the field.
>
> The AND search finds that item. And it will highlight the found strings
> correctly ('hot' in 2 places, 'girl' in 1 place).
>
> Search for 'hot shot' and see the emphasis of the 'hot' inside the
> 'shot' in the search results.
>
> Search for 'scheffel asche' and see the overlapping support in the word
> 'Ascheffel': the overlapping string 'sche' is again emphasized.
>
> + Implemetation hints
>
> I replaced the javascript fix of bug 1422405 by a corresponding server
> side solution with a new smarty modifier 'highlight'. The rest of my
> changes are just straight forward.

regards, Georg
  ___   ___
| + | |__    Georg Rehfeld      Woltmanstr. 12     20097 Hamburg
|_|_\ |___   rehfeld@...       +49 (40) 23 53 27 10

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Bharat Mediratta :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Georg Rehfeld wrote:

> Good morning,
>
> I wrote:
>  > Hi Bharat, Mindless, all,
>  >
>  > finally I was able to complete the AND search capability for G2. Phew!
>  >
>  > Please find the patch as text ... at:
>  >
>  >    http://georg-rehfeld.de/test/and_search_patch.txt
>
> Nobody answered, I assume the patch is commonly accepted and of minor
> interest. But the enhanced search is important for me, so I will commit
> the changes in 2 days, unless someone complains.

Please hold off on committing this until either I, Alan or Andy can
review it.  I've got a pretty big backlog, not sure about Alan and Andy
.. but my guess is that we're going to need a little time to get around
to it.  Regular reminders wouldn't go amiss :-)

-Bharat

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Georg Rehfeld :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi all,

- updated all DB tests to not expect case insensivity searches any more
- added some more tests

> Please find the patch as text, compressed archive and zipped at:
>
>    http://georg-rehfeld.de/test/and_search_patch.txt
>    http://georg-rehfeld.de/test/and_search_patch.tgz
>    http://georg-rehfeld.de/test/and_search_patch.zip

regards, Georg
  ___   ___
| + | |__    Georg Rehfeld      Woltmanstr. 12     20097 Hamburg
|_|_\ |___   rehfeld@...       +49 (40) 23 53 27 10

Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Georg Rehfeld :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Bharat, hi all,

Bharat Mediratta wrote:
>> But the enhanced search is important for me, so I will commit
>> the changes in 2 days, unless someone complains.
>
> Please hold off on committing this until either I, Alan or Andy can
> review it.

Sure! I wouldn't have committed 21 changed files anyway without your
confirmation. I was kidding (placing an emphasized reminder). But note:
I find it hard to contribute to G2.

>  I've got a pretty big backlog, not sure about Alan and Andy
> .. but my guess is that we're going to need a little time to get around
> to it.  Regular reminders wouldn't go amiss :-)

Alan some days ago said on IRC something like 'hey, no more jobs in my
G2 queue' :-) .


+ The Importance of Enhanced Search

The 'minor interest' might still hold for users/admins of G2 using it
for some few family/vacation photos. But aware admins/users can take a
lot more out of the enhanced search.

In my case the whole IPTC thingy was done specifically to make a
Gallery with thousands of items searchable. The use cases are:

- Andy (admin) gets 100 photos all shot at a specific (cyclist) race
   once a month.
- Andy selects 10...30 of them and prepares them in Photoshop (size,
   colors, quality).
- Andy adds important IPTC data actively with some application
   (e.g. Photoshop File Browser / Bridge)
   - Andy selects all photos and adds the IPTC title naming the
     race name, location and year/date.
   - Andy selects the 5 photos showing the race winner and adds the
     winners name to the IPTC/keywords.
   - Andy selects the X photos showing other interesting peoples and adds
     their name to the IPTC/keywords. He repeats that with different
     selections and names, until finished.
   - Andy selects a bunch of other photos having common attributes
     (pictures of the START/FINISH of the race, pictures taken in winter/
     rain / sun / a specific town, whatever) and adds appropriate
     keywords to the selection.
- Andy uploads/imports the 10..30 files into the web gallery.
- Andy expects all his manually entered information and the
   corresponding images to be findable.


- Victor (viewer/user) jumps in just looking at some pictures.
- Victor sees a picture of some interesting person.
- Victor recognizes the meta data (IPTC) naming the person shown.
- Victor tries a search on the persons name, partially or completely.
- Victor gets back a list of hits, attracting him to inspect further.
- Victor recognizes additional meta data (another person, race name
   etc.) and tries another search on that (say the race name).
- Victor gets success again, 10..30 pictures of the race, but showing
   many different persons.
- Victor combines his 2 search requests and ... voila ... gets the
   person of interest in the named race.

Note that the IPTC keywords are very similar to the popular tagging
mechanismns.

best regards, Georg
  ___   ___
| + | |__    Georg Rehfeld      Woltmanstr. 12     20097 Hamburg
|_|_\ |___   rehfeld@...       +49 (40) 23 53 27 10



Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Alan Harder :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

hi-
i did a quick first-pass review.
looks pretty good, though I didn't examine all the regexp and
array juggling in the SearchUtilities class in any detail.

http://tools.gallery2.org/pastebin/672

please post a new diff after going over this, and we can get
this into svn.
bharat, any comment on the SearchUtilities::doAndSearch
function?  it's currently hardcoded to true.

        - Alan


Georg Rehfeld wrote:

> Hi all,
>
> - updated all DB tests to not expect case insensivity searches any more
> - added some more tests
>
>> Please find the patch as text, compressed archive and zipped at:
>>
>>    http://georg-rehfeld.de/test/and_search_patch.txt
>>    http://georg-rehfeld.de/test/and_search_patch.tgz
>>    http://georg-rehfeld.de/test/and_search_patch.zip
>
> regards, Georg


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Bharat Mediratta :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


I did a cursory review of the CL and had some comments (not covering
stuff that Alan already covered-- I think):

* The point of having versioned interfaces is that you can add new
functionality while maintaining backwards compatibility.  If you're
adding a new method to SearchUtilities, this means that all modules who
use it are dependent on the latest search module (else PHP will choke on
the missing method when you call it in GalleryCommentSearch).  So the
way to do this would be to create GallerySearchInterface_1_1.class and
have modules switch to supporting that interface.  Older versions of the
search module won't call these modules because they'll no longer get
served up from the factory.  Newer versions of the search module *will*
call them and can do the and search directly.  If you want a module that
supports both G2.1 and G2.2 you can implement both interfaces.  This
gets rid of  SearchUtilities::doAndSearch and gives you broad compatibility.

* CoreSearchTest::testSearchAllSomeFieldsAnd (and others) shouldn't be
gated on a boolean value; it should find a way to test the code (moot
after you satisfy the above point, but still a good philosophy).

* Nice tests for malicious search queries!

* The last assertion in TestSmartyModifierHighlightWithXmlTags should be
broken up into several smaller ones that verify each individual feature
that you're checking for here. It looks like you've got a lot of
unnecessary extra overhead in there; but if you break it out into
smaller tests it should be obvious if there is extra value in having 8
needles.  This should make it clearer that your change doesn't reopen
bug 1422405 [1], which it looks like it doesn't.

* modules/search/HighlightResults.js can be deleted.  Mark it as R in
the MANIFEST file.


rest looks pretty good to me (modulo Alan's comments).
-Bharat

[1]:
http://sourceforge.net/tracker/index.php?func=detail&aid=1422405&group_id=7130&atid=107130

Alan Harder wrote:

> hi-
> i did a quick first-pass review.
> looks pretty good, though I didn't examine all the regexp and
> array juggling in the SearchUtilities class in any detail.
>
> http://tools.gallery2.org/pastebin/672
>
> please post a new diff after going over this, and we can get
> this into svn.
> bharat, any comment on the SearchUtilities::doAndSearch
> function?  it's currently hardcoded to true.
>
> - Alan
>
>
> Georg Rehfeld wrote:
>> Hi all,
>>
>> - updated all DB tests to not expect case insensivity searches any more
>> - added some more tests
>>
>>> Please find the patch as text, compressed archive and zipped at:
>>>
>>>    http://georg-rehfeld.de/test/and_search_patch.txt
>>>    http://georg-rehfeld.de/test/and_search_patch.tgz
>>>    http://georg-rehfeld.de/test/and_search_patch.zip
>> regards, Georg
>
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to share your
> opinions on IT & business topics through brief surveys -- and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> __[ g a l l e r y - d e v e l ]_________________________
>
> [ list info/archive --> http://gallery.sf.net/lists.php ]
> [ gallery info/FAQ/download --> http://gallery.sf.net ]


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Georg Rehfeld :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi all,

many thanks for the review! I will look into all of this and change
accordingly (maybe come back with some questions first) in the near
future. This and next week I can't (my mother died yesterday).

Bharat Mediratta wrote:
>
> I did a cursory review of the CL and had some comments (not covering
> stuff that Alan already covered-- I think):
>
> * The point of having versioned interfaces is that you can add new ...

Best regards, Georg
--
  ___   ___
| + | |__    Georg Rehfeld      Woltmanstr. 12     20097 Hamburg
|_|_\ |___   rehfeld@...       +49 (40) 23 53 27 10

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]

Re: [Gallery-core] AND search finished, request for review

by Andy Staudacher :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi Georg,

I'm going through a few old emails right now and found this thread about AND
search.
Have there been any developments since mindless reviewed your patch?

Boolean search is now #11 on the ladder of most requested features.
And I find myself pointing users to the sfvote page since they're requesting
better search options pretty often.

Thus I have a few questions:
1. How easy will it be to add OR search once your version of AND search has
been added?
2. How well does it perform?
3. Boolean search is often implemented with mysql's FULLTEXT index features.
Any opinion about that?

Thanks,
 - Andy

Georg Rehfeld wrote:

> Hi all,
>
> many thanks for the review! I will look into all of this and
> change accordingly (maybe come back with some questions
> first) in the near future. This and next week I can't (my
> mother died yesterday).
>
> Bharat Mediratta wrote:
> >
> > I did a cursory review of the CL and had some comments (not
> covering
> > stuff that Alan already covered-- I think):
> >
> > * The point of having versioned interfaces is that you can
> add new ...
>
> Best regards, Georg


-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
__[ g a l l e r y - d e v e l ]_________________________

[ list info/archive --> http://gallery.sf.net/lists.php ]
[ gallery info/FAQ/download --> http://gallery.sf.net ]