Column as a substring

View: New views
11 Messages — Rating Filter:   Alert me  

Column as a substring

by Unabashed :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello! Please, help me if you have a time for this. I have an sqlite database table:
CREATE TABLE lemma (
  id INTEGER PRIMARY KEY,
  base TEXT,
  preflex_id INTEGER,
  type_ancode TEXT,
  prefix_id INTEGER
)
In the "base" column I store a string which I need to compare with another string, the problem is that "base" is a substring. So, I want to select records from this table where "base" is a substring for some string. I understand, that it's a beginner question, but I know only how to solve inverse problem using "LIKE" :( Thank you for reading it!

Re: Column as a substring

by Jean-Christophe Deschamps :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


>Hello! Please, help me if you have a time for this. I have an sqlite
>database
>table:
>CREATE TABLE lemma (
>   id INTEGER PRIMARY KEY,
>   base TEXT,
>   preflex_id INTEGER,
>   type_ancode TEXT,
>   prefix_id INTEGER
>)
>In the "base" column I store a string which I need to compare with another
>string, the problem is that "base" is a substring. So, I want to select
>records from this table where "base" is a substring for some string. I
>understand, that it's a beginner question, but I know only how to solve
>inverse problem using "LIKE" :( Thank you for reading it!

Try ... LIKE !

select * from lemma where 'Whoever you love' like '%' || base || '%';

If a row contains base = 'Eve' it should turn up (just an example).



_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by stercor :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, 08 Nov 2009 15:06:24 +0100
Jean-Christophe Deschamps <jcd@...> wrote:

> select * from lemma where 'Whoever you love' like '%' || base || '%';

I first tried that with this SQL statement:

SELECT *
  FROM fm
 WHERE name LIKE '%Juiian%'
   AND info LIKE '%' || name || '%'
 ORDER by name;

returned 0 rows.

Second, substituting OR for AND:

SELECT *
  FROM fm
 WHERE name LIKE '%Juiian%'
    OR info LIKE '%' || name || '%'
 ORDER by name;

returned 220 rows.

Third, my oriiginal inquiry:

SELECT *
  FROM fm
 WHERE name LIKE '%Juiian%'
    OR info LIKE '%Julian%'
 ORDER by name;

returned the (correct) 6 rows.

What am I missing?  I'd really like for the second example to work like
the third.

Ted
_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by Jean-Christophe Deschamps :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


>Third, my oriiginal inquiry:
>
>SELECT *
>   FROM fm
>  WHERE name LIKE '%Juiian%'
>     OR info LIKE '%Julian%'
>  ORDER by name;
>
>returned the (correct) 6 rows.

BTW, if you or anyone else need a fuzzy compare function I have one
that can help.  It works with internally Unicode-unaccented versions of
string and pattern.  It also accept wildcards (but no heading '%').  It
returns a Damerau-Levenshtein distance, so it's O(n*m) and therefore
should only be used on _short_ strings (like names, cities).

This is part of an extension library, along with Unicode-unaccented
versions of LIKE and GLOB and a handful of similar Unicode strings
functions.  It doesn't use ICU but internal Unicode v5.1 tries.

It's still in a beta stage and I'd like to get it tested by people
using various languages to know how it behaves and also to surface bugs
and issues.

It was written for Windows and the collations function use one Windows
call.  For those that don't like that, it should compile easily without
(untested).

Drop me a note if you're willing to try it.



_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by Igor Tandetnik :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ted Rolle wrote:

> On Sun, 08 Nov 2009 15:06:24 +0100
> Jean-Christophe Deschamps <jcd@...> wrote:
>
>> select * from lemma where 'Whoever you love' like '%' || base || '%';
>
> I first tried that with this SQL statement:
>
> SELECT *
>  FROM fm
> WHERE name LIKE '%Juiian%'
>   AND info LIKE '%' || name || '%'
> ORDER by name;
>
> returned 0 rows.

Did you perhaps mean 'Julian' rathen than 'Juiian' ?

Igor Tandetnik


_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by stercor :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, 8 Nov 2009 10:31:05 -0500
"Igor Tandetnik" <itandetnik@...> wrote:

> > SELECT *
> >  FROM fm
> > WHERE name LIKE '%Juiian%'
> >   AND info LIKE '%' || name || '%'
> > ORDER by name;
> >
Thanks for seeing the oversight.
When I use

SELECT *
 FROM fm
WHERE name LIKE '%Julian%'
   OR info LIKE '%' || name || '%'
ORDER by name;

it returns 224 columns.

Perhaps I'm misunderstanding the concept.
It could be returning 5 columns with 'Julian'
AND all columns where column 'name' is LIKE column info.
That would be a whole lot more.

Ted
_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by P Kishor-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Sun, Nov 8, 2009 at 10:20 AM, Ted Rolle <stercor@...> wrote:

> On Sun, 8 Nov 2009 10:31:05 -0500
> "Igor Tandetnik" <itandetnik@...> wrote:
>
>> > SELECT *
>> >  FROM fm
>> > WHERE name LIKE '%Juiian%'
>> >   AND info LIKE '%' || name || '%'
>> > ORDER by name;
>> >
> Thanks for seeing the oversight.
> When I use
>
> SELECT *
>  FROM fm
> WHERE name LIKE '%Julian%'
>   OR info LIKE '%' || name || '%'
> ORDER by name;
>
> it returns 224 columns.
>
> Perhaps I'm misunderstanding the concept.
> It could be returning 5 columns with 'Julian'
> AND all columns where column 'name' is LIKE column info.
> That would be a whole lot more.
>

You are misunderstanding the concept. The boolean AND works
counter-intuitively from English... it actually makes the basket
smaller. AND returns the values for which both conditions on either
side of AND are true. What you are looking for is OR, which will
return a bigger basket.



> Ted
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@...
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
-----------------------------------------------------------------------
Assertions are politics; backing up assertions with evidence is science
=======================================================================
_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by Unabashed :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Jean-Christophe Deschamps wrote:
Try ... LIKE !
select * from lemma where 'Whoever you love' like '%' || base || '%';
If a row contains base = 'Eve' it should turn up (just an example).

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
The problem is that this query receives all data from the table. I don't know why.
p.s. data is in cp1251

Re: Column as a substring

by Unabashed :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I think that my problem is in using LIKE expression for non-ascii strings. Database encode is UTF-8. When table data in the "base" column (see my first message for structure) consists of english symbols (ascii) LIKE works correct, but when I'm trying to execute it on strings consists of UTF8 symbols non compatible with ascii it doesn't work.
For example, if i'll have column "col1"
"col1 LIKE 'asciistring'" and "col1 LIKE '%asciistring%'" will return correct data, but
"col1 LIKE 'nonasciistring'" receives all data from the table where length of data in col1 is similar to nonasciistring and
"col1 LIKE '%nonasciistring%'" receives all data from the table.
If someone knows how to solve it, please hrlp me.

Re: Column as a substring

by Igor Tandetnik :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Ted Rolle wrote:

> SELECT *
> FROM fm
> WHERE name LIKE '%Julian%'
>   OR info LIKE '%' || name || '%'
> ORDER by name;
>
> it returns 224 columns.
>
> Perhaps I'm misunderstanding the concept.
> It could be returning 5 columns with 'Julian'
> AND all columns where column 'name' is LIKE column info.
> That would be a whole lot more.

It returns all rows where either name LIKE '%Julian%', OR info LIKE '%' || name || '%'.

If you replace OR with AND, the query would return all rows where both name LIKE '%Julian%' AND info LIKE '%' || name || '%'. Naturally, this latter set is no larger than the former.

Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Column as a substring

by Jean-Christophe Deschamps :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


>I think that my problem is in using LIKE expression for non-ascii strings.
>Database encode is UTF-8. When table data in the "base" column (see my
>first
>message for structure) consists of english symbols (ascii) LIKE works
>correct, but when I'm trying to execute it on strings consists of UTF8
>symbols non compatible with ascii it doesn't work.

As far as I understand it (please correct if this is wrong), SQLite
will store/retrieve data "as supplied".  If you hand it ANSI strings,
you get them stored verbatim and can retrieve them verbatim.

But and that's a big BUT without pun, if you use any function acting on
data, just as LIKE, it reads UTF-* data (replace * by the encoding you
used for creating the database) and _replace badly formed UTF-*
characters by the Unicode u+FFFD (invalid codepoint) marker.  From then
on, the changed data no longer compares with the original string since
some characters are destroyed in the process.  I suppose indexing with
anything else than the default BINARY collation is likely to produce
erroneous results.

The solution is to build a true UTF-8 (or UTF-16) database.  The
simplest way could be (again correction is welcome) to rebuild the base
from the original data, once it is converted to valid UTF-* suitable
for input.  If your only data source now is the current base itself
(input data no more available) you could try to dump the base with the
command line utility and work from there.





_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users