Hi, All!
I've got a database containing single table with a dozen of columns and some
indices like that:
CREATE TABLE T(X, Y, Z, ..., UNIQUE (X, Y, Z, ...) ON CONFLICT IGNORE);
CREATE INDEX IX ON T(X);
CREATE INDEX IY ON T(Y);
CREATE INDEX IZ ON T(Z);
.....
Data in the different columns are small positive integers, 32-bit integers
or few-bytes blobs. Neither of the X, Y, Z, ... are unique by themselves,
only their combination is unique (that's why I opt using relational database
to process them). My application treats this table as append-only, that is,
doing either INSERT or SELECT on it, without any UPDATEs. Few millions of
inserts are wrapped in a single transaction in order to reduce journal usage.
Total number of records in the table is more than 10^10, so it doesn't fit
in any RAM. PRAGMA CACHE_SIZE is adjusted to use most of the available memory.
Journal file (of enough size to hold all the pages dirtied by a biggest
transaction) is created in advance and PRAGMA JOURNAL_MODE is set to PERSIST
in order not to bother operating system with creating/deleting files. Page
size is matched to the underlying filesystem block size and to the stripe
size of RAID containing that filesystem. Sqlite version is now 3.7.3, but
it seems that exact version doesn't matter.
When trying to insert data already present in the table, performance is
fairly well, most of the CPU time is spent only on parsing SQL statements
and converting data to internal format, while database operation itself
(that is checking data against the constraint) is almost instantaneous,
which is quite impressive given the table size. But when the application
inserts new data, things change drastically: total throughput drops by a
2-3 orders of magnitude. CPU is staying almost idle, and all time is spent
waiting for disk _reading_ (rarely interspersed with fast and happy write
bursts on each COMMIT). What is sqlite reading there? Does it try to
perfectly balance each index on each insert (million times per
transaction) or something else?
Sincerely,
Valentin Davydov.
_______________________________________________
sqlite-users mailing list
sqlite-users@...
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users