SQLite3 problem during stress

SQLite3 problem during stress

by Christian Svensson-2

Hello.

We are trying to migrate from bind9 to PowerDNS.
We have successfully set up our back-end supermaster using PowerDNS + PostgreSQL 8.3 + the alsonotify patch (http://wiki.powerdns.com/trac/ticket/216).
The slaves have to have a very small footprint, so we have chosen sqlite3 as the target database.

After combating a weird incompatibility with Debian Lenny 5.0, where PowerDNS's gsqlite3 backend refused to write anything to the database, it began to crash when we did the initial transfer (a notify of several hundred domains).
I have tried the OpenDBX backend, which seems to work much better with regard to the sqlite3 incompatibility, but it also crashes under heavy transfer load.
Setting distributor-threads to 1 helps somewhat, but I think a different thread might be messing things up.

Log excerpt:
Jul 18 16:45:40 Reading random entropy from '/dev/urandom'
Jul 18 16:45:40  [OpendbxBackend] This is the opendbx module version 2.9.22 (Jul 18 2009, 14:19:38) reporting
Jul 18 16:45:40  [OpendbxBackend] This is the opendbx module version 2.9.22 (Jul 18 2009, 14:19:28) reporting
Jul 18 16:45:40 This is a standalone pdns
Jul 18 16:45:40 It is advised to bind to explicit addresses with the --local-address option
Jul 18 16:45:40 UDP server bound to 0.0.0.0:53
Jul 18 16:45:40 TCP server bound to 0.0.0.0:53
Jul 18 16:45:40 PowerDNS 2.9.22 (C) 2001-2009 PowerDNS.COM BV (Jul 18 2009, 14:23:39, gcc 4.3.2) starting up
Jul 18 16:45:40 PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Jul 18 16:45:40 Set effective group id to 104
Jul 18 16:45:40 Set effective user id to 102
Jul 18 16:45:40 DNS Proxy launched, local port 27636, remote 127.0.0.1:1053
Jul 18 16:45:40 Master/slave communicator launching
Jul 18 16:45:40 Launched webserver on 127.0.0.1:8081
Jul 18 16:45:40 Creating backend connection for TCP
% Jul 18 16:45:40 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:40 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:40 211 slave domains need checking
Jul 18 16:45:40 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:40 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:40 About to create 1 backend threads for UDP
Jul 18 16:45:40 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:40 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:40 Done launching threads, ready to distribute questions
Jul 18 16:45:47 Engaging bypass - now operating unthreaded
Jul 18 16:45:47 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:47 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:47 Received NOTIFY for stadsbudcentralen.nu from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Created new slave zone 'stadsbudcentralen.nu' from supermaster 213.132.111.158, queued axfr
Jul 18 16:45:47 Received NOTIFY for stadsbudcentralen.se from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Initiating transfer of 'stadsbudcentralen.nu' from remote '213.132.111.158'
Jul 18 16:45:47 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:47 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:47 Created new slave zone 'stadsbudcentralen.se' from supermaster 213.132.111.158, queued axfr
Jul 18 16:45:47 Received NOTIFY for westberg.info from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 AXFR started for 'stadsbudcentralen.nu', transaction started
Jul 18 16:45:47 AXFR done for 'stadsbudcentralen.nu', zone committed
Jul 18 16:45:47 Initiating transfer of 'stadsbudcentralen.se' from remote '213.132.111.158'
Jul 18 16:45:47 [OpendbxBackend] Database connection (read) to '/srv/pdns/' succeeded
Jul 18 16:45:47 Created new slave zone 'westberg.info' from supermaster 213.132.111.158, queued axfr
Jul 18 16:45:47 [OpendbxBackend] Database connection (write) to '/srv/pdns/' succeeded
Jul 18 16:45:47 Received NOTIFY for trevlig.nu from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host trevlig.nu for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for drproduction.com from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host drproduction.com for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for floopy.be from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host floopy.be for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for gronastuganikalmar.se from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host gronastuganikalmar.se for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for kigsrdr.org from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 [OpendbxBackend] getRecord: Unable to get next result - database is locked
Jul 18 16:45:47 Communicator thread died because of error: Error: odbx_result() failed

Jul 18 16:45:47 Unable to find backend willing to host kigsrdr.org for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for kalmargamecenter.com from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host kalmargamecenter.com for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for larvig.nu from 213.132.111.158 for which we are not authoritative
Jul 18 16:45:47 Unable to find backend willing to host larvig.nu for potential supermaster 213.132.111.158
Jul 18 16:45:47 Received NOTIFY for lyktan-vilshult.se from 213.132.111.158 for which we are not authoritative

This is an example where it happens quite early. Sometimes it gets through a couple of hundred domains before it crashes.
I have tried with both gsqlite3 and opendbx.

Note: I have only been able to test distributor-threads = 1 with opendbx. But I would assume that this option, and thus the error, is backend agnostic, in that it only requires the backend to handle one write/read connection.

Please advise.

Christian "BC" Svensson
Codelead Systems - http://www.codelead.se

_______________________________________________
Pdns-users mailing list
Pdns-users@...
http://mailman.powerdns.com/mailman/listinfo/pdns-users

Re: SQLite3 problem during stress

by bert hubert-2

A very quick reply without much thought: make sure PowerDNS has write
permissions on the *directory* containing the sqlite3 database. This
tends to trip people up; 'it can write to the file' is not enough.
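
The permission check described here can be sketched in a few lines of Python. This is only an illustration: the function name and the example path are hypothetical and not taken from the thread. It relies on the fact that SQLite creates its rollback journal next to the database file, so both the file and its directory need to be writable by the user PowerDNS runs as.

```python
import os

def check_sqlite_write_access(db_path):
    """Report whether the database file and its directory are writable.

    Run this as the user PowerDNS runs as: os.access() checks the
    real uid/gid of the calling process.
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    return {
        "file_writable": os.access(db_path, os.W_OK),
        # Creating the journal file needs write + search permission
        # on the directory itself.
        "dir_writable": os.access(directory, os.W_OK | os.X_OK),
    }

if __name__ == "__main__":
    # Hypothetical path, used for illustration only.
    print(check_sqlite_write_access("/srv/pdns/example.sqlite3"))
```

If "dir_writable" comes back False while "file_writable" is True, that would match the situation described above.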

I'll read your message more carefully to see what else might be happening.

    Bert


Re: SQLite3 problem during stress

by Christian Svensson-2

It is able to write to the database a couple of times, and I know it has permission both to write to the file and to create its journal file in the same directory. Heck, the directory and files are owned by pdns, so it can do whatever it desires.

Christian "BC" Svensson
Codelead Systems - http://www.codelead.se



Re: SQLite3 problem during stress

by Norbert Sendetzky

Hi Christian

> After combating some weird incompatibility with Debian Lenny 5.0 where
> PDNSs gsqlite3 refused to write anything to the database it begun to crash
> when we did the initial transfer (notify of several hundred domains).
> I have tried with the OpenDBX backend which seems to work much better
> regarding the sqlite3 incompatibility but also crashes during heavy
> transfer load.

SQLite is only useful for small installations with a few domains because it
locks the whole database, not just a table, when it writes changes. As each
zone transfer is a transaction that wants to write to the same database, no
two transfers can happen in parallel. You would be better off using a
different database for the slave DNS servers in your case. MySQL is the one
with the highest read performance, but you can use whatever fits your needs.
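
The "no two transfers in parallel" behaviour can be reproduced outside PowerDNS with a minimal Python sketch. It uses Python's standard sqlite3 module, not the gsqlite3 or OpenDBX backends, and the table layout and function name are made up for the example. Two connections each try to start a write transaction; with the wait time set to zero, the second one is refused immediately.

```python
import sqlite3

def second_writer_error(db_path):
    """Return the error text the second writer gets, or None if it got in."""
    # isolation_level=None: BEGIN/ROLLBACK are issued explicitly.
    # timeout=0: fail at once instead of waiting for the lock.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute(
            "CREATE TABLE IF NOT EXISTS records (name TEXT, content TEXT)"
        )
        # The first "zone transfer" takes the write lock.
        first.execute("BEGIN IMMEDIATE")
        first.execute("INSERT INTO records VALUES ('example.org', '192.0.2.1')")
        try:
            # A second "zone transfer" starts while the first is still open.
            second.execute("BEGIN IMMEDIATE")
        except sqlite3.OperationalError as exc:
            return str(exc)
        second.execute("ROLLBACK")
        return None
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()
```

The error text returned here is the same "database is locked" condition that shows up in the log excerpt earlier in the thread.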


Norbert
--
OpenPGP public key
http://www.linuxnetworks.de/norbert.pubkey.asc





Re: SQLite3 problem during stress

by bert hubert-2

On Sun, Jul 19, 2009 at 2:15 PM, Norbert Sendetzky<norbert@...> wrote:

> Hi Christian
>
>> After combating some weird incompatibility with Debian Lenny 5.0 where
>> PDNSs gsqlite3 refused to write anything to the database it begun to crash
>> when we did the initial transfer (notify of several hundred domains).
>> I have tried with the OpenDBX backend which seems to work much better
>> regarding the sqlite3 incompatibility but also crashes during heavy
>> transfer load.
>
> SQLite is only useful for small installations and a few domains because it
> uses table locks when it writes the changes to the database. As each zone

That may be so, but it should still not crash.

    Bert

Re: SQLite3 problem during stress

by Norbert Sendetzky

On Sun July 19 2009 19:21:56 bert hubert wrote:
> > SQLite is only useful for small installations and a few domains because
> > it uses table locks when it writes the changes to the database. As each
> > zone
>
> That may be so, but it should still not crash.

As I wrote to Bert in a private mail:
It doesn't crash, but the handling of timeouts is suboptimal. I will create a
patch that will improve this.


Norbert
--
OpenPGP public key
http://www.linuxnetworks.de/norbert.pubkey.asc





Re: SQLite3 problem during stress

by Christian Svensson-2

Yes, updating only one zone at a time is very much acceptable - it does not matter that the initial transfer takes quite a long time.

We do not want to use an "active" database due to its memory / CPU footprint. If PowerDNS locked the table and let the other threads just wait for the lock like any other mutex, it would probably be just fine with distributor-threads = 1.

I want to note that PowerDNS does indeed _crash_ / _exit_ in the end due to the "Database is locked" error. Since I'm currently testing with OpenDBX, I don't know if this is regarded as an OpenDBX or a PowerDNS bug - but gsqlite3 also crashed under heavy load, which (as stated before) makes me believe that it's backend agnostic.
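
The "wait for the lock like any other mutex" idea can be sketched in Python as follows. This is not how PowerDNS is implemented; it is a stand-alone illustration using the standard sqlite3 and threading modules, with hypothetical function and table names. Every writer takes one process-wide mutex before touching the database, so writers queue up instead of colliding on SQLite's own lock.

```python
import sqlite3
import threading

# One process-wide mutex shared by all writer threads.
write_lock = threading.Lock()

def store_zone(db_path, zone, names):
    """Write one zone's records; concurrent callers wait on the mutex."""
    with write_lock:
        # timeout=0 means SQLite itself never waits; the mutex does the queuing.
        conn = sqlite3.connect(db_path, timeout=0)
        try:
            with conn:  # commits on success, rolls back on error
                conn.executemany(
                    "INSERT INTO records (zone, name) VALUES (?, ?)",
                    [(zone, name) for name in names],
                )
        finally:
            conn.close()

def transfer_all(db_path, zones):
    """Start one thread per zone as a stand-in for parallel zone transfers."""
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE IF NOT EXISTS records (zone TEXT, name TEXT)")
    setup.commit()
    setup.close()
    threads = [
        threading.Thread(target=store_zone, args=(db_path, zone, names))
        for zone, names in zones.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Note that a mutex like this only serializes writers inside one process, and only if every code path that writes to the database goes through it.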

Greetings

Christian "BC" Svensson
Codelead Systems - http://www.codelead.se



Re: SQLite3 problem during stress

by bert hubert-2

On Mon, Jul 20, 2009 at 11:01 AM, Christian Svensson<blue@...> wrote:

> Yes, updating only one zone at the time is very much acceptable - that the
> initial transfer takes quite long time does not matter.
>
> We do not want to use an "active" database due to memory / CPU footprint. If
> PowerDNS locks the table and let the other threads just wait for the lock
> like any other mutex it would probably be just fine with distributor-threads
> = 1.
>
> I want to note that indeed PowerDNS _crashes_ / _exits_ in the end due to
> the "Database is locked" error. Since I'm currently testing with OpenDBX I
> don't know if this is something that is regarded as an OpenDBX or PowerDNS
> bug - but gsqlite3 did also crash during heavy load which (as stated before)
> makes me believe that it's backend agnostic.

PowerDNS does not crash under heavy load normally. Even with
distributor-threads=1, you may get multiple active backends. This may
explain the locking issues. Perhaps Norbert is right that our sqlite3
code needs to deal better with timeouts.

I currently have the entire .NET zone (28 million records) loaded in
sqlite3 and will stress-test it later this week.

Will let you know.

     Bert

Re: SQLite3 problem during stress

by Oli Schacher-5

We had this problem too when we moved all domains to pdns (hidden master
on mysql, multiple sqlite3 slaves). The crash happened when I
'notified' about 40 domains simultaneously for the initial transfer.

See http://mailman.powerdns.com/pipermail/pdns-users/2008-April/005287.html


Since then pdns has been running stably on all slaves (~350 domains, ~30
queries per second per slave).


Re: SQLite3 problem during stress

by Norbert Sendetzky

On Mon July 20 2009 11:01:05 you wrote:
> Yes, updating only one zone at the time is very much acceptable - that the
> initial transfer takes quite long time does not matter.
>
> We do not want to use an "active" database due to memory / CPU footprint.
> If PowerDNS locks the table and let the other threads just wait for the
> lock like any other mutex it would probably be just fine with
> distributor-threads = 1.

The problem with the SQLite database is that while a zone transfer is in
progress, other threads can be blocked from serving records. This is because
SQLite locks the complete database, not just one table, when records are
updated, and this also prevents reading records in other zones. Please keep
this in mind for your installation.
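
The effect on readers can be shown with a small Python sketch using the standard sqlite3 module. It assumes SQLite's default rollback-journal mode and uses a made-up table and function name. The writer is forced to hold an exclusive lock with BEGIN EXCLUSIVE; an ordinary write transaction only needs that lock for part of its lifetime (for example while committing), so this is the worst case rather than the typical one.

```python
import sqlite3

def reader_error_during_exclusive_write(db_path):
    """Return the error a reader gets while a writer holds an exclusive lock."""
    # timeout=0: fail at once instead of waiting for the lock.
    writer = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    reader = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        writer.execute(
            "CREATE TABLE IF NOT EXISTS records (name TEXT, content TEXT)"
        )
        # Stand-in for a zone transfer that currently holds the exclusive lock.
        writer.execute("BEGIN EXCLUSIVE")
        writer.execute("INSERT INTO records VALUES ('example.org', '192.0.2.1')")
        try:
            # A lookup for any zone needs a shared lock on the same file.
            reader.execute("SELECT count(*) FROM records").fetchone()
        except sqlite3.OperationalError as exc:
            return str(exc)
        return None
    finally:
        if writer.in_transaction:
            writer.execute("ROLLBACK")
        writer.close()
        reader.close()
```

The query is refused even though it does not touch the rows being written, because the lock covers the whole database file.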

> I want to note that indeed PowerDNS _crashes_ / _exits_ in the end due to
> the "Database is locked" error. Since I'm currently testing with OpenDBX I
> don't know if this is something that is regarded as an OpenDBX or PowerDNS
> bug - but gsqlite3 did also crash during heavy load which (as stated
> before) makes me believe that it's backend agnostic.

What you've shown me in your logs is that the PowerDNS opendbx backend gets an
error from the SQLite library saying that the database is locked, and the
opendbx backend forces the distributor thread to recreate the connection to
the database. The problem is that the sqlite3 backend of the OpenDBX library
returns a fatal error instead of a timeout when the database is locked. This
is fixed in the SVN trunk and the fix will also be included in OpenDBX
1.4.2. The next problem is that the PowerDNS opendbx backend doesn't make use
of timeouts at the moment and treats them as errors. I've already enhanced the
PowerDNS opendbx backend, but I still have to test it and see if the
problem is gone. I will create a patch and send it to you so you can test it
yourself.
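
The difference between treating a locked database as a fatal error and treating it as a timeout can be sketched in Python. This is not the OpenDBX or PowerDNS patch mentioned here, just an illustration of the general idea with the standard sqlite3 module; the helper name and its parameters are hypothetical.

```python
import sqlite3
import time

def execute_with_retry(conn, sql, params=(), attempts=5, delay=0.1):
    """Run a statement, retrying while SQLite reports the database as locked."""
    for attempt in range(attempts):
        try:
            return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            # Only a lock conflict is retryable. Anything else, or a lock
            # that persists through the last attempt, is raised to the caller.
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

A caller would use it in place of a direct conn.execute(...) call. Python's sqlite3.connect() also accepts a timeout argument that makes SQLite itself wait for the lock before giving up, which covers the simple cases without an explicit retry loop.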

What I haven't seen in your log file excerpt is that the opendbx backend
crashed with a segfault.


Norbert
--
OpenPGP public key
http://www.linuxnetworks.de/norbert.pubkey.asc





Re: SQLite3 problem during stress

by Christian Svensson-2

The crash itself is not a segfault, as far as I can see.

I test it by running it in "monitor" mode, and suddenly it just drops back to the shell.
It might indeed be solved by your latest patch - I would gladly test it.

Greetings,

Christian "BC" Svensson
Codelead Systems - http://www.codelead.se



Re: SQLite3 problem during stress

by Norbert Sendetzky

On Tue July 28 2009 16:33:28 you wrote:
> It might indeed be solved by your latest patch - I would gladly test it.

Sorry, I haven't been able to test the patch I've written in detail, but it
may already solve your problem. You must use it in combination with the latest
version of OpenDBX from the SVN trunk.


Norbert
--
OpenPGP public key
http://www.linuxnetworks.de/norbert.pubkey.asc



[opendbxbackend-timeout.diff]

Index: odbxbackend.hh
===================================================================
--- odbxbackend.hh (revision 1363)
+++ odbxbackend.hh (working copy)
@@ -56,6 +56,7 @@
  string m_myname;
  string m_qname;
  int m_default_ttl;
+ int m_timeout;
  bool m_qlog;
  odbx_t* m_handle[2];
  odbx_result_t* m_result;
@@ -116,6 +117,7 @@
  declare( suffix, "database", "Database name containing the DNS records","powerdns" );
  declare( suffix, "username","User for connecting to the DBMS","powerdns");
  declare( suffix, "password","Password for connecting to the DBMS","");
+ declare( suffix, "timeout","Timeout in seconds to wait for the database","5");
 
  declare( suffix, "sql-list", "AXFR query", "SELECT r.\"domain_id\", r.\"name\", r.\"type\", r.\"ttl\", r.\"prio\", r.\"content\" FROM \"records\" r WHERE r.\"domain_id\"=:id" );
 
Index: odbxbackend.cc
===================================================================
--- odbxbackend.cc (revision 1363)
+++ odbxbackend.cc (working copy)
@@ -45,6 +45,8 @@
 
  setArgPrefix( "opendbx" + suffix );
 
+ m_timeout = arg().asNum( "timeout" );
+
  if( getArg( "host" ).size() > 0 )
  {
  L.log( m_myname + " WARNING: Using depricated opendbx-host parameter", Logger::Warning );
@@ -194,7 +196,7 @@
  if( ( tmp = odbx_field_value( m_result, 2 ) ) != NULL )
  {
  sd.ttl = strtoul( tmp, NULL, 10 );
- }
+ }
 
  if( sd.serial == 0 && ( tmp = odbx_field_value( m_result, 1 ) ) != NULL )
  {
Index: odbxprivate.cc
===================================================================
--- odbxprivate.cc (revision 1363)
+++ odbxprivate.cc (working copy)
@@ -80,12 +80,16 @@
 bool OdbxBackend::getRecord( QueryType type )
 {
  int err = 3;
+ struct timeval timeout;
 
 
  DLOG( L.log( m_myname + " getRecord()", Logger::Debug ) );
 
  do
  {
+ timeout.tv_sec = m_timeout;
+ timeout.tv_usec = 0;
+
  if( err < 0 )
  {
  L.log( m_myname + " getRecord: Unable to get next result - " + string( odbx_error( m_handle[type], err ) ),  Logger::Error );
@@ -94,46 +98,54 @@
 
  if( m_result != NULL )
  {
- if( err == 3 )
+ switch( err )
  {
- if( ( err = odbx_row_fetch( m_result ) ) < 0 )
- {
- L.log( m_myname + " getRecord: Unable to get next row - " + string( odbx_error( m_handle[type], err ) ),  Logger::Error );
- throw( AhuException( "Error: odbx_row_fetch() failed" ) );
- }
+ case 1: // Timeout
 
- if( err > 0 )
- {
-#ifdef VERBOSELOG
- unsigned int i;
- string fields;
+ L.log( m_myname + " getRecord: Timeout reached for retrieving '" + m_qname + "'",  Logger::Error );
+ break;
 
- for( i = 0; i < odbx_column_count( m_result ); i++ )
+ case 3: // Rows available
+
+ if( ( err = odbx_row_fetch( m_result ) ) < ODBX_ROW_DONE )
  {
- fields += string( odbx_column_name( m_result, i ) );
+ L.log( m_myname + " getRecord: Unable to get next row - " + string( odbx_error( m_handle[type], err ) ),  Logger::Error );
+ throw( AhuException( "Error: odbx_row_fetch() failed" ) );
+ }
 
- if( odbx_field_value( m_result, i ) != NULL )
+ if( err > ODBX_ROW_DONE )
+ {
+#ifdef VERBOSELOG
+ unsigned int i;
+ string fields;
+
+ for( i = 0; i < odbx_column_count( m_result ); i++ )
  {
- fields += "=" + string( odbx_field_value( m_result, i ) ) + ", ";
+ fields += string( odbx_column_name( m_result, i ) );
+
+ if( odbx_field_value( m_result, i ) != NULL )
+ {
+ fields += "=" + string( odbx_field_value( m_result, i ) ) + ", ";
+ }
+ else
+ {
+ fields += "=NULL, ";
+ }
  }
- else
- {
- fields += "=NULL, ";
- }
- }
 
- L.log( m_myname + " Values: " + fields,  Logger::Error );
+ L.log( m_myname + " Values: " + fields,  Logger::Error );
 #endif
- return true;
- }
+ return true;
+ }
 
+ break;
  }
 
  odbx_result_free( m_result );
  m_result = NULL;
  }
  }
- while( ( err =  odbx_result( m_handle[type], &m_result, NULL, 0 ) ) != 0 );
+ while( ( err =  odbx_result( m_handle[type], &m_result, &timeout, 0 ) ) != ODBX_RES_DONE );
 
  m_result = NULL;
  return false;
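

For readers following the loop in the patch, the way it reacts to each return value can be sketched as a small stand-alone function. This is a rough illustration, not part of the patch. The `RES_*` constants and the `Action` enum are local stand-ins invented here. The values 1 (timeout) and 3 (rows available) come from the comments in the patch, 0 is what the loop compares against as "done", and the value 2 for "no rows" is an assumption.

```cpp
#include <cassert>

// Local stand-ins for the result codes the patched loop switches on.
// 0, 1 and 3 follow the patch; 2 ("no rows") is an assumption.
enum ResultCode { RES_DONE = 0, RES_TIMEOUT = 1, RES_NOROWS = 2, RES_ROWS = 3 };

// What the loop does next for a given return value (hypothetical names).
enum class Action { Finish, Fail, RetryAfterTimeout, FetchRows, DiscardResult };

Action next_action(int err) {
    if (err < 0) {
        return Action::Fail;  // negative codes are errors: log and throw
    }
    switch (err) {
        case RES_DONE:    return Action::Finish;             // no more result sets
        case RES_TIMEOUT: return Action::RetryAfterTimeout;  // log, then ask again
        case RES_ROWS:    return Action::FetchRows;          // fetch and return a row
        default:          return Action::DiscardResult;      // free the result, go on
    }
}

// Small self-check that doubles as a usage example.
inline void self_check() {
    assert(next_action(-1) == Action::Fail);
    assert(next_action(RES_TIMEOUT) == Action::RetryAfterTimeout);
    assert(next_action(RES_ROWS) == Action::FetchRows);
}
```

Named constants in place of the literals 1 and 3 would make the switch in `getRecord()` easier to read, provided the library header exports matching names.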


