|
View:
New views
4 Messages
—
Rating Filter:
Alert me
|
|
|
[jira] Created: (DERBY-4437) Concurrent inserts into table with identity column perform poorlyConcurrent inserts into table with identity column perform poorly
----------------------------------------------------------------- Key: DERBY-4437 URL: https://issues.apache.org/jira/browse/DERBY-4437 Project: Derby Issue Type: Improvement Components: SQL Affects Versions: 10.5.3.0 Reporter: Knut Anders Hatlen I have a multi-threaded application which is very insert-intensive. I've noticed that it sometimes can come into a state where it slows down considerably and basically becomes single-threaded. This is especially harmful on modern multi-core machines since most of the available resources are left idle. The problematic tables contain identity columns, and here's my understanding of what happens: 1) Identity columns are generated from a counter that's stored in a row in SYS.SYSCOLUMNS. During normal operation, the counter is maintained in a nested transaction within the transaction that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMN separately from the main transaction, and the exclusive lock that it needs to obtain on the row holding the counter, can be releases after a relatively short time. Concurrent transactions can therefore insert into the same table at the same time, without needing to wait for the others to commit or abort. 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately, it will give up and retry the operation in the main transaction. This prevents self-deadlocks in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately, this also increases the time the row is locked, since the exclusive lock cannot be released until the main transaction commits. So as soon as there is one lock collision, the waiting transaction changes to a locking mode that increases the chances of others having to wait, which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main transaction. The end result is that only one of the insert threads can execute at any given time as long as the application is in this state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
|
|
[jira] Commented: (DERBY-4437) Concurrent inserts into table with identity column perform poorly[ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774270#action_12774270 ] Knut Anders Hatlen commented on DERBY-4437: ------------------------------------------- I haven't investigated this enough to say if (1) or (2) is the real problem. Since the nested transaction has to commit before it can release the lock, and a commit may need to wait for disk I/O operations, it may be that (2) is just a symptom, and the real problem is that all the insert threads compete for the same row lock. In my application, I could work around the problem by removing the identity column and instead maintain a counter in an AtomicInteger that's initialized by a SELECT MAX(id) query on start-up. This works because the application is one single process with multiple threads, so all threads have access to the AtomicInteger. If the clients run in different processes, such a workaround cannot be used, though. > Concurrent inserts into table with identity column perform poorly > ----------------------------------------------------------------- > > Key: DERBY-4437 > URL: https://issues.apache.org/jira/browse/DERBY-4437 > Project: Derby > Issue Type: Improvement > Components: SQL > Affects Versions: 10.5.3.0 > Reporter: Knut Anders Hatlen > > I have a multi-threaded application which is very insert-intensive. I've noticed that it sometimes can come into a state where it slows down considerably and basically becomes single-threaded. This is especially harmful on modern multi-core machines since most of the available resources are left idle. > The problematic tables contain identity columns, and here's my understanding of what happens: > 1) Identity columns are generated from a counter that's stored in a row in SYS.SYSCOLUMNS. During normal operation, the counter is maintained in a nested transaction within the transaction that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMN separately from the main transaction, and the exclusive lock that it needs to obtain on the row holding the counter, can be releases after a relatively short time. Concurrent transactions can therefore insert into the same table at the same time, without needing to wait for the others to commit or abort. > 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately, it will give up and retry the operation in the main transaction. This prevents self-deadlocks in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately, this also increases the time the row is locked, since the exclusive lock cannot be released until the main transaction commits. So as soon as there is one lock collision, the waiting transaction changes to a locking mode that increases the chances of others having to wait, which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main transaction. The end result is that only one of the insert threads can execute at any given time as long as the application is in this state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
|
|
[jira] Updated: (DERBY-4437) Concurrent inserts into table with identity column perform poorly[ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Matrigali updated DERBY-4437: ---------------------------------- I would not be surprised if good performance gains could be gotten in this area, as I don't believe any optimization has happened. The code definitely predates today's processors and multi-core machines. I agree with both your assessments. Some work that could be done in this area: 1) The system tries to limit the number of times that it goes single threaded by allocating a group of numbers every time it goes to update the system catalog. This number is probably too low for a multicore insert as fast as it can system. As a test you could try to just bump this number to make sure it helps your app. A better derby fix would be to make the fix somehow more zero-admin. Maybe by tracking how often the value is being updated and dynamically bump it up and down. Up seems easy, not exactly sure how to make it go down. The downside of a big number is that values are lost when the system shuts down. 2) The current lock strategy is based on what was available from the lock manager when it was implemented. There may be better options. What the system really wants to do is to do an unlimited wait unless it is waiting on itself. For a normal application that does not do system catalog queries the normal case is that a hit on this lock is not going to be a self deadlock. So a quick fix might be to add a retry, or add a longer wait on the lock. A best fix would be a new lock manager interfaces that allowed it to wait for as long as needed while insuring it was not waiting on parent transaction. > Concurrent inserts into table with identity column perform poorly > ----------------------------------------------------------------- > > Key: DERBY-4437 > URL: https://issues.apache.org/jira/browse/DERBY-4437 > Project: Derby > Issue Type: Improvement > Components: SQL > Affects Versions: 10.5.3.0 > Reporter: Knut Anders Hatlen > > I have a multi-threaded application which is very insert-intensive. I've noticed that it sometimes can come into a state where it slows down considerably and basically becomes single-threaded. This is especially harmful on modern multi-core machines since most of the available resources are left idle. > The problematic tables contain identity columns, and here's my understanding of what happens: > 1) Identity columns are generated from a counter that's stored in a row in SYS.SYSCOLUMNS. During normal operation, the counter is maintained in a nested transaction within the transaction that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMN separately from the main transaction, and the exclusive lock that it needs to obtain on the row holding the counter, can be releases after a relatively short time. Concurrent transactions can therefore insert into the same table at the same time, without needing to wait for the others to commit or abort. > 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately, it will give up and retry the operation in the main transaction. This prevents self-deadlocks in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately, this also increases the time the row is locked, since the exclusive lock cannot be released until the main transaction commits. So as soon as there is one lock collision, the waiting transaction changes to a locking mode that increases the chances of others having to wait, which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main transaction. The end result is that only one of the insert threads can execute at any given time as long as the application is in this state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
|
|
[jira] Commented: (DERBY-4437) Concurrent inserts into table with identity column perform poorly[ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774331#action_12774331 ] Rick Hillegas commented on DERBY-4437: -------------------------------------- Hopefully, the solution will be something that we can re-use for sequence generators (DERBY-712). As I read the SQL Standard, a sequence should "normally" not have any gaps but no guarantees are made and it is hard to understand how holes won't turn up since the sequence is not affected by rollbacks and the sequence is supposed to change monotonically in one direction or another. Pre-allocating a block of sequence numbers (Mike's solution #1) is attractive, particularly if we can release the unused ids when the database is brought down in an orderly fashion. I like the idea that the size of that block is self-tuning. > Concurrent inserts into table with identity column perform poorly > ----------------------------------------------------------------- > > Key: DERBY-4437 > URL: https://issues.apache.org/jira/browse/DERBY-4437 > Project: Derby > Issue Type: Improvement > Components: SQL > Affects Versions: 10.5.3.0 > Reporter: Knut Anders Hatlen > > I have a multi-threaded application which is very insert-intensive. I've noticed that it sometimes can come into a state where it slows down considerably and basically becomes single-threaded. This is especially harmful on modern multi-core machines since most of the available resources are left idle. > The problematic tables contain identity columns, and here's my understanding of what happens: > 1) Identity columns are generated from a counter that's stored in a row in SYS.SYSCOLUMNS. During normal operation, the counter is maintained in a nested transaction within the transaction that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMN separately from the main transaction, and the exclusive lock that it needs to obtain on the row holding the counter, can be releases after a relatively short time. Concurrent transactions can therefore insert into the same table at the same time, without needing to wait for the others to commit or abort. > 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately, it will give up and retry the operation in the main transaction. This prevents self-deadlocks in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately, this also increases the time the row is locked, since the exclusive lock cannot be released until the main transaction commits. So as soon as there is one lock collision, the waiting transaction changes to a locking mode that increases the chances of others having to wait, which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main transaction. The end result is that only one of the insert threads can execute at any given time as long as the application is in this state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
| Free embeddable forum powered by Nabble | Forum Help |