how do I avoid a connection pool filling up?

View: New views
5 Messages — Rating Filter:   Alert me  

how do I avoid a connection pool filling up?

by Dick Davies :: Rate this Message:

| View Threaded | Show Only this Message

I got this error on my cluster over the weekend:

[#|2009-01-10T14:33:06.521+0000|WARNING|sun-appserver9.1|javax.enterprise.resource.resourceadapter|_ThreadID=154;
_ThreadName=httpSSLWorkerThread-39181-21;oracle-prod-pool;In-use
connections equal max-pool-size and expired max-
wait-time. Cannot allocate more
connections.;_RequestID=5e1a41cd-1a74-4a5e-8d3b-2582599ac0b6;|RAR5117
: Failed to
 obtain/create connection from connection pool [ oracle-prod-pool ].
Reason : In-use connections equal max-pool-s
ize and expired max-wait-time. Cannot allocate more connections.|#]

which I read as 'your connection pool is full, and I've been waiting
too long, so here's an Exception'.


Both appservers started giving 500s and the site was down for an hour or two.
It looks like there was some kind of network blip that stalled a few
connections, when I restarted the glassfish instances
everything was back up and running.



So I'm wondering how I tell Glassfish to step in and kill of idle
connections after a certain amount of time.

in the admin console I see a couple of options, I'm wonder

The first is Resources> JDBC> Connection Pools> oracle-prod-pool -> advanced

set a leak timeout and then tick 'leak reclaim'.

The second is  Resources> JDBC> Connection Pools> oracle-prod-pool -> general

enable 'connection validation' and then 'on any failure' tick 'close
all connections'

Which do you folks think would be most likely to avoid this recurring
without impacting on the app (Roller 4 using openjpa)?

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...


Re: how do I avoid a connection pool filling up?

by Steve Essery :: Rate this Message:

| View Threaded | Show Only this Message

Dick Davies wrote:

> I got this error on my cluster over the weekend:
>
> [#|2009-01-10T14:33:06.521+0000|WARNING|sun-appserver9.1|javax.enterprise.resource.resourceadapter|_ThreadID=154;
> _ThreadName=httpSSLWorkerThread-39181-21;oracle-prod-pool;In-use
> connections equal max-pool-size and expired max-
> wait-time. Cannot allocate more
> connections.;_RequestID=5e1a41cd-1a74-4a5e-8d3b-2582599ac0b6;|RAR5117
> : Failed to
>  obtain/create connection from connection pool [ oracle-prod-pool ].
> Reason : In-use connections equal max-pool-s
> ize and expired max-wait-time. Cannot allocate more connections.|#]
>
> which I read as 'your connection pool is full, and I've been waiting
> too long, so here's an Exception'.
>  
Yes, that's the meaning.  The question is why have you hit the maximum
number of connections, and you head in the right direction below... You
could also look at turning on JDBC monitoring and using the asadmin
monitor command to capture statistics for the pool and see what the
connection usage is like over time, as well as seeing how many
connection requests are being forced to wait as the pool is fully in use.

>
> Both appservers started giving 500s and the site was down for an hour or two.
> It looks like there was some kind of network blip that stalled a few
> connections, when I restarted the glassfish instances
> everything was back up and running.
>
>
>
> So I'm wondering how I tell Glassfish to step in and kill of idle
> connections after a certain amount of time.
>
> in the admin console I see a couple of options, I'm wonder
>
> The first is Resources> JDBC> Connection Pools> oracle-prod-pool -> advanced
>
> set a leak timeout and then tick 'leak reclaim'.
>  
You might want to start off just turning on the leak detection and
setting a timeout and seeing if there are connections leaking.  If they
are you'll see a call stack in the server.log indicating the state of
the thread at the point the application code fetched a connection from
the pool.  Working down the stack you'll see where in the application
the connection was obtained.  The question then would be why wasn't that
connection returned to the pool before the leak timeout expired?  Is
there a Connection.close() missing for example.  Unfortunately I've
quite often see issues in application code where its assume there will
be no problems performing JDBC operations and because of exceptions JDBC
resources such as Connections, Statements and ResultSets aren't closed
correctly - even in a finally block.

If there is a leak turning on reclaim simply destroys the connection and
replaces it in the pool with a fresh connection.  If the application is
caching a connection for some reason and tries to use it once its been
reclaimed in this way it will obviously get an exception.
> The second is  Resources> JDBC> Connection Pools> oracle-prod-pool -> general
>
> enable 'connection validation' and then 'on any failure' tick 'close
> all connections'
>  
Another possibility is that validation isn't enabled, or isn't correctly
configured for the database in question.  Where this matters typically
is in the situation where the database is shutdown while the application
server has a pool of connections open to it.  If the pool isn't
validating, or is using the wrong mechanism, then those connections,
which are no longer connected (and hence useless) will sit in the pool
until they finally exceed the idle timeout.  Unfortunately the pool will
still pass these connections to application code (it has no way of
knowing they are unusable without proper validation) which resets the
idle time on them.... on a busy pool this could mean they persist for a
significant period of time.  I'd expect more signs of this in the
server.log - the hope being that application code that uses such a
connection will report the SQLException they should get from attempting
any JDBC operation on it.

With Oracle the only workable validation mechanism is table-based.  
Since DUAL is a table that's been present in Oracle as long as I can
recall (I go back to v6) that's the table I'd suggest.  Table based
validation works by parsing and executing a select statement on the
table, but it does not fetch back any data from the table.  The parse
makes the Oracle JDBC driver contact the backend database, which
essentially validates that the connection is still good.  This is
effectively the same mechanism as used in the Oracle propriatary
OracleConnection.pingDatabase() method.

The problems with the other two methods are as follows.  The auto-commit
mechanism essentially looks at the current setting of the auto commit
flag in the driver and toggles to the opposite and then back to what it
was originally.  Oracle's JDBC driver, and probably others, cache this
status and avoid a network hop to the server until they absolutely have
to so they reduce the amount of network traffic they generate.

The issue is similar with the DB MetaData method.  Here a DBMetaData
object is created, however the Oracle JDBC driver again defers any
network trips until a call is made to a method that has to contact the
database to return the required data.  With the enterprise version of
the Sun Glassfish Application Server the JDBC driver for the HADB
database does make a connection to its database so DBMetaData is used
for validation with pools to HADB.

Finally a firewall between the Application Server machine and the
database, which is configured to drop idle connections, can also cause
complications.  Most firewalls I've encountered do not drop connections
cleanly, i.e. they don't send a RST package to the client to know that
the connection has been closed.  This means when such a terminated
connection is used in the Application Server, any attempt by the
connection to talk to the backend database will hang, normal TCP/IP
reliable delivery kicks in and multiple attempts will be made to send
TCP/IP packets to the database - only when all those attempts expire
(typically around 8 minutes) will the problem result in a SQLException
being returned.  This essentially could cause a bottleneck for a period
of time where there could be a number of dead/hung connections
effectively reducing the number of available connections in the pool
that can be used. If the pool is only just big enough to support the
load placed on it, this could be enough to lead to requests waiting, and
possibily exceeding the wait timeout.  Making sure the idle timeout in
the pool is less than the firewall timeout should be enough nowadays to
avoid this situation.
> Which do you folks think would be most likely to avoid this recurring
> without impacting on the app (Roller 4 using openjpa)?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@...
> For additional commands, e-mail: users-help@...
>
>  


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...


Re: how do I avoid a connection pool filling up?

by Dick Davies :: Rate this Message:

| View Threaded | Show Only this Message

On Tue, Jan 13, 2009 at 11:00 AM, Steve Essery <Steve.Essery@...> wrote:

> Dick Davies wrote:
>>
>> I got this error on my cluster over the weekend:
>>
>>
>> [#|2009-01-10T14:33:06.521+0000|WARNING|sun-appserver9.1|javax.enterprise.resource.resourceadapter|_ThreadID=154;
>> _ThreadName=httpSSLWorkerThread-39181-21;oracle-prod-pool;In-use
>> connections equal max-pool-size and expired max-
>> wait-time. Cannot allocate more
>> connections.;_RequestID=5e1a41cd-1a74-4a5e-8d3b-2582599ac0b6;|RAR5117
>> : Failed to
>>  obtain/create connection from connection pool [ oracle-prod-pool ].
>> Reason : In-use connections equal max-pool-s
>> ize and expired max-wait-time. Cannot allocate more connections.|#]
>>
>> which I read as 'your connection pool is full, and I've been waiting
>> too long, so here's an Exception'.
>>
>
> Yes, that's the meaning.  The question is why have you hit the maximum
> number of connections, and you head in the right direction below... You
> could also look at turning on JDBC monitoring and using the asadmin monitor
> command to capture statistics for the pool and see what the connection usage
> is like over time, as well as seeing how many connection requests are being
> forced to wait as the pool is fully in use.

Thanks, I've used that in the past but with asadmin get/list to check
from time to time,
not with asadmin monitor - does that give some sort of historical graphing
(a link to a howto would be really useful if you have one)?


>> Both appservers started giving 500s and the site was down for an hour or
>> two.
>> It looks like there was some kind of network blip that stalled a few
>> connections, when I restarted the glassfish instances
>> everything was back up and running.
>>
>>
>>
>> So I'm wondering how I tell Glassfish to step in and kill of idle
>> connections after a certain amount of time.
>>
>> in the admin console I see a couple of options, I'm wonder
>>
>> The first is Resources> JDBC> Connection Pools> oracle-prod-pool ->
>> advanced
>>
>> set a leak timeout and then tick 'leak reclaim'.
>>
>
> You might want to start off just turning on the leak detection and setting a
> timeout and seeing if there are connections leaking.  If they are you'll see
> a call stack in the server.log indicating the state of the thread at the
> point the application code fetched a connection from the pool.  Working down
> the stack you'll see where in the application the connection was obtained.
>  The question then would be why wasn't that connection returned to the pool
> before the leak timeout expired?  Is there a Connection.close() missing for
> example.  Unfortunately I've quite often see issues in application code
> where its assume there will be no problems performing JDBC operations and
> because of exceptions JDBC resources such as Connections, Statements and
> ResultSets aren't closed correctly - even in a finally block.

Well I've had leak timeout set for a while (though not leak reclaim)
and don't recall seeing
any exceptions, so  hopefully I'm ok there.

The keepalive timeout on the load balancer is less than the connection
pool leak timeout,
which makes me think I should be safe enabling leak reclaim
(users will have lost their HTTP connection before an exception bubbles up),
but I'm inclined to only tweak one thing at a time, and I'm pretty
sure you nailed it on the
validation mechanism issue (see below).

>> The second is  Resources> JDBC> Connection Pools> oracle-prod-pool ->
>> general
>>
>> enable 'connection validation' and then 'on any failure' tick 'close
>> all connections'
>>

> With Oracle the only workable validation mechanism is table-based.  Since
> DUAL is a table that's been present in Oracle as long as I can recall (I go
> back to v6) that's the table I'd suggest.  Table based validation works by
> parsing and executing a select statement on the table, but it does not fetch
> back any data from the table.  The parse makes the Oracle JDBC driver
> contact the backend database, which essentially validates that the
> connection is still good.  This is effectively the same mechanism as used in
> the Oracle propriatary OracleConnection.pingDatabase() method.

BINGO! That's very likely it, thanks a lot. I've switched table (DUAL)
validation and will
see how that goes.

Thanks a lot!

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...


Re: how do I avoid a connection pool filling up?

by Steve Essery :: Rate this Message:

| View Threaded | Show Only this Message

Dick Davies wrote:
On Tue, Jan 13, 2009 at 11:00 AM, Steve Essery Steve.Essery@... wrote:
  
Dick Davies wrote:
    
I got this error on my cluster over the weekend:


[#|2009-01-10T14:33:06.521+0000|WARNING|sun-appserver9.1|javax.enterprise.resource.resourceadapter|_ThreadID=154;
_ThreadName=httpSSLWorkerThread-39181-21;oracle-prod-pool;In-use
connections equal max-pool-size and expired max-
wait-time. Cannot allocate more
connections.;_RequestID=5e1a41cd-1a74-4a5e-8d3b-2582599ac0b6;|RAR5117
: Failed to
 obtain/create connection from connection pool [ oracle-prod-pool ].
Reason : In-use connections equal max-pool-s
ize and expired max-wait-time. Cannot allocate more connections.|#]

which I read as 'your connection pool is full, and I've been waiting
too long, so here's an Exception'.

      
Yes, that's the meaning.  The question is why have you hit the maximum
number of connections, and you head in the right direction below... You
could also look at turning on JDBC monitoring and using the asadmin monitor
command to capture statistics for the pool and see what the connection usage
is like over time, as well as seeing how many connection requests are being
forced to wait as the pool is fully in use.
    

Thanks, I've used that in the past but with asadmin get/list to check
from time to time,
not with asadmin monitor - does that give some sort of historical graphing
(a link to a howto would be really useful if you have one)?
  
Its unfortunately not historical - its more like iostat than sar - reporting on current monitoring information rather than past aggregrations.

Most of it should be in the asadmin monitor --help command.

1) Enable monitoring for JDBC in the configuration your instances are part of.
2) Once the pools are in use you can use:

asadmin monitor --user <admin user> --port <admin-port>  --type jdbcpool --filter <pool name> <instance-name>

A pool that has never been used won't have been created by the application server and hence will not be monitorable.

If you are running it in a terminal window, hitting "h" will give you a key explaining the column headings.  If you add --filename <somefile> then the terminal output is also written to the specified file in CSV format.

If you have more than one pool to look at you can quickly find the values for the the --filter argument by leaving that out of the command, the asadmin monitor will then list the possible choices, e.g.

asadmin monitor --user <admin user> --port <admin-port>  --type jdbcpool <instance-name>

Some of the "average" statistics unfortunately can be less than helpful over a long period of time, e.g. if you have a long running system and there is a sudden short peak in requests waiting for connections, that spike can be completely lost in the average wait time per request statistic - only the current number of waiting connections would show the problem.

  
Both appservers started giving 500s and the site was down for an hour or
two.
It looks like there was some kind of network blip that stalled a few
connections, when I restarted the glassfish instances
everything was back up and running.



So I'm wondering how I tell Glassfish to step in and kill of idle
connections after a certain amount of time.

in the admin console I see a couple of options, I'm wonder

The first is Resources> JDBC> Connection Pools> oracle-prod-pool ->
advanced

set a leak timeout and then tick 'leak reclaim'.

      
You might want to start off just turning on the leak detection and setting a
timeout and seeing if there are connections leaking.  If they are you'll see
a call stack in the server.log indicating the state of the thread at the
point the application code fetched a connection from the pool.  Working down
the stack you'll see where in the application the connection was obtained.
 The question then would be why wasn't that connection returned to the pool
before the leak timeout expired?  Is there a Connection.close() missing for
example.  Unfortunately I've quite often see issues in application code
where its assume there will be no problems performing JDBC operations and
because of exceptions JDBC resources such as Connections, Statements and
ResultSets aren't closed correctly - even in a finally block.
    

Well I've had leak timeout set for a while (though not leak reclaim)
and don't recall seeing
any exceptions, so  hopefully I'm ok there.

The keepalive timeout on the load balancer is less than the connection
pool leak timeout,
which makes me think I should be safe enabling leak reclaim
(users will have lost their HTTP connection before an exception bubbles up),
but I'm inclined to only tweak one thing at a time, and I'm pretty
sure you nailed it on the
validation mechanism issue (see below).

  
The second is  Resources> JDBC> Connection Pools> oracle-prod-pool ->
general

enable 'connection validation' and then 'on any failure' tick 'close
all connections'

      

  
With Oracle the only workable validation mechanism is table-based.  Since
DUAL is a table that's been present in Oracle as long as I can recall (I go
back to v6) that's the table I'd suggest.  Table based validation works by
parsing and executing a select statement on the table, but it does not fetch
back any data from the table.  The parse makes the Oracle JDBC driver
contact the backend database, which essentially validates that the
connection is still good.  This is effectively the same mechanism as used in
the Oracle propriatary OracleConnection.pingDatabase() method.
    

BINGO! That's very likely it, thanks a lot. I've switched table (DUAL)
validation and will
see how that goes.
  
It is probably the number one thing I tell customers to configure.  Once validation is turned on and we know stale/dead connections in the pool are not part of the problem we can move on to figure out what is.  From the one pager for JDBC posted recently the intention would seem to be to make table-based validation the default for all pools in V3 - there was an enhancement done for a particular patch release of Sun's AS 8.1 which did this purely for Oracle pools but that change wasn't forwarded ported to AS 9.1/Glassfish.
Thanks a lot!

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...

  


Re: how do I avoid a connection pool filling up?

by Dick Davies :: Rate this Message:

| View Threaded | Show Only this Message

On Wed, Jan 14, 2009 at 4:08 PM, Steve Essery <Steve.Essery@...> wrote:

> Most of it should be in the asadmin monitor --help command.

Thanks. If it's not historical, that's OK, I can write something to
hook into net-snmp
on the box maybe, then poll it from our Cacti system and graph it there.

I'll have a poke around when I get to work; been meaning to play with
it out anyway, this is a good excuse.

>From the one pager for JDBC posted recently the intention would seem to be to make
> table-based validation the default for all pools in V3

I was just about to read that, thanks :)

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...