performance-related question

We have run up against a performance problem with a reasonably large (but not huge, I would say) data collection. Even the most basic queries are running unacceptably slowly on this collection, so I am wondering whether something is very obviously broken in our configuration.

The machine is not the beefiest: it has a single 2.4 GHz processor and only 1 GB of RAM, but I am trying to find out what performance we can wring from it before moving up to a better one. The JVM is allocated 800 MB, and I have this in conf.xml:

    <db-connection cacheSize="400M" collectionCache="96M" database="native" files="webapp/WEB-INF/data" pageSize="4096">

The collection of interest has 27700 documents of varying size. A large number (say 1/4 to 1/2) are binary (images). None is larger than an article or book chapter. Many are smaller (say a page or two of XML).

This query:

    for $doc in collection ('/bopp.bfldev')
    return $doc

takes 12 seconds to evaluate; the number of results returned is limited to 1 by the client. We need to get below 1 second. This variant:

    for $doc in subsequence(collection ('/bopp.bfldev'),1,1)
    return $doc

takes the same time.

The log shows only:

    2009-11-05 12:06:15,842 [P1-49] DEBUG (XQuery.java [compile]:155) - Query diagnostics:
    for <5> $doc in collection("/bopp.bfldev")
    return <6> $doc
    2009-11-05 12:06:15,843 [P1-49] DEBUG (XQuery.java [compile]:161) - Compilation took 6 ms
    2009-11-05 12:06:27,348 [P1-49] DEBUG (XQuery.java [execute]:231) - Execution took 11,498 ms

The returned document is quite small, so I don't think there's a serialization problem.

My one concern is that there may be too many collections: currently about 6400, if the following is a correct measure (it includes collections outside the one we are running the query on, but that is by far the largest of them):

    charlestown:/proj/exist/eXist> find webapp/WEB-INF/data/fs -type d | wc
    6399 6399 402505

Is that likely to cause problems? I could restructure our paths to avoid that.

Any ideas, folks? I have a checkpoint release tomorrow and would really love to speed this up a bit!

Thanks
-Mike

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
_______________________________________________
Exist-open mailing list
Exist-open@...
https://lists.sourceforge.net/lists/listinfo/exist-open

Re: performance-related question

> for $doc in collection ('/bopp.bfldev')
> return $doc

Isn't this just retrieving all 27700 documents?

--
Adam Retter
eXist Developer { United Kingdom }
adam@...
irc://irc.freenode.net/existdb

Re: performance-related question

> for $doc in collection ('/bopp.bfldev')
> return $doc
>
> takes 12 seconds to evaluate; the number of results returned is limited
> to 1 by the client.

A query like this should return instantly.

> My one concern is that possibly there are too many collections:
> currently about 6400

Ok, that's the only explanation I have. Does the second query execute faster? What happens if you increase the collectionCache setting in conf.xml?

If you can't figure it out, I can offer to have a look at your data (unless it's confidential) if you send it to me within the next few hours.

Wolfgang

Re: performance-related question

So I increased the collectionCache setting from 96 to 256 MB and the query time dropped from 12 to 3 seconds. I tried fiddling with it, making it a bit bigger within the various limits, but that seems to be about the best I can manage.

Thanks for your offer of help, Wolfgang - I'll get in touch off-list.

-Mike

Re: performance-related question

Update for the list:

I believe the problem has been solved by setting collectionCacheSize (which accepts a fixed maximum number of collections to cache), rather than collectionCache (which is supposed to control the cache by a byte size limit, but apparently isn't working right at the moment).

Thanks, Wolf

Re: performance-related question

> I believe the problem has been solved by setting collectionCacheSize (which
> accepts a fixed maximum number of collections to cache), rather than
> collectionCache (which is supposed to control the cache by a byte size
> limit, but apparently isn't working right at the moment)

If someone else experiences issues with queries spanning a few thousand collections or more, here's what we found: the default setting for the collection cache in conf.xml is:

    <db-connection collectionCache="48M"/>

The cache is supposed to grow on demand up to 48M. Unfortunately, this doesn't seem to work in 1.4 (and maybe 1.2.x as well). In my tests, the cache size remained fixed at 128 collections and didn't grow. This causes a lot of I/O if there are several thousand collections in the db and results in a significant performance loss at query time.

Fortunately, there's an alternative setting which allows us to force the collection cache to a fixed size (specified in terms of collections cached):

    <db-connection collectionCacheSize="10000"/>

So if your DB has 10000 collections, they would all fit into memory. Well, this isn't a perfect solution (as you have no control over the memory consumed), but it's ok as a workaround.

I'll try to find a fix for the dynamic cache over the weekend.

Wolfgang
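
For readers wondering why a cache stuck at 128 entries hurts so much once a query touches several thousand collections, here is a toy simulation. It is only a sketch: it assumes a plain LRU eviction policy and a query that walks the collections in the same order each time, neither of which is confirmed for eXist's real collection cache. The class and function names (LRUCache, count_misses) are made up for illustration; the numbers 128, 10000 and 6400 come from the thread.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache keyed by collection id. NOT eXist's actual implementation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            return
        # Cache miss: in this model a miss stands in for a disk read.
        self.misses += 1
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)


def count_misses(capacity, num_collections, passes):
    """Scan all collections in order `passes` times and count cache misses."""
    cache = LRUCache(capacity)
    for _ in range(passes):
        for collection_id in range(num_collections):
            cache.get(collection_id)
    return cache.misses


if __name__ == "__main__":
    # 6400 collections as in the original report, scanned by 3 queries in a row.
    for capacity in (128, 10000):
        misses = count_misses(capacity, num_collections=6400, passes=3)
        print(f"capacity={capacity}: {misses} misses")
```

Under these assumptions the 128-entry cache misses on every single access (19200 misses for three scans), because each entry is evicted long before the scan comes back to it, while a cache large enough for all 6400 collections misses only on the first scan (6400 misses). That is consistent with the behaviour described above, where raising the fixed size removed most of the I/O.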

Re: performance-related question

> I'll try to find a fix for the dynamic cache over the weekend.

I fixed the collection cache. It now actually grows to the specified limit, so the default setting:

    <db-connection collectionCache="48M"/>

will indeed be sufficient to hold a few thousand collections. You can check the current size via JMX. A few people have already reported a significant performance increase :-)

Wolfgang