Gpars making my code slower?

Hi all,

catchy title, I know. I also know the problem is most likely on my side, but I need help to understand why. Here is a piece of code that executes in the Grails Bootstrap, so whenever the app is started up. I need to load a couple of classpath resources (referenced as Spring beans) and load the data into maps, which are then stored in the servlet context.

The code previously took 983 ms to load all the data. I then began to use Parallelizer.eachAsync, hoping to speed things up, but instead it takes anywhere from double the time to even more:

    def sites = ['a', 'b', 'c'] // about 33 values here

    def start = System.currentTimeMillis()
    Parallelizer.withParallelizer(4) {
        sites.eachAsync { site ->
            def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
            def lines = dtid_imp_map.file.readLines()
            def themap = [:]
            lines.each { line ->
                def (dtid, pv) = line.split(',')
                themap[dtid] = pv.toLong()
            }
            servletContext.setAttribute("dtid_pv_map_${site}", themap)
        }
    }
    println("DTID_PV_MAP loading took ${System.currentTimeMillis() - start}")

I know that I am potentially accessing the servletContext from multiple threads at the same time, but as the attribute name is never the same, I hope there is no concurrency issue here... otherwise I guess I'd have to synchronize {} it. The Parallelizer above was run with no special settings and took about twice as long (no setting = number of threads = 2 on my MacBook Pro, 2 cores). With 4 threads it takes about 2800 ms, which is about 2 secs longer than running this without any Parallelizer code.

So still, leaving my concurrency troubles behind, why is the Parallelizer code slower? Because the ramp-up time to get the threads going takes too long? If that is the case, what is the typical time needed to get the threads going? In my case that time is about 1000 ms, which sounds far off by my understanding... I think creating threads is cheaper :-)

Cheers
Sven

--
Sven Haiges
sven.haiges@...

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email
|
|
Re: Gpars making my code slower?

Hi Sven,

Please ask this question on the GPars list. It would be good to keep the knowledge in the right place :)

Alex

On Thu, Oct 22, 2009 at 1:14 AM, Sven Haiges <sven.haiges@...> wrote:
> [...]
|
|
Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

Hi Guillaume,

is the picture on the 10th slide of this talk freely available somewhere?

It's hilarious :-)

Thanks
Andreas

--
Andreas Jöcker
GiS - Gesellschaft für integrierte Systemplanung mbH
Junkersstr. 2
69469 Weinheim

E-mail a.joecker@...
Phone +49 6201 503-59
Fax +49 6201 503-66

Gesellschaft für integrierte Systemplanung mbH
Managing directors: Eckhard Haffmann, Alfred Gai, Bernd Heselmann
Registered office: Zeppelinstr. 11 - 91052 Erlangen
District Court Fürth/Bavaria - Commercial Register No. HRB 3435
|
|
Netiquette -- Re: [groovy-user] Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

I believe that Andreas replied to Alex Tkachman's "Re: [groovy-user] Gpars making my code slower?" email and changed the subject to create a new email that had nothing to do with the thread replied to, i.e. a new thread was created by inappropriately usurping an old thread. Instead, a new email message should have been created so as to create an actual new thread.

Usurping threads like this violates the netiquette of email threads: doing this destroys all the thread structure that those of us using email clients with thread structuring use to great advantage -- as long as people follow the (very simple) rules.

Sorry to pick Andreas out individually on this; others do this as well. It is very, very annoying whenever this happens, and this morning I thought I needed to say something about it.

On Thu, 2009-10-22 at 08:41 +0200, Andreas Jöcker wrote:
> [...]

--
Russel.
=============================================================================
Dr Russel Winder, Partner
Concertant LLP
41 Buckmaster Road, London SW11 1EN, UK
xmpp: russel@...
t: +44 20 7585 2200, +44 20 7193 9203
f: +44 8700 516 084
m: +44 7770 465 077
voip: sip:russel.winder@...
skype: russel_winder
|
|
Re: Netiquette -- Re: [groovy-user] Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

Whoa, sorry for your netiquette disturbance.

I didn't know that there is something like "thread structure in emails...", so I couldn't follow "this simple rule", and sorry if I disturbed any structure again with this email.

Cheers
the violator

PS: next time I'll address Guillaume individually for something like that... sorry, list

On 22.10.2009 at 09:45, Russel Winder wrote:
> [...]
|
|
Re: Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

Yeah, I think so.

I think the link may be on the last slides, where I've put all the links to the pictures I've included in the presentation.
It's a pretty well-known comic :-)

On Thu, Oct 22, 2009 at 08:41, Andreas Jöcker <a.joecker@...> wrote:
> [...]

--
Guillaume Laforge
Groovy Project Manager
Head of Groovy Development at SpringSource
http://www.springsource.com/g2one
|
|
Re: Gpars making my code slower?

Hi Sven,

I'm so glad you jumped into GPars, welcome on board.

I measured the overhead on my somewhat oldish dual-core to give you some rough estimates. Bootstrapping the thread pool takes about 30 ms, installing a category (to enable the xxxAsync() methods) takes roughly 190 ms, and calling eachAsync() itself takes about 60 ms longer than calling ordinary each(). Summed up, the direct overhead imposed by using the pool in your particular case could be around 280 ms. This makes me believe you might be able to get to about 700 ms for the parallel variant of your code on a dual core. Not a big win compared to the 1000 ms sequential version, in my opinion, but still much better than the numbers you're currently getting.

Since eachAsync behaves as expected and provides adequate speed-up for purely CPU-intensive, mutually independent calculations without any shared state, the trouble probably lies in the code you're calling in parallel. And since you actually see performance degrade with increasing thread count, I'd suspect that one of the methods (maybe ctx.getBean()) misbehaves when called simultaneously from multiple threads. I could, for example, think of a poorly done lazy initialization, which will be repeated for each calling thread if the threads ask for a resource at roughly the same time.

It is quite difficult to go any further without actually touching the code and measuring each line individually. If you feel brave enough, you may try to experiment with synchronizing access to ctx or servletContext, or with pre-initializing ctx before you call eachAsync.

I hope this helps you move forward.

Regards,

Vaclav

Sven Haiges wrote:
> [...]
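The advice in the reply above (measure the phases separately, and do the lookups before going parallel) can be sketched roughly as follows. This is only an illustration in plain Java with java.util.concurrent, not GPars or Grails code: the class `PrefetchThenParse`, the method names, and the sample lines are hypothetical, and the in-memory lists merely stand in for what `ctx.getBean(...).file.readLines()` would return. The attribute names follow the `dtid_pv_map_${site}` pattern from the original post.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PrefetchThenParse {

    // Parses "dtid,pv" lines into a map, like the closure body in the original post.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> parsed = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            parsed.put(parts[0], Long.parseLong(parts[1].trim()));
        }
        return parsed;
    }

    // Phase 2: parse already-fetched lines on a fixed pool. Results are collected
    // on the calling thread, so no shared state is written from the workers.
    static Map<String, Map<String, Long>> parseAll(Map<String, List<String>> linesBySite, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Map<String, Long>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> entry : linesBySite.entrySet()) {
                List<String> lines = entry.getValue();
                Callable<Map<String, Long>> task = () -> parse(lines);
                pending.put(entry.getKey(), pool.submit(task));
            }
            Map<String, Map<String, Long>> attributes = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Map<String, Long>>> entry : pending.entrySet()) {
                attributes.put("dtid_pv_map_" + entry.getKey(), entry.getValue().get());
            }
            return attributes;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parsing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Phase 1 (sequential): hypothetical stand-in for the bean lookup and file read.
        long t0 = System.nanoTime();
        Map<String, List<String>> linesBySite = new LinkedHashMap<>();
        linesBySite.put("a", Arrays.asList("d1,10", "d2,20"));
        linesBySite.put("b", Arrays.asList("d3,30"));
        long t1 = System.nanoTime();

        // Phase 2 (parallel): timed separately so each phase's cost is visible.
        Map<String, Map<String, Long>> attributes = parseAll(linesBySite, 2);
        long t2 = System.nanoTime();

        System.out.println(attributes);
        System.out.println("lookup took " + (t1 - t0) / 1_000_000 + " ms, parsing took "
                + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Timing the two phases separately is the point of the sketch: if the lookup phase dominates, parallelizing the parsing alone is unlikely to help much, and the pool overhead quoted above may outweigh any gain.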
|
|
Re: Gpars making my code slower?

Hi Vaclav,

thanx for the info. It seems I cleared my inbox too radically, which is why I am replying so late. I'll be moving future questions to the GPars list but will keep this thread here for now.

The bean access that I wanted to do in parallel, followed by the processing, is for a classpath resource:

    def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")

'site' makes sure I do not access the same bean twice, so there could only be some general Spring init code that is touched by multiple threads and slows things down... interesting.

In case it is of interest, the beans themselves look like this:

    dtid_pv_map_ca(org.springframework.core.io.ClassPathResource, 'dtid_pv_ca') { }

These are files of a couple of kilobytes in size that I need to preprocess and load into memory in some way during startup.

I'll experiment a bit, e.g. by accessing the beans first and then eachAsync'ing over them instead of over the site values.

Thanx
Sven

On Thu, Oct 22, 2009 at 8:12 AM, Vaclav Pech <vaclav.pech@...> wrote:
> [...]
|
|
Re: Gpars making my code slower?

Hi Vaclav, all,

why is there no eachAsync on a Map? I just ran into this gotcha again: I parsed the site->file mapping into a map and then tried to .eachAsync over it, but it seems only Collections are supported. Is there a specific reason this is not implemented for Maps? Or does it exist and I just did not see it?

Cheers
Sven

On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@...> wrote:
> [...]
|
|
Re: Gpars making my code slower?

Hi Sven,

the eachAsync() methods are supposed to work on all objects just like each() does, including strings or maps. The code below works just fine:

    Parallelizer.withParallelizer {
        [a:1].eachParallel { println it.value } // notice the method name change in recent gpars :)
    }

Are you sure you invoke eachAsync() within the 'withParallelizer' block? The withParallelizer block uses the Groovy category mechanism, so it only enhances the calling thread.

Maybe you're trying to nest eachAsync()?

    withParallelizer {
        images.eachAsync {
            it.eachAsync() // BANG! No eachAsync() here, it is a different thread, you have to enhance it too, perhaps using withExistingParallelizer(pool)
        }
    }

I wonder whether this is the case?

Cheers,

Vaclav

Sven Haiges wrote:
> [...]
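The point in the reply above, that an enhancement installed for the calling thread is not visible on pool threads, can be illustrated by analogy with a plain Java ThreadLocal. This is not how GPars or Groovy categories are implemented; it is only a small sketch, with the hypothetical class `ThreadBoundFlag`, of why something bound to one thread is missing inside a task that runs on another.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadBoundFlag {

    // Stands in for "the enhancement is active on this thread".
    private static final ThreadLocal<Boolean> ENHANCED =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    // Switches the flag on for the calling thread, then asks a pool worker
    // what it sees. The worker only sees the flag if it sets it itself.
    static boolean workerSeesFlag(boolean enhanceWorkerToo) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ENHANCED.set(Boolean.TRUE); // "enter the block" on the calling thread
        try {
            Callable<Boolean> task = () -> {
                if (enhanceWorkerToo) {
                    ENHANCED.set(Boolean.TRUE); // the worker opts in explicitly
                }
                return ENHANCED.get();
            };
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            ENHANCED.remove(); // "leave the block"
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker without opt-in sees flag: " + workerSeesFlag(false));
        System.out.println("worker with opt-in sees flag:    " + workerSeesFlag(true));
    }
}
```

In the analogy, the explicit opt-in inside the task plays the role of the suggested withExistingParallelizer(pool) call in the nested closure.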
|
|
Re: Gpars making my code slower?

Hi Vaclav,

thanx for the info. I'll try that out and let you know. I was not using
eachParallel, so maybe that's the reason.

Cheers
Sven

On Thu, Oct 22, 2009 at 9:43 PM, Vaclav Pech <vaclav.pech@...> wrote:
> Hi Sven,
>
> the eachAsync() methods are supposed to work on all objects just like each()
> does, including strings or maps. The code below works just fine:
>
> Parallelizer.withParallelizer {
>     [a:1].eachParallel { println it.value }  // notice the method name change in recent gpars :)
> }
>
> Are you sure you invoke eachAsync() within the 'withParallelizer' block?
> The withParallelizer block uses the Groovy category mechanism, so it only
> enhances the calling thread.
> Maybe you're trying to nest eachAsync()?
>
> withParallelizer {
>     images.eachAsync {
>         it.eachAsync()  // BANG! No eachAsync() here, it is a different thread;
>                         // you have to enhance it too, perhaps using
>                         // withExistingParallelizer(pool)
>     }
> }
>
> I wonder whether this is the case?
>
> Cheers,
>
> Vaclav
>
> Sven Haiges wrote:
>> Hi Vaclav, all,
>>
>> why is there no eachAsync on a Map? I am just running into this gotcha
>> again: I parsed the site->file mapping into a map and then tried to
>> .eachAsync over it, but it seems only Collections are supported. Is
>> there a specific reason this is not implemented on Maps? Or does
>> it exist and I just did not see it?
>>
>> Cheers
>> Sven
>>
>> On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@...> wrote:
>>> Hi Vaclav,
>>>
>>> thanx for the info. It seems I cleared my inbox too radically, that's
>>> why I am replying so late. I'll be moving future questions to the
>>> gpars list but am keeping this thread here for now.
>>>
>>> The bean access that I wanted to do in parallel, followed by the
>>> processing, is for a classpath resource:
>>>
>>> def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>>
>>> 'site' makes sure I do not access the same bean twice, so there
>>> could only be some general Spring init code that is touched by
>>> multiple threads and slows it down... interesting.
>>>
>>> In case this is interesting, the beans themselves look like this:
>>>
>>> dtid_pv_map_ca(org.springframework.core.io.ClassPathResource,
>>> 'dtid_pv_ca') { }
>>>
>>> These are files of a couple of kilobytes in size that I need to
>>> preprocess and load in some way into memory during startup.
>>>
>>> I'll experiment a bit, like trying to access the beans first and then
>>> eachAsync'ing over them instead of the site values.
>>>
>>> Thanx
>>> Sven
>>>
>>> On Thu, Oct 22, 2009 at 8:12 AM, Vaclav Pech <vaclav.pech@...> wrote:
>>>> Hi Sven,
>>>>
>>>> I'm so glad you jumped into GPars, welcome on-board.
>>>> I measured the overhead on my somewhat oldish dual-core to give you some
>>>> rough estimates. Bootstrapping the thread pool takes about 30 ms,
>>>> installing a category (to enable the xxxAsync() methods) takes roughly
>>>> 190 ms, and calling eachAsync() itself takes about 60 ms longer than
>>>> calling ordinary each(). All summed up, the direct overhead imposed by
>>>> using the pool in your particular case could be around 280 ms. This makes
>>>> me believe you might be able to get to about 700 ms for the parallel
>>>> variant of your code on a dual core. Not a big win compared to the
>>>> 1000 ms sequential version, in my opinion, but still much better than the
>>>> numbers you're currently getting.
>>>>
>>>> Since eachAsync behaves as expected and provides adequate speed-up for
>>>> purely CPU-intensive, mutually independent calculations without any
>>>> shared state, the trouble probably lies in the code you're calling in
>>>> parallel. And since you actually see performance degrade with increasing
>>>> thread count, I'd suspect one of the methods (maybe ctx.getBean())
>>>> misbehaves when called simultaneously from multiple threads. I could, for
>>>> example, think of a poorly done lazy initialization, which will be
>>>> repeated for each calling thread if the threads ask for a resource at
>>>> roughly the same time.
>>>> It is quite difficult to go any further without actually touching the
>>>> code and measuring each line individually. If you feel brave enough, you
>>>> may try to experiment with synchronizing access to ctx or servletContext,
>>>> or preinitializing ctx before you call eachAsync.
>>>>
>>>> I hope this helps you move forward.
>>>>
>>>> Regards,
>>>>
>>>> Vaclav

--
Sven Haiges
sven.haiges@...

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com
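Editorial sketch: the experiment discussed in the quoted messages (look the resources up first on the calling thread, then run only the parsing in parallel) might look roughly like the following. This is not code from the thread. It assumes a plain java.util.concurrent thread pool instead of the Parallelizer, and `rawData`, `parseLines`, and `results` are hypothetical stand-ins for the Spring beans, the per-line parsing loop, and the servletContext attributes.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors

// Hypothetical stand-in for the per-site files. In the thread these lines would
// come from ctx.getBean("dtid_pv_map_${site}").file.readLines().
def rawData = [
    a: ['d1,10', 'd2,20'],
    b: ['d3,30'],
    c: ['d4,40', 'd5,50']
]

// Parse "dtid,pv" lines into a map of dtid -> Long (same logic as the original loop).
def parseLines = { List<String> lines ->
    def parsed = [:]
    lines.each { line ->
        def (dtid, pv) = line.split(',')
        parsed[dtid] = pv.toLong()
    }
    parsed
}

// Step 1: do every lookup sequentially on the calling thread, so that any
// lazy initialization behind the lookup runs exactly once and without contention.
def linesBySite = rawData.collectEntries { site, lines -> [(site): lines] }

// Step 2: run only the independent parsing work in parallel.
def results = new ConcurrentHashMap<String, Map>()   // stand-in for servletContext attributes
def pool = Executors.newFixedThreadPool(2)            // 2 threads, matching a dual-core machine
try {
    def tasks = linesBySite.collect { site, lines ->
        ({ results["dtid_pv_map_${site}".toString()] = parseLines(lines) } as Callable)
    }
    // invokeAll blocks until all tasks finish; get() rethrows any task failure.
    pool.invokeAll(tasks).each { it.get() }
} finally {
    pool.shutdown()
}

println results
```

Wrapping step 1 and step 2 in separate System.currentTimeMillis() measurements would show whether the lookup or the parsing dominates, which is the per-line measurement Vaclav suggests.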
|
|
Re: Gpars making my code slower?

Hi Sven,

sorry for having misled you. The eachParallel() method is a renamed
eachAsync() method and is only available in recent gpars builds. So
stick with eachAsync() for your code. The functionality is identical.

BTW, thank you for the gpars poll at
http://www.grailspodcast.com/blog/id/523
I am sure it will be pretty helpful.

Cheers,

Vaclav
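Editorial sketch: Vaclav's earlier point in this thread, that withParallelizer relies on the Groovy category mechanism and therefore only enhances the calling thread, can be illustrated without GPars at all. The example below uses a plain Groovy category; `ShoutCategory` and `shout()` are made-up names for illustration and have nothing to do with the gpars API.

```groovy
// A hypothetical category that adds a shout() method to String.
class ShoutCategory {
    static String shout(String self) { self.toUpperCase() + '!' }
}

def seenInCaller = null
def failureInOtherThread = null

use(ShoutCategory) {
    // The thread that entered the use block sees the category method.
    seenInCaller = 'hello'.shout()

    // A thread started inside the block does not inherit the category,
    // so the same call fails there with a MissingMethodException.
    def worker = Thread.start {
        try {
            'hello'.shout()
        } catch (MissingMethodException e) {
            failureInOtherThread = e
        }
    }
    worker.join()
}

println "caller thread: ${seenInCaller}"
println "other thread:  ${failureInOtherThread?.class?.simpleName}"
```

This is the same effect as the nested eachAsync() case described earlier: the inner call runs on a pool thread where the category was never installed.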
|
|
Re: Gpars making my code slower?

Thx. I must admit I have not had time so far, but I'll think about using
eachAsync in future. Yes, the poll might be interesting.

Cheers
Sven

On Mon, Oct 26, 2009 at 4:42 AM, Vaclav Pech <vaclav.pech@...> wrote:
> Hi Sven,
>
> sorry for having misguided you. The eachParallel() method is a renamed
> eachAsync() method and is only available in recent the gpars builds. So
> stick with eachAsync() for your code. The functionality is identical.
> BTW, thank you for the gpars poll at
> http://www.grailspodcast.com/blog/id/523 I am sure it will be pretty
> helpful.
>
> Cheers,
>
> Vaclav