Gpars making my code slower?

Gpars making my code slower?

by Sven Haiges-3

Hi all,

catchy title, I know, and I also know the problem is most likely on my
side, but I need help understanding why. Here is a piece of code that
executes in the Grails Bootstrap, i.e. whenever the app starts up.
I need to load a couple of classpath resources (referenced as Spring
beans) and read the data into maps, which are then stored in the
ServletContext.

The code previously took 983 ms to load all the data. I then began to
use Parallelizer's eachAsync, hoping to speed things up, but instead it
takes anywhere from double the time to even more:

def sites = ['a', 'b', 'c'] // about 33 values here

def start = System.currentTimeMillis()
Parallelizer.withParallelizer(4) {
    sites.eachAsync { site ->
        // look up the classpath resource bean for this site
        def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
        def lines = dtid_imp_map.file.readLines()
        def themap = [:]
        // each line has the form "dtid,pv"
        lines.each { line ->
            def (dtid, pv) = line.split(',')
            themap[dtid] = pv.toLong()
        }
        servletContext.setAttribute("dtid_pv_map_${site}", themap)
    }
}
println("DTID_PV_MAP loading took ${System.currentTimeMillis() - start}")

I know that I am potentially accessing the servletContext from
multiple threads at the same time, but as the attribute name is never the
same, I hope there is no concurrency issue here... otherwise I
guess I'd have to wrap it in a synchronized block. The Parallelizer above
was first run with no special settings and took about twice as long (no
setting = number of threads = 2 on my MacBook Pro, 2 cores). With 4 threads
it takes about 2800 ms, which is about 2 seconds longer than running this
without any Parallelizer code.

So, leaving my concurrency troubles aside, why is the
Parallelizer code slower? Is it because the ramp-up time to get the threads
going takes too long? If that is the case, what is the typical time
needed to get the threads going? In my case that time is about 1000 ms,
which sounds far too high to me... I think creating threads
is cheaper :-)
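
One way to check the thread start-up cost in isolation is to time a bare thread pool, with neither GPars nor Spring involved. The sketch below is not from the original post: it assumes a plain JDK pool from java.util.concurrent.Executors as a stand-in for the Parallelizer pool, and the pool size of 4 simply mirrors the setting above. The measured value will differ from machine to machine.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Assumption: a plain JDK fixed pool stands in for the Parallelizer pool.
def poolSize = 4
def startNanos = System.nanoTime()

def pool = Executors.newFixedThreadPool(poolSize)
def threadNames = []
try {
    // Submit one trivial task per thread so the pool actually starts its workers.
    def futures = (1..poolSize).collect {
        pool.submit({ Thread.currentThread().name } as Callable)
    }
    // Wait for every task to finish before stopping the clock.
    threadNames = futures.collect { it.get() }
} finally {
    pool.shutdown()
}

def elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos)
println "Starting the pool and running ${threadNames.size()} trivial tasks took ${elapsedMs} ms"
```

If this number is small compared with the extra time observed, thread creation alone is unlikely to explain the slowdown.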

Cheers
Sven


--
Sven Haiges
sven.haiges@...

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: Gpars making my code slower?

by Alex Tkachman

Hi Sven,

Please ask this question on the GPars list. It is good to keep
knowledge in the right place :)

Alex

On Thu, Oct 22, 2009 at 1:14 AM, Sven Haiges <sven.haiges@...> wrote:

> [...]


Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

by Andreas Jöcker

Hi Guillaume,

is the picture on the 10th slide of this talk freely available somewhere?

It's hilarious :-)

Thanks
Andreas

--

Andreas Jöcker
GiS - Gesellschaft für integrierte Systemplanung mbH
Junkersstr. 2
69469 Weinheim

E-Mail   a.joecker@...
Phone   +49 6201 503-59
Fax     +49 6201 503-66

Gesellschaft für integrierte Systemplanung mbH
Managing Directors: Eckhard Haffmann, Alfred Gai, Bernd Heselmann
Registered office: Zeppelinstr. 11 - 91052 Erlangen
District Court Fürth/Bayern - Commercial Register No. HRB 3435




Netiquette -- Re: [groovy-user] Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

by Russel Winder-4


I believe that Andreas replied to Alex Tkachman's "Re: [groovy-user]
Gpars making my code slower?" email and changed the subject to create a
new email that had nothing to do with the thread replied to, i.e. a new
thread was created by inappropriately usurping an old thread.  Instead, a
new email message should have been created so as to start an actual new
thread.

Usurping threads like this violates the netiquette of email threads:
doing this destroys all the thread structure that those of us using
email clients with thread structuring use to great advantage -- as long
as people follow the (very simple) rules.

Sorry to single Andreas out on this; others do this as well.
It is very, very annoying whenever this happens, and this morning I
thought I needed to say something about it.


On Thu, 2009-10-22 at 08:41 +0200, Andreas Jöcker wrote:
> Hi Guillaume,
>
> is the picture on your 10th slide of this talk somewhere freely available ?
>
> its hilarious :-)
>
> Thanks
> Andreas
>
--
Russel.
=============================================================================
Dr Russel Winder      Partner
                                            xmpp: russel@...
Concertant LLP        t: +44 20 7585 2200, +44 20 7193 9203
41 Buckmaster Road,   f: +44 8700 516 084   voip: sip:russel.winder@...
London SW11 1EN, UK   m: +44 7770 465 077   skype: russel_winder



Re: Netiquette -- Re: [groovy-user] Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

by Andreas Jöcker

Whoa, sorry for disturbing your netiquette.

I didn't know that there is something like "thread structure in emails",
so I couldn't follow "this simple rule".

And sorry if I disturbed any structure again with this email.

Cheers
the violator

PS: next time I'll address Guillaume individually for something like that...
sorry, list

On 22.10.2009 09:45, Russel Winder wrote:

> [...]



Re: Designing Your Own Domain-Specific Language in Groovy by Guillaume Laforge at SpringOne/2GX 2009

by Guillaume Laforge-2

Yeah, I think so.
I think the link may be on the last slides, where I've put all the
links to the pictures I've included in the presentation.
It's a pretty well known comic :-)

On Thu, Oct 22, 2009 at 08:41, Andreas Jöcker
<a.joecker@...> wrote:

> [...]



--
Guillaume Laforge
Groovy Project Manager
Head of Groovy Development at SpringSource
http://www.springsource.com/g2one



Re: Gpars making my code slower?

by Vaclav Pech

Hi Sven,

I'm so glad you jumped into GPars, welcome on board.
I measured the overhead on my somewhat oldish dual-core machine to give
you some rough estimates. Bootstrapping the thread pool takes about 30 ms,
installing a category (to enable the xxxAsync() methods) takes roughly
190 ms, and calling eachAsync() itself takes about 60 ms longer than
calling ordinary each(). Summed up, the direct overhead imposed
by using the pool in your particular case could be around 280 ms. This
makes me believe you might be able to get to about 700 ms for the
parallel variant of your code on a dual core. Not a big win compared to
the 1000 ms sequential version, in my opinion, but still much better
than the numbers you're currently getting.
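
The arithmetic behind this estimate can be written down as a small sketch. The three overhead figures are the ones quoted above. The 983 ms sequential time comes from the original post, and perfect scaling across 2 cores is an assumption made only for illustration, so the result is an idealized figure rather than a measurement.

```groovy
// Overhead figures as quoted above, in milliseconds.
def overheadMs = [poolBootstrap: 30, categoryInstall: 190, eachAsyncCall: 60]
def totalOverheadMs = overheadMs.values().sum()

// Assumptions for illustration: the 983 ms sequential time from the
// original post, and perfect scaling of the work across 2 cores.
def sequentialMs = 983
def cores = 2
def idealParallelMs = sequentialMs / cores + totalOverheadMs

println "Direct overhead: ${totalOverheadMs} ms"
println "Idealized parallel estimate: ${idealParallelMs} ms"
```

Under these assumptions the estimate stays well below the sequential time, which is why the measured 2000 to 2800 ms points to something other than pool overhead.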

Since eachAsync behaves as expected and provides an adequate speed-up for
purely CPU-intensive, mutually independent calculations without any shared
state, the trouble probably lies in the code you're calling in parallel.
And since you actually see performance degrade with increasing
thread count, I'd suspect that one of the methods (maybe ctx.getBean())
misbehaves when called simultaneously from multiple threads. I could,
for example, think of a poorly done lazy initialization, which will be
repeated for each calling thread if the threads ask for a resource
at roughly the same time.
It is quite difficult to go any further without actually touching the
code and measuring each line individually. If you feel brave enough, you
may try to experiment with synchronizing access to ctx or
servletContext, or with preinitializing ctx before you call eachAsync.
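
The "preinitialize first, then parallelize" idea could be sketched roughly as follows. This is only an illustration under stated assumptions: fakeResources is a hypothetical in-memory stand-in for the Spring beans and their files, a plain JDK executor replaces the GPars pool so that the example runs without extra dependencies, and the results map stands in for the servletContext attributes.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for ctx.getBean(...).file: site -> CSV text.
def fakeResources = [ca: "d1,10\nd2,20", de: "d3,30", uk: "d4,40\nd5,50"]

// Step 1: resolve every resource sequentially on the calling thread,
// so that no lookup code runs concurrently.
def resolved = fakeResources.collectEntries { site, text -> [(site): text.readLines()] }

// Parses "dtid,pv" lines into a map of dtid -> pv (as Long).
def parse = { List<String> lines ->
    lines.collectEntries { line ->
        def (dtid, pv) = line.split(',')
        [(dtid): pv.toLong()]
    }
}

// Step 2: only the independent parsing work runs in parallel.
def pool = Executors.newFixedThreadPool(2)
def results = [:]
try {
    def futures = resolved.collectEntries { site, lines ->
        [(site): pool.submit({ parse(lines) } as Callable)]
    }
    // Step 3: publish the results from a single thread.
    futures.each { site, future ->
        results["dtid_pv_map_${site}".toString()] = future.get()
    }
} finally {
    pool.shutdown()
}

println results
```

Timing steps 1 and 2 separately should show whether the lookup or the parsing dominates.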

I hope this helps you move forward.

Regards,

Vaclav




Re: Gpars making my code slower?

by Sven Haiges-3

Hi Vaclav,

thanks for the info. It seems I cleared my inbox too radically; that's
why I am replying so late. I'll move future questions to the
GPars list, but I'm keeping this thread here for now.

The bean access that I wanted to do in parallel and then the
processing is for a classpath resource:

def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")

The site value ensures I do not access the same bean twice, so it
could only be some general Spring init code that is touched by
multiple threads and slows things down... interesting.

In case this is interesting, the beans themselves look like this:

dtid_pv_map_ca(org.springframework.core.io.ClassPathResource, 'dtid_pv_ca') { }

These are files of a couple kilobytes in size that I need to
preprocess and load in some way into memory during startup time.

I'll experiment a bit, like trying to access the beans first and then
eachAsync'ing over them instead of the site values.

Thanx
Sven

On Thu, Oct 22, 2009 at 8:12 AM, Vaclav Pech <vaclav.pech@...> wrote:

> [...]



Re: Gpars making my code slower?

by Sven Haiges-3

Hi Vaclav, all,

why is there no eachAsync on a Map? I am just running into this gotcha
again, I parsed the site->file mapping into a map and then tried to
.eachAsync over it, but it seems only Collections are supported. Is
there a specific reason this is not implemented on the Maps? Or does
it exist and I just did not see it?

Cheers
Sven

On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@...> wrote:

> [...]



Re: Gpars making my code slower?

by Vaclav Pech

Hi Sven,

the eachAsync() methods are supposed to work on all objects just like
each() does, including strings and maps. The code below works just fine:

Parallelizer.withParallelizer {
    // notice the method name change in recent GPars :)
    [a:1].eachParallel { println it.value }
}

Are you sure you invoke eachAsync() within the 'withParallelizer'
block? The withParallelizer block uses the Groovy category mechanism, so
it only enhances the calling thread.
Maybe you're trying to nest eachAsync()?

withParallelizer {
    images.eachAsync {
        // BANG! No eachAsync() here: this is a different thread, so you
        // have to enhance it too, perhaps using withExistingParallelizer(pool)
        it.eachAsync()
    }
}

I wonder whether this is the case?
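
The thread-bound nature of Groovy categories can be seen without GPars at all. The sketch below is only an illustration: ShoutCategory and its shout() method are hypothetical names invented for this example, and it assumes standard Groovy behaviour where a category installed with use() applies to the calling thread only.

```groovy
// Hypothetical category that adds shout() to String.
class ShoutCategory {
    static String shout(String self) { self.toUpperCase() + '!' }
}

def insideSameThread = null
def insideOtherThread = null

use(ShoutCategory) {
    // Same thread as the use() call: the category method is visible.
    insideSameThread = 'hello'.shout()

    // A new thread started inside the block is not enhanced.
    def worker = Thread.start {
        try {
            insideOtherThread = 'hello'.shout()
        } catch (MissingMethodException e) {
            insideOtherThread = 'MissingMethodException'
        }
    }
    worker.join()
}

println "calling thread: ${insideSameThread}"
println "other thread:   ${insideOtherThread}"
```

This is the same effect as calling eachAsync() from inside a closure that already runs on a pool thread.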

Cheers,

Vaclav



Sven Haiges wrote:

> Hi Vaclav, all,
>
> why is there no eachAsync on a Map? I am just running into this gotcha
> again, I parsed the site->file mapping into a map and then tried to
> .eachAsync over it, but it seems only Collections are supported. Is
> there a specific reason this is not implemented on the Maps? Or does
> it exist and I just did not see it?
>
> Cheers
> Sven
>
> On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@...> wrote:
>  
>> Hi Vaclav,
>>
>> thanx for the info. It seems I cleared my inbox too radically that's
>> what I am replying that late. I'll be moving future questions into the
>> gpars list but keeping this thread now.
>>
>> The bean access that I wanted to do in parallel and then the
>> processing is for a classpath resource:
>>
>> def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>
>> 'Site' will make sure I do not access the same bean twice, so there
>> could only be some general spring init code that is touched by
>> multiple threads and slows it down... interesting .
>>
>> In case this is interesting, the beans themselves look like this:
>>
>> dtid_pv_map_ca(org.springframework.core.io.ClassPathResource, 'dtid_pv_ca') { }
>>
>> These are files of a couple kilobytes in size that I need to
>> preprocess and load in some way into memory during startup time.
>>
>> I'll experiment a bit, like trying to access the beans first and then
>> eachAsync'ing over them instead of the site values.
>>
>> Thanx
>> Sven
>>
>> On Thu, Oct 22, 2009 at 8:12 AM, Vaclav Pech <vaclav.pech@...> wrote:
>>    
>>> Hi Sven,
>>>
>>> I'm so glad you jumped into GPars, welcome on-board.
>>> I measured the overhead on my a bit oldish dual-core to give you some rough
>>> estimates. Bootstrapping the thread pool takes about 30 ms, installing a
>>> category (to enable the xxxAsync() methods) takes roughly 190 ms and calling
>>> eachAsync() itself takes for about 60 ms longer than calling ordinary
>>> each(). All summed up means the direct overhead imposed by using the pool in
>>> your particular case could be around 280 ms. This makes me believe you might
>>> be able to get to about 700 ms for the parallel variant of your code on a
>>> dual core. Not a big win compared to the 1000 ms sequential version, in my
>>> opinion, but still much better than the numbers you're currently getting.
>>>
>>> Since eachAsync behaves as expected and provides adequate speed up for pure
>>> CPU-intensive mutually independent calculations without any shared state,
>>> the trouble probably lies in the code you're calling in parallel. And since
>>> you actually see performance degradation with increasing thread count, I'd
>>> suspect one of the methods (maybe ctx.getBean() ) misbehaves when called
>>> simultaneously from multiple threads. I could, for example, think of a
>>> poorly done lazy initialization, which will be repeated for each calling
>>> thread, if the threads ask for a resource roughly at the same time.
>>> It is quite difficult to go any further without actually touching the code
>>> and measuring each line individually. If you feel brave enough, you may try
>>> to experiment with synchronizing access to ctx or servletContext, or
>>> preinitializing ctx before you call eachAsync.
>>>
>>> I hope this helps you move forward.
>>>
>>> Regards,
>>>
>>> Vaclav
>>>
>>>
>>> Sven Haiges wrote:
>>>      
>>>> Hi all,
>>>>
>>>> catchy title, I know, I also know the problem most likely is on my
>>>> side but I need help to understand why.  Here is a piece of code,
>>>> executing in the Grails Bootstrap, so whenever the app is started up.
>>>> I need to load a couple of classpath resources (referenced as spring
>>>> beans) and load the data into maps which are then stored in the
>>>> servletcontext.
>>>>
>>>> The code took 983ms to load all the data previously. I then began to
>>>> use parallelizer.eachAsync, hoping to speed things up, but instead it
>>>> takes anywhere from double the time to even more:
>>>>
>>>> def sites = ['a', 'b', 'c'] // about 33 values here
>>>>
>>>>        def start = System.currentTimeMillis()
>>>>        Parallelizer.withParallelizer(4) {
>>>>            sites.eachAsync { site ->
>>>>                def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>>>                def lines = dtid_imp_map.file.readLines()
>>>>                def themap = [:]
>>>>                lines.each { line ->
>>>>                    def (dtid, pv) = line.split(',')
>>>>                    themap[dtid] = pv.toLong()
>>>>                }
>>>>                servletContext.setAttribute("dtid_pv_map_${site}", themap)
>>>>            }
>>>>        }
>>>>        println("DTID_PV_MAP loading took ${System.currentTimeMillis() - start}")
>>>>
>>>> I know that I am potentially accessing the servletContext from
>>>> multiple threads at the same time, but as the attribute name is never the
>>>> same, I hope there is no concurrency issue here... otherwise I
>>>> guess I'd have to synchronize {} it. The parallelizer above was first run with
>>>> no special settings and took about twice as long (no setting = no. of
>>>> threads = 2 on my MacBook Pro, 2 cores). With 4 it takes about 2800 ms,
>>>> which is about 2 secs longer than running this without any parallelizer
>>>> code.
>>>>
>>>> So still, leaving my concurrency troubles behind, why is the
>>>> Parallelizer code slower? Because the ramp-up time to get the threads
>>>> going takes too long? If that is the case, what is the typical time
>>>> needed to get the threads going? In my case that time is about 1000 ms,
>>>> which sounds far off by my understanding... I think creating threads
>>>> is cheaper :-)
>>>>
>>>> Cheers
>>>> Sven
>>>>
>>>>
>>>>
>>>>        
>>
>> --
>> Sven Haiges
>> sven.haiges@...
>>
>> Yahoo Messenger / Skype: hansamann
>> Personal Homepage, Wiki & Blog: http://www.svenhaiges.de
>>
>> Subscribe to the Grails Podcast:
>> http://www.grailspodcast.com
>>
>>    
>
>
>
>  


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: Gpars making my code slower?

by Sven Haiges-3

Hi Vaclav,

thanx for the info. I'll try that out and let you know. I was not using
eachParallel... maybe that's the reason then.

Cheers
Sven

On Thu, Oct 22, 2009 at 9:43 PM, Vaclav Pech <vaclav.pech@...> wrote:

> Hi Sven,
>
> the eachAsync() methods are supposed to work on all objects just like each()
> does, including strings or maps. The code below works just fine:
> Parallelizer.withParallelizer {
>   [a:1].eachParallel {println it.value}  //notice the method name change in recent gpars :)
> }
> Are you sure you invoke the eachAsync() within the 'withParallelizer' block?
> The withParallelizer block uses the Groovy category mechanism, so it only
> enhances the calling thread.
> Maybe you're trying to nest eachAsync()?
> withParallelizer {
>   images.eachAsync {
>       // BANG! No eachAsync() here, it is a different thread; you have to
>       // enhance it too, perhaps using withExistingParallelizer(pool)
>       it.eachAsync()
>   }
> }
>
> I wonder whether this is the case?
>
> Cheers,
>
> Vaclav
>
>
>
> Sven Haiges wrote:
>>
>> Hi Vaclav, all,
>>
>> why is there no eachAsync on a Map? I am just running into this gotcha
>> again, I parsed the site->file mapping into a map and then tried to
>> .eachAsync over it, but it seems only Collections are supported. Is
>> there a specific reason this is not implemented on the Maps? Or does
>> it exist and I just did not see it?
>>
>> Cheers
>> Sven
>>
>> On Thu, Oct 22, 2009 at 4:28 PM, Sven Haiges <sven.haiges@...>
>> wrote:
>>
>>>
>>> Hi Vaclav,
>>>
>>> thanx for the info. It seems I cleared my inbox too radically, which is
>>> why I am replying so late. I'll be moving future questions to the
>>> gpars list but am keeping this thread here for now.
>>>
>>> The bean access that I wanted to do in parallel and then the
>>> processing is for a classpath resource:
>>>
>>> def dtid_imp_map = ctx.getBean("dtid_pv_map_${site}")
>>>
>>> 'Site' will make sure I do not access the same bean twice, so there
>>> could only be some general spring init code that is touched by
>>> multiple threads and slows it down... interesting .
>>>
>>> In case this is interesting, the beans themselves look like this:
>>>
>>> dtid_pv_map_ca(org.springframework.core.io.ClassPathResource,
>>> 'dtid_pv_ca') { }
>>>
>>> These are files of a couple kilobytes in size that I need to
>>> preprocess and load in some way into memory during startup time.
>>>
>>> I'll experiment a bit, like trying to access the beans first and then
>>> eachAsync'ing over them instead of the site values.
>>>
>>> Thanx
>>> Sven
>>>



--
Sven Haiges
sven.haiges@...

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com
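
To make the "only enhances the calling thread" remark concrete: Groovy categories are registered per thread, so they behave much like a ThreadLocal. The sketch below is plain Java, not GPars. The class ThreadBoundDemo and the ENHANCED flag are made-up names that stand in for "the category is active on this thread". It only shows that state set on the calling thread is not visible to a pool worker, which is presumably why a nested eachAsync() needs its own enhancement.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadBoundDemo {

    // Hypothetical stand-in for a per-thread registration such as an active category.
    private static final ThreadLocal<Boolean> ENHANCED =
            ThreadLocal.withInitial(() -> false);

    // Returns { valueSeenOnCallingThread, valueSeenOnWorkerThread }.
    public static boolean[] probe() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ENHANCED.set(true); // "enhance" the calling thread only
        try {
            boolean onCaller = ENHANCED.get();
            Callable<Boolean> task = () -> ENHANCED.get(); // runs on the worker thread
            Future<Boolean> onWorker = pool.submit(task);
            return new boolean[] { onCaller, onWorker.get() };
        } finally {
            ENHANCED.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] seen = probe();
        System.out.println("caller=" + seen[0] + " worker=" + seen[1]);
        // prints "caller=true worker=false"
    }
}
```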




Re: Gpars making my code slower?

by Vaclav Pech

Hi Sven,

sorry for having misled you. The eachParallel() method is the renamed
eachAsync() method and is only available in recent gpars builds. So
stick with eachAsync() for your code. The functionality is identical.
BTW, thank you for the gpars poll at
http://www.grailspodcast.com/blog/id/523
I am sure it will be pretty helpful.

Cheers,

Vaclav
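
For reference, the idea raised earlier in the thread (fetch the beans first, then run only the independent work in parallel) could look roughly like the sketch below. It is plain Java with a JDK thread pool rather than GPars, and it is only a sketch under assumptions: PrefetchThenParse, parseAll and the sample data are made-up names, and the lines are assumed to have been read sequentially on the calling thread already, so the workers never touch ctx or servletContext.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PrefetchThenParse {

    // Parses "dtid,pv" lines into a map, mirroring the closure in the first post.
    static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> map = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            map.put(parts[0], Long.parseLong(parts[1].trim()));
        }
        return map;
    }

    // linesBySite is assumed to be filled on the calling thread beforehand
    // (the getBean/readLines step); only the parsing runs on the pool.
    public static Map<String, Map<String, Long>> parseAll(
            Map<String, List<String>> linesBySite, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Map<String, Long>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> e : linesBySite.entrySet()) {
                List<String> lines = e.getValue();
                Callable<Map<String, Long>> task = () -> parse(lines);
                pending.put(e.getKey(), pool.submit(task));
            }
            Map<String, Map<String, Long>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Map<String, Long>>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get()); // blocks until that task is done
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> linesBySite = new LinkedHashMap<>();
        linesBySite.put("a", Arrays.asList("d1,10", "d2,20")); // made-up sample data
        linesBySite.put("b", Arrays.asList("d3,30"));
        System.out.println(parseAll(linesBySite, 2));
    }
}
```

The caller would then store each resulting map (for example via servletContext.setAttribute) from its own thread. With files of only a few kilobytes, the pool overhead quoted in this thread may still outweigh the gain, so this is worth measuring rather than assuming.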







Re: Gpars making my code slower?

by Sven Haiges-3

Thx. I must admit I have not had time so far, but I'll think about using
eachAsync in future.

Yes, the poll might be interesting.

Cheers
Sven




--
Sven Haiges
sven.haiges@...

Yahoo Messenger / Skype: hansamann
Personal Homepage, Wiki & Blog: http://www.svenhaiges.de

Subscribe to the Grails Podcast:
http://www.grailspodcast.com
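
On the advice to measure each step individually: one rough way to do that is to wrap each step (bean lookup, file read, parsing, setAttribute) in a small timer and add up the time per step across all threads. The sketch below is plain Java, and PhaseTimer with its time() and totalMillis() methods is a made-up helper, not part of GPars or Grails. Note that the totals are summed over threads, so they can exceed the wall-clock time of the parallel run.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

public class PhaseTimer {

    // Accumulated nanoseconds per phase name; safe to update from several threads.
    private final Map<String, LongAdder> nanosByPhase = new ConcurrentHashMap<>();

    // Runs the given work, records how long it took under the phase name,
    // and passes the result through unchanged.
    public <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByPhase.computeIfAbsent(phase, k -> new LongAdder())
                        .add(System.nanoTime() - start);
        }
    }

    // Total time spent in a phase, in milliseconds (0 if the phase never ran).
    public long totalMillis(String phase) {
        LongAdder total = nanosByPhase.get(phase);
        return total == null ? 0L : total.sum() / 1_000_000L;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        String[] parts = timer.time("split", () -> "d1,10".split(","));
        Long pv = timer.time("toLong", () -> Long.parseLong(parts[1]));
        System.out.println("pv=" + pv
                + " split=" + timer.totalMillis("split") + " ms"
                + " toLong=" + timer.totalMillis("toLong") + " ms");
    }
}
```

If one phase's total grows sharply as the thread count goes up, that phase is the likely point of contention.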
