Multi Threading

Multi Threading

by Richard Ive

I've noticed that the generate_tiles.py script does not support multi threading.

Is there a way you can change the software to multi thread, so it can use 100% CPU instead of just the 25% mine is using?

--
Regards,
Richard Ive.

_______________________________________________
Mapnik-users mailing list
Mapnik-users@...
https://lists.berlios.de/mailman/listinfo/mapnik-users

Re: Multi Threading

by Warren Vick

Did anyone manage to answer Richard's question?

 

I'd also like to port generate_tiles.py to use multi-threading. Since tile generation is naturally parallel, it seems ideally suited, but the key question is: is Mapnik thread safe? That is, can I have one map object in Python and safely render from it in multiple threads?

 

Regards,

Warren Vick

 

From: mapnik-users-bounces@... [mailto:mapnik-users-bounces@...] On Behalf Of Richard Ive
Sent: 25 August 2009 10:39
To: mapnik-users@...
Subject: [Mapnik-users] Multi Threading

 


Re: Multi Threading

by Warren Vick

I've just discovered that Python does not let multiple threads execute at the same time, because its runtime's thread support is built around a central lock (the "Global Interpreter Lock"). So I guess the only way to utilise multiple processors/cores is to run separate processes? Or do all Python programs running on a system share the same runtime?

 

Regards,

Warren

 

From: mapnik-users-bounces@... [mailto:mapnik-users-bounces@...] On Behalf Of Warren Vick
Sent: 06 September 2009 08:53
To: mapnik-users@...
Subject: Re: [Mapnik-users] Multi Threading

 


Re: Multi Threading

by Jon Burgess-2

On Sun, 2009-09-06 at 09:28 +0100, Warren Vick wrote:
> I've just discovered that Python is a language which does not let
> multiple threads execute at the same time, because the run parallel
> runtime support is based on a central lock (the "Global Interpreter
> Lock"). So, I guess the only way to utilise multiple processors/cores
> is to run separate processes? Or, do all Python programs running on a
> system share the same runtime?

Mapnik can render lots of tiles in parallel via the python bindings. The
GIL is released during the main rendering call so that multiple threads
can run in parallel. I use the feature in the python mod_tile
renderd.py.

The generate_tiles.py has not been updated because it was originally
intended as a simple example program, not a serious tool for rendering
millions of tiles. I'll take a look at adding in some multithreading
into generate_tiles.py.

        Jon
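
A minimal sketch of the pattern described above, assuming Python 3: several worker threads pull tile coordinates from a shared queue. The render_tile function is a hypothetical stand-in and not Mapnik's API; in a real script it would wrap the Mapnik rendering call, which (per the message above) releases the GIL so the threads can overlap.

```python
import queue
import threading


def render_tile(z, x, y):
    """Hypothetical stand-in for the real Mapnik rendering call."""
    return "%d/%d/%d.png" % (z, x, y)


def worker(jobs, results):
    # Each thread pulls tile coordinates until it sees the None sentinel.
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(render_tile(*job))


def render_all(tiles, num_threads=4):
    jobs = queue.Queue()
    results = []  # list.append is safe to call from several threads in CPython
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for tile in tiles:
        jobs.put(tile)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so every thread exits
    for t in threads:
        t.join()
    return results


# All 16 tiles of zoom level 2 in the usual z/x/y scheme.
tiles = [(2, x, y) for x in range(4) for y in range(4)]
rendered = render_all(tiles)
```

With a pure-Python stand-in like this one the threads do not actually run in parallel; any speed-up depends on the render call releasing the GIL.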



Re: Multi Threading

by Warren Vick

Hi Jon,

Thanks for the reply. From what I read about the GIL, I thought it was something inherent in Python and not controllable. Are you sure that releasing the GIL during the Mapnik render is not just declaring to the Python runtime that a switch of thread execution can take place here? i.e. a thread safe area.

Here's a Wiki entry on GIL:
http://en.wikipedia.org/wiki/Global_Interpreter_Lock

The bit I noted was " Applications written in languages with a GIL have to use separate processes (i.e. interpreters) to achieve full concurrency, as each interpreter has its own GIL.". Doesn't this suggest that to use multiple cores/processors for tiling, separate processes are the way to go rather than multiple threads? Threading looks pretty simple in Python so I'll look at doing some performance tests myself.

Regards,
Warren

-----Original Message-----
From: Jon Burgess [mailto:jburgess777@...]
Sent: 06 September 2009 10:46
To: Warren Vick
Cc: mapnik-users@...
Subject: Re: [Mapnik-users] Multi Threading


Re: Multi Threading

by Jon Burgess-2

On Sun, 2009-09-06 at 11:06 +0100, Warren Vick wrote:
> Hi Jon,
>
> Thanks for the reply. From what I read about the GIL, I thought it was something inherent in Python and not controllable. Are you sure that releasing the GIL during the Mapnik render is not just declaring to the Python runtime that a switch of thread execution can take place here? i.e. a thread safe area.
>
> Here's a Wiki entry on GIL:
> http://en.wikipedia.org/wiki/Global_Interpreter_Lock
>
> The bit I noted was " Applications written in languages with a GIL have to use separate processes (i.e. interpreters) to achieve full concurrency, as each interpreter has its own GIL.". Doesn't this suggest that to use multiple cores/processors for tiling, separate processes are the way to go rather than multiple threads? Threading looks pretty simple in Python so I'll look at doing some performance tests myself.

It is true that just releasing the GIL is insufficient, you also need to
write the program to use multiple threads to allow something else to
occur when the GIL has been released. This is why the current
generate_tiles.py code fails to take advantage of the mapnik threading.

It is also true that to achieve full concurrency it is easiest to use
multiple processes, since this avoids any problems with the GIL. This is
what we have suggested people should do in the past -- just launch
multiple copies of generate_tiles with a different bbox or zoom range in
each instance.

        Jon
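
One way to prepare the "different bbox per instance" approach, sketched in Python 3. The split_bbox helper is hypothetical, and how each strip is handed to a copy of generate_tiles depends on the version of the script you have, so that part is left out.

```python
def split_bbox(bbox, parts):
    """Split (min_lon, min_lat, max_lon, max_lat) into equal-width strips."""
    min_lon, min_lat, max_lon, max_lat = bbox
    width = (max_lon - min_lon) / parts
    return [(min_lon + i * width, min_lat, min_lon + (i + 1) * width, max_lat)
            for i in range(parts)]


# Four strips covering the whole world, one per process.
strips = split_bbox((-180.0, -90.0, 180.0, 90.0), 4)
for strip in strips:
    print(strip)
```

Tiles that straddle a strip boundary may be rendered by two instances, so splitting by zoom range instead can avoid duplicated work at the cost of less even load.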




Re: Multi Threading

by Jon Burgess-2


I have checked in a version of generate_tiles.py which uses multiple
rendering threads. On a 2 core machine this reduces the rendering time
for the "world 0 - 4" tiles from 35 to 23 seconds.

The new code also makes use of some of the more recent Mapnik features
like the buffer_size and png256 output format.

        Jon



Re: Multi Threading

by Warren Vick

Hi Jon,

Interesting... a drop from 35 to 23 seconds seems quite poor, so I wonder if there is no concurrency occurring at all, just perhaps a better "packing" of the execution. Alternatively, perhaps the cost of running the threads is quite high. With just two tile render jobs, I doubt there is an I/O bottleneck. My recent experience of running two separate processes (on a quad-core machine), each taking about an hour, is that the execution time is still about the same, i.e. almost perfect parallelism.

I'll report the results of my own thread vs. process tests in the next day or two.

Regards,
Warren
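
A rough way to read the 35 s versus 23 s figures, assuming Amdahl's law is a fair model and ignoring thread overhead (both are assumptions, not something measured in this thread). The variable names are illustrative only.

```python
t_single = 35.0   # seconds, single-threaded run reported above
t_threads = 23.0  # seconds, multi-threaded run on the 2-core machine
cores = 2

speedup = t_single / t_threads

# Amdahl's law: t_threads = t_single * ((1 - p) + p / cores),
# solved for the parallel fraction p.
p = (1 - t_threads / t_single) / (1 - 1.0 / cores)

print("speedup: %.2f, parallel fraction: %.2f" % (speedup, p))
```

Under those assumptions roughly two thirds of the run was parallelised, which would point to real concurrency alongside a sizeable serial part rather than no concurrency at all.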

-----Original Message-----
From: Jon Burgess [mailto:jburgess777@...]
Sent: 06 September 2009 12:54
To: Warren Vick
Cc: mapnik-users@...
Subject: Re: [Mapnik-users] Multi Threading


Re: Multi Threading

by Robert Coup

On Mon, Sep 7, 2009 at 2:10 AM, Warren Vick <wvick@...> wrote:

> Interesting... a drop from 35 to 23 seconds seems quite poor so I wonder if there is no concurrency occurring at all, just perhaps a better "packing" of the execution. Alternatively, perhaps the cost of running the threads is quite high. With just two tile render jobs, I doubt if there is an I/O bottleneck. I had a recent experience with running two separate processes (on a quad core machine) which take about an hour each, is that the execution time is still about the same. i.e. almost perfect parallelism.
>
> I'll report the results of my own thread vs. process tests in the next day or two.

You could also compare it to the Python multiprocessing module (native in Python 2.6+, or available from PyPI for 2.4/2.5). Its interface is virtually the same as the threading module's, but it uses separate processes.

Rob :)
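
A small sketch of that interface parity, assuming Python 3. Both threading.Thread and multiprocessing.Process accept target and args and provide start() and join(), so one helper can drive either. The render_tile function is a hypothetical stand-in that only writes an empty file where a real script would call Mapnik.

```python
import multiprocessing
import os
import tempfile
import threading


def render_tile(tile_dir, z, x, y):
    """Hypothetical stand-in for a Mapnik render: writes an empty tile file."""
    path = os.path.join(tile_dir, "%d_%d_%d.png" % (z, x, y))
    with open(path, "wb"):
        pass


def run_workers(worker_cls, tile_dir, tiles):
    # worker_cls may be threading.Thread or multiprocessing.Process.
    workers = [worker_cls(target=render_tile, args=(tile_dir,) + tile)
               for tile in tiles]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sorted(os.listdir(tile_dir))


tiles = [(1, x, y) for x in range(2) for y in range(2)]
threaded = run_workers(threading.Thread, tempfile.mkdtemp(), tiles)

if __name__ == "__main__":
    # Same call, different class. The guard matters on platforms where
    # child processes re-import this file instead of being forked.
    processed = run_workers(multiprocessing.Process, tempfile.mkdtemp(), tiles)
    print(processed == threaded)
```

Because each worker writes its tile to disk, no results have to be passed back between processes, which keeps the two variants identical apart from the class name.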



Re: Multi Threading

by Warren Vick

Thanks for the suggestions, Rob. This definitely sounds like it would be worth investigating. The only problem I can foresee is that the current Mapnik build for Windows doesn't work with the latest Python. At least it didn't last time I tried!

 

/W

 

From: Robert Coup [mailto:robert.coup@...]
Sent: 07 September 2009 00:26
To: Warren Vick
Cc: Jon Burgess; mapnik-users@...
Subject: Re: [Mapnik-users] Multi Threading

 


Re: Multi Threading

by Dane Springmeyer

Hello Warren,

Mapnik works fine with Python 2.6; we've just not packaged that for Windows yet. My thinking was that Python 2.5 was still more commonly used, but ideally we'd provide both.

Either way, the multiprocessing package has a Windows installer for Python 2.5:


Dane


On Sep 7, 2009, at 1:12 AM, Warren Vick wrote:

> Thanks for the suggestions, Rob. This definitely sounds like it would be worth investigating. The only problem I can foresee with this is that the current Mapnik build for Windows doesn't work with the latest Python. At least it didn't last time I tried!
>
> /W
 