|
View:
New views
9 Messages
—
Rating Filter:
Alert me
|
|
|
Combining / agglomerating / clustering numbers of adjacent polygonsDear list,
I have been using postgres/postgis for some years, and written a fair amount of spatial SQL and pgsql and R and Perl based queries. However, I've hit a brick wall in my thinking (November time change ;-) perhaps? ) and wondered if someone could suggest a method to help out. I have a spatial data set that varies in density across a region. I have constructed a grid (did it in qgis) and overlayed on the region, and then derived the density of items per grid area. Where the density is >= 3 per area, I retain the grid square in a new table. Then I want to combine the grid squars to form a larger region and (then incidentally to this question) find the centroid of that region where the grid areas touch. Singletons are rejected, and, if, say, I had 10000 original grid squares, I might end up with 1000 clusterings of squares unioned together of arbitrary eventual shapes. Touching here - adjacency - could mean colinear or diagonal with 2 minimum. The maximum # of adjacent grid squares could run to 1000s per unioned area. But, I'm having a hard time to think through how to write the query. There are no columns in common across the grid squares, otherwise it'd be an easy st_union where col_xyz = 'PQR' kind of select, but all I have is the spatial adjacency. I've looked through the archives for the past couple of days, and perhaps I'm just using the wrong search criteria but I haven't seen anything that helps. I can see how to do this programmatically by agglomerating clusters and deleting from the remaining set as I go (and if I did that I would probably export the data and do it externally because it's quite a large data set), but I was really wanting to write a query for it and let it chug. Any thoughts? Kind regards Derek Jones. _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: Combining / agglomerating / clustering numbers of adjacent polygons2009/11/4 Derek Jones <scunacc@...>:
> Dear list, > > I have been using postgres/postgis for some years, and written a fair amount > of spatial SQL and pgsql and R and Perl based queries. > > However, I've hit a brick wall in my thinking (November time change ;-) > perhaps? ) and wondered if someone could suggest a method to help out. > > I have a spatial data set that varies in density across a region. > > I have constructed a grid (did it in qgis) and overlayed on the region, and > then derived the density of items per grid area. > > Where the density is >= 3 per area, I retain the grid square in a new table. > > Then I want to combine the grid squars to form a larger region and (then > incidentally to this question) find the centroid of that region where the > grid areas touch. Singletons are rejected, and, if, say, I had 10000 > original grid squares, I might end up with 1000 clusterings of squares > unioned together of arbitrary eventual shapes. Sounds interesting. Are you combining the squares so that you get a roughly "rasterised" shape that represents the denser areas of your spatial data? I'd write a plpgsql function to do it. But rather than use postgis to find the adjacency (in this case because its a), rather use a naming convention for the squares that makes it simple to combine with adjacency based on maybe just an integer x,y coordinate system, where the top left square is 0,0 then next one to its right is 1,0 etc.Then use some algorithms from raster graphics to combine the "pixels" into "polygons". > Touching here - adjacency - could mean colinear or diagonal with 2 minimum. > The maximum # of adjacent grid squares could run to 1000s per unioned area. > > But, I'm having a hard time to think through how to write the query. > > There are no columns in common across the grid squares, otherwise it'd be an > easy st_union where col_xyz = 'PQR' kind of select, but all I have is the > spatial adjacency. > > I've looked through the archives for the past couple of days, and perhaps > I'm just using the wrong search criteria but I haven't seen anything that > helps. > > I can see how to do this programmatically by agglomerating clusters and > deleting from the remaining set as I go (and if I did that I would probably > export the data and do it externally because it's quite a large data set), > but I was really wanting to write a query for it and let it chug. > > Any thoughts? > > > Kind regards > > Derek Jones. > > > _______________________________________________ > postgis-users mailing list > postgis-users@... > http://postgis.refractions.net/mailman/listinfo/postgis-users > -- Brian Modra Land line: +27 23 5411 462 Mobile: +27 79 69 77 082 5 Jan Louw Str, Prince Albert, 6930 Postal: P.O. Box 2, Prince Albert 6930 South Africa http://www.zwartberg.com/ _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: Combining / agglomerating / clustering numbers ofadjacent polygonsDerek,
Which version of PostgreSQL are you using? Interestingly we are demonstrating some of these kinds of exercises in our upcoming book "PostGIS in Action" book well actually we are in the middle of writing those chapters now -- and cleaning up some of our examples - so they aren't quite ready to disseminate without a lot of explanation. But all these tricks require the new PostgreSQL 8.4 window function and/or recursive query (common table expression) functionality. Sound like the recursive query might be closer to what you want (where you keep on accumulating based on some condition and then start a new bucket). There are also several techniques for creating on the fly grids in PostGIS which would help out here. Hope that helps, Regina -----Original Message----- From: postgis-users-bounces@... [mailto:postgis-users-bounces@...] On Behalf Of Derek Jones Sent: Tuesday, November 03, 2009 5:27 PM To: postgis-users@... Subject: [postgis-users] Combining / agglomerating / clustering numbers ofadjacent polygons Dear list, I have been using postgres/postgis for some years, and written a fair amount of spatial SQL and pgsql and R and Perl based queries. However, I've hit a brick wall in my thinking (November time change ;-) perhaps? ) and wondered if someone could suggest a method to help out. I have a spatial data set that varies in density across a region. I have constructed a grid (did it in qgis) and overlayed on the region, and then derived the density of items per grid area. Where the density is >= 3 per area, I retain the grid square in a new table. Then I want to combine the grid squars to form a larger region and (then incidentally to this question) find the centroid of that region where the grid areas touch. Singletons are rejected, and, if, say, I had 10000 original grid squares, I might end up with 1000 clusterings of squares unioned together of arbitrary eventual shapes. Touching here - adjacency - could mean colinear or diagonal with 2 minimum. The maximum # of adjacent grid squares could run to 1000s per unioned area. But, I'm having a hard time to think through how to write the query. There are no columns in common across the grid squares, otherwise it'd be an easy st_union where col_xyz = 'PQR' kind of select, but all I have is the spatial adjacency. I've looked through the archives for the past couple of days, and perhaps I'm just using the wrong search criteria but I haven't seen anything that helps. I can see how to do this programmatically by agglomerating clusters and deleting from the remaining set as I go (and if I did that I would probably export the data and do it externally because it's quite a large data set), but I was really wanting to write a query for it and let it chug. Any thoughts? Kind regards Derek Jones. _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: Combining / agglomerating / clustering numbers ofadjacent polygonsDerek,
I work with a "use case" similar for Site Specific Farming. Right now I'm using statistical clustering without spatial relations. After the clustering i use spatial relationship just to adjust the result. I'm interested in using spatial factor in the clustering, I'm not experts in statistics but perhaps I can help you. Good Luck. Matías On Wed, Nov 4, 2009 at 5:02 AM, Paragon Corporation <lr@...> wrote: > Derek, > > Which version of PostgreSQL are you using? > > Interestingly we are demonstrating some of these kinds of exercises in our > upcoming book "PostGIS in Action" book well actually we are in the middle of > writing those chapters now -- and cleaning up some of our examples - so they > aren't quite ready to disseminate without a lot of explanation. But all > these tricks require the new PostgreSQL 8.4 window function and/or recursive > query (common table expression) functionality. > > Sound like the recursive query might be closer to what you want (where you > keep on accumulating based on some condition and then start a new bucket). > There are also several techniques for creating on the fly grids in PostGIS > which would help out here. > > Hope that helps, > Regina > > > -----Original Message----- > From: postgis-users-bounces@... > [mailto:postgis-users-bounces@...] On Behalf Of Derek > Jones > Sent: Tuesday, November 03, 2009 5:27 PM > To: postgis-users@... > Subject: [postgis-users] Combining / agglomerating / clustering numbers > ofadjacent polygons > > Dear list, > > I have been using postgres/postgis for some years, and written a fair amount > of spatial SQL and pgsql and R and Perl based queries. > > However, I've hit a brick wall in my thinking (November time change ;-) > perhaps? ) and wondered if someone could suggest a method to help out. > > I have a spatial data set that varies in density across a region. > > I have constructed a grid (did it in qgis) and overlayed on the region, and > then derived the density of items per grid area. > > Where the density is >= 3 per area, I retain the grid square in a new table. > > Then I want to combine the grid squars to form a larger region and (then > incidentally to this question) find the centroid of that region where the > grid areas touch. Singletons are rejected, and, if, say, I had 10000 > original grid squares, I might end up with 1000 clusterings of squares > unioned together of arbitrary eventual shapes. > > Touching here - adjacency - could mean colinear or diagonal with 2 minimum. > The maximum # of adjacent grid squares could run to 1000s per unioned area. > > But, I'm having a hard time to think through how to write the query. > > There are no columns in common across the grid squares, otherwise it'd be an > easy st_union where col_xyz = 'PQR' kind of select, but all I have is the > spatial adjacency. > > I've looked through the archives for the past couple of days, and perhaps > I'm just using the wrong search criteria but I haven't seen anything that > helps. > > I can see how to do this programmatically by agglomerating clusters and > deleting from the remaining set as I go (and if I did that I would probably > export the data and do it externally because it's quite a large data set), > but I was really wanting to write a query for it and let it chug. > > Any thoughts? > > > Kind regards > > Derek Jones. > > > _______________________________________________ > postgis-users mailing list > postgis-users@... > http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > _______________________________________________ > postgis-users mailing list > postgis-users@... > http://postgis.refractions.net/mailman/listinfo/postgis-users > -- Lic. Matías Massigoge Cel: 54-11-62047686 www.agerpro.com.ar _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: Combining / agglomerating / clustering numbers of adjacent polygonsDear Matias
Thanks for the thought. Any help is always appreciated. I actually am using this agglomeration (when I eventually derive it ;-) ) to seed kmedoids or kmean centroids for statistical clustering, (doing it in R via a variety of methods to see what works best). I already have a much larger set of points that need to be clustered around the set of centroids I generate from this grid agglomeration process. However doing this gridding and agglomeration process first gives me a twofold advantage, in that it not only gives me the centroid for the clustering of the "other" data set, but also allows me to identify the actual spatial areas in which some of my (statistical) cluster points (in the other data set) should fall to start with, so I will do at least one level of spatial intersection, then one level of statistical clustering on top. I am at this point favoring doing the grid agglomeration programmatically - perhaps I can get that done today. (Unless anyone comes up with an SQL solution before I finish coding it up ;-)) I'll let you folks know if I make the programmatic approach work but I'm still open to other input in the meantime. Thanks to all for suggestions thus far. Kind regards Derek. _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
SOLVED: Combining / agglomerating / clustering numbers of adjacent polygonsDear all,
I solved the problem with just a handful of lines of SQL in the end... :-P Kinda roundabout thinking but it worked. Here's the solution: grid_squares contains my reduced grid set from the complete grid overlay from a previous select - it's the density table. I also previously removed singletons from it with a check for any grid entries that had 0 touching entities. # Break out the multipolygon grids into polygons create table temp_e as select st_GeometryN ( the_geom, generate_series(1,st_NumGeometries(the_geom)) ) as split_geom from grid_squares; # Break out the polygons into linestrings and agglomerate into areas # Becomes one big area as a multipolygon so ... create table temp_f as select st_buildarea ( st_collect ( st_ExteriorRing(split_geom) ) ) as the_geom from temp_e; # ... afterwards, break out into polygons again # each of which is the clustered set of # polygons comprising the outer boundary agglomeration # of the original grid areas. create table agglom_areas as select st_GeometryN ( the_geom, generate_series(1,st_NumGeometries(the_geom)) ) as split_geom from temp_f; I can probably with a little work reduce the steps but I don't have a need to at this point. This does exactly what I wanted - hope it's useful perhaps to others in the same jam. Appreciate the help given in any event. Now I can use the generated areas to get me some centroids and then both intersect and then cluster the "other" data set I have which is 10s of 1000s of points and different to this one but related to it. Thank you again. Always useful to bounce these kinds of things around... Kind regards Derek Jones _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: SOLVED: Combining / agglomerating / clustering numbers of adjacent polygonsDerek, FWIW -- this is a bit faster and shorter to write than your last step. You probably won't notice the difference until you start dealing with huge geometry collections. --REVISED LAST STEP create table agglom_areas as SELECT (ST_Dump(the_geom)).geom As split_geom FROM temp_f; ---YOUR LAST STEP-- create table agglom_areas as select st_GeometryN ( the_geom, generate_series(1,st_NumGeometries(the_geom)) ) as split_geom from temp_f; Leo -----Original Message----- From: postgis-users-bounces@... [mailto:postgis-users-bounces@...] On Behalf Of Derek Jones Sent: Wednesday, November 04, 2009 10:26 PM Cc: PostGIS Users Discussion Subject: [postgis-users] SOLVED: Combining / agglomerating / clustering numbers of adjacent polygons Dear all, I solved the problem with just a handful of lines of SQL in the end... :-P Kinda roundabout thinking but it worked. Here's the solution: grid_squares contains my reduced grid set from the complete grid overlay from a previous select - it's the density table. I also previously removed singletons from it with a check for any grid entries that had 0 touching entities. # Break out the multipolygon grids into polygons create table temp_e as select st_GeometryN ( the_geom, generate_series(1,st_NumGeometries(the_geom)) ) as split_geom from grid_squares; # Break out the polygons into linestrings and agglomerate into areas # Becomes one big area as a multipolygon so ... create table temp_f as select st_buildarea ( st_collect ( st_ExteriorRing(split_geom) ) ) as the_geom from temp_e; # ... afterwards, break out into polygons again # each of which is the clustered set of # polygons comprising the outer boundary agglomeration # of the original grid areas. create table agglom_areas as select st_GeometryN ( the_geom, generate_series(1,st_NumGeometries(the_geom)) ) as split_geom from temp_f; I can probably with a little work reduce the steps but I don't have a need to at this point. This does exactly what I wanted - hope it's useful perhaps to others in the same jam. Appreciate the help given in any event. Now I can use the generated areas to get me some centroids and then both intersect and then cluster the "other" data set I have which is 10s of 1000s of points and different to this one but related to it. Thank you again. Always useful to bounce these kinds of things around... Kind regards Derek Jones _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users _______________________________________________ postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
Re: SOLVED: Combining / agglomerating / clustering numbers of adjacent polygonsDear Leo
Hey thanks! Since these and other steps can run for several tens of minutes or even hours because of the data densities, I didn't notice because I left them each cooking and did something else instead, but I appreciate any improvements here because I may well be running the same kind of query elsewhere also. Occam's Razor and all that ;-) Kind regards Derek. Paragon Corporation wrote: > > Derek, > > FWIW -- this is a bit faster and shorter to write than your last step. You > probably won't notice the difference until you start dealing with huge > geometry collections. > > --REVISED LAST STEP > create table > agglom_areas > as > SELECT (ST_Dump(the_geom)).geom As split_geom > FROM temp_f; > > > > ---YOUR LAST STEP-- > > create table > agglom_areas > as > select > st_GeometryN > ( > the_geom, > generate_series(1,st_NumGeometries(the_geom)) > ) as split_geom > from > temp_f; > > > Leo > -----Original Message----- > From: postgis-users-bounces@... > [mailto:postgis-users-bounces@...] On Behalf Of Derek > Jones > Sent: Wednesday, November 04, 2009 10:26 PM > Cc: PostGIS Users Discussion > Subject: [postgis-users] SOLVED: Combining / agglomerating / clustering > numbers of adjacent polygons > > Dear all, > > I solved the problem with just a handful of lines of SQL in the end... :-P > > Kinda roundabout thinking but it worked. Here's the solution: > > grid_squares contains my reduced grid set from the complete grid overlay > from a previous select - it's the density table. I also previously removed > singletons from it with a check for any grid entries that had 0 touching > entities. > > # Break out the multipolygon grids into polygons > > create table > temp_e > as > select > st_GeometryN > ( > the_geom, > generate_series(1,st_NumGeometries(the_geom)) > ) as split_geom > from > grid_squares; > > # Break out the polygons into linestrings and agglomerate into areas # > Becomes one big area as a multipolygon so ... > > create table > temp_f > as > select > st_buildarea > ( > st_collect > ( > st_ExteriorRing(split_geom) > ) > ) as the_geom > from > temp_e; > > # ... afterwards, break out into polygons again # each of which is the > clustered set of # polygons comprising the outer boundary agglomeration # of > the original grid areas. > > create table > agglom_areas > as > select > st_GeometryN > ( > the_geom, > generate_series(1,st_NumGeometries(the_geom)) > ) as split_geom > from > temp_f; > > > > I can probably with a little work reduce the steps but I don't have a need > to at this point. > > This does exactly what I wanted - hope it's useful perhaps to others in the > same jam. Appreciate the help given in any event. > > Now I can use the generated areas to get me some centroids and then both > intersect and then cluster the "other" data set I have which is 10s of 1000s > of points and different to this one but related to it. > > Thank you again. Always useful to bounce these kinds of things around... > > Kind regards > > Derek Jones > _______________________________________________ > postgis-users mailing list > postgis-users@... > http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > _______________________________________________ > postgis-users mailing list > postgis-users@... > http://postgis.refractions.net/mailman/listinfo/postgis-users > postgis-users mailing list postgis-users@... http://postgis.refractions.net/mailman/listinfo/postgis-users |
|
|
|
| Free embeddable forum powered by Nabble | Forum Help |