Ways to store and query geographic shapes?

4

A lot of the data curiosities that I have with data and reporting run along the lines of “I have this set of geocodable data. I wonder how many data points fall in different geocodable boundaries (wards, police beats, congressional districts, census tracts, neighborhoods, community areas)?”

How have people stored geographic boundary data (like this GIS data from the City of Chicago) so that it can be queried to find out if arbitrary data points fall within the boundaries?

A quick look seemed like using spatial extensions for a database was the way to go with Postgres being somewhat further along than MySQL? What are your thoughts about this.

Tags: asked June 24, 2010

Leave a Reply

5 Answers

6

The conventional wisdom about PostgreSQL being better than MySQL is true. MySQL's spatial support is getting better, but PostGIS, the spatial extension for PostgreSQL, is just plain awesome for all kinds of spatial queries. As J. Heasly mentioned, GeoDjango provides a great API for doing spatial queries, but there are libraries for most languages that provide a wrapper to PostGIS that you can use without being inside a framework. If PostgreSQL isn't an option, SQLite has the SpatiaLite project, which is an excellent choice, too.

  1. I went with GeoDjango + PostGIS and it’s been a pretty good experience. Navigating working with different SRIDs has been a challenge, however, but I think I’m sorting it out.

Leave a Reply

305
3

Since version 1.3.4 MongoDB has geospatial indexing with support for near and bounds queries (circle and rectangle) like:

location = [50,50]
db.places.find( { loc : { $near : location } } )

box = [[40, 40], [60, 60]]
db.places.find({"loc" : {"$within" : {"$box" : box}}})

center = [50, 50]
radius = 10
db.places.find({"loc" : {"$within" : {"$center" : [center, radius]}}})

Leave a Reply

40
2

If you're married to having the data in a database (and not, say, in a desktop GIS app like ArcView or QGIS) postgis + postgresql is, as others have said, the way to go.

Check out the geodjango comparison of spatial function availability (and note footnote 7); it shows why postgis is great:

http://docs.djangoproject.com/en/dev/ref/contrib/gis/db-api/#compatibility-tables

Database backends that only support bounding box comparisons are basically useless for journalism (but, if you need to go that route here's a hack to get you there: http://www.chasedavis.com/2008/mar/17/point-poly-using-mysql-and-python/ ).

You'll also probably wanna read the postgis docs pretty closely (stuff on bostongis is pretty good too). For instance, the && operator only tests "whether the bounding box of one geometry intersects the bounding box of another." For the example you mention you probably wanna use st_contains. And, of course, if you want the output to end up on the web, you'll probably wanna look into geodjango. Google shp2pgsql, gist indexes, st_setsrid and, for python, psycopg2.

FWIW, you can do a heck of a lot with ArcView if you've got it. If you're trying to produce maps for print this is not such a bad way to go. Moreover, you can actually write python scripts to interact with ArcView directly, though the syntax for actually doing so is not intuitive.

  1. FWIW, QGIS also supports python scripting. I don’t know much more about it, as I’m quite happy to use GeoDjango and a python shell…

Leave a Reply

180
1

GeoDjango has a couple dozen ways to slice/dice the geospatial pie.

Leave a Reply

472
0

I'm sure there's about a hundred ways to do this, but I would use postgis and do something like this:

select point.id_field from point, polygon where point.the_geom && polygon.the_geom

the conventional wisdom is that postgres is ahead of mysql when it comes to spatial. I have no idea if that's true, but no reason to doubt it either.

Leave a Reply

40

Your Answer

Please login to post questions.