8.6 Lab — PostGIS Queries End to End
Spin up PostGIS in Docker, load real data, and answer analyst-style questions in SQL.
Key takeaways
- You can stand up a working PostGIS environment in under ten minutes.
- Even analyst-style questions become one-liners once the schema is in place.
- The same SQL patterns scale from 10 features to 10 million with proper indexing.
Introduction
This lab puts Module 8's material into action. You'll run PostGIS locally, load OpenStreetMap data for a city, and answer five realistic analyst questions entirely in SQL.
Prerequisites
- Docker installed (or a local PostGIS install).
- ~10 GB free disk space.
psqlcommand-line client (comes with PostgreSQL / Postgres.app /brew install libpq).- About an hour.
Step 1 — Spin up PostGIS
1docker run --name gisdb \
2 -e POSTGRES_PASSWORD=gis123 \
3 -p 5432:5432 \
4 -v gis_data:/var/lib/postgresql/data \
5 -d postgis/postgis:16-3.4
6[object Object]
7Inside psql:
1CREATE DATABASE gisdb;
2\c gisdb
3CREATE EXTENSION postgis;
4CREATE EXTENSION postgis_topology; -- optionalStep 2 — Load OSM data for a city
Download a small Geofabrik extract (your home city or a capital) from download.geofabrik.de. We'll use Copenhagen (denmark-latest.osm.pbf) as the example.
1# Convert and load with osm2pgsql (install it first)
2osm2pgsql -d gisdb -U postgres -H localhost \
3 --hstore --slim denmark-latest.osm.pbfYou now have tables planet_osm_point, planet_osm_line, planet_osm_polygon, planet_osm_roads.
Create indexes if they weren't auto-created:
1CREATE INDEX IF NOT EXISTS idx_osm_point_geom ON planet_osm_point USING GIST (way);
2CREATE INDEX IF NOT EXISTS idx_osm_line_geom ON planet_osm_line USING GIST (way);
3CREATE INDEX IF NOT EXISTS idx_osm_poly_geom ON planet_osm_polygon USING GIST (way);Step 3 — Your five analyst questions
Q1 — How many cafés are in the city?
1SELECT COUNT(*) AS n_cafes
2FROM planet_osm_point
3WHERE amenity = 'cafe';Q2 — Which 5 neighbourhoods have the most cafés?
1SELECT p.name AS neighborhood,
2 COUNT(c.osm_id) AS n_cafes
3FROM planet_osm_polygon p
4LEFT JOIN planet_osm_point c
5 ON ST_Within(c.way, p.way)
6 AND c.amenity = 'cafe'
7WHERE p.boundary = 'administrative'
8 AND p.admin_level = '10'
9GROUP BY p.name
10ORDER BY n_cafes DESC
11LIMIT 5;Q3 — How many bus stops are within 300 m of every school?
1SELECT s.osm_id,
2 s.name AS school,
3 COUNT(b.osm_id) AS bus_stops_nearby
4FROM planet_osm_point s
5LEFT JOIN planet_osm_point b
6 ON ST_DWithin(s.way, b.way, 300)
7 AND b.highway = 'bus_stop'
8WHERE s.amenity = 'school'
9GROUP BY s.osm_id, s.name
10ORDER BY bus_stops_nearby DESC;Q4 — What fraction of major roads is within 100 m of a hospital?
1WITH hospital_buffer AS (
2 SELECT ST_Union(ST_Buffer(way, 100)) AS geom
3 FROM planet_osm_point
4 WHERE amenity = 'hospital'
5),
6major_roads AS (
7 SELECT way, ST_Length(way) AS len
8 FROM planet_osm_roads
9 WHERE highway IN ('motorway','trunk','primary','secondary')
10)
11SELECT
12 SUM(ST_Length(ST_Intersection(m.way, h.geom))) /
13 NULLIF(SUM(m.len), 0) AS fraction
14FROM major_roads m
15CROSS JOIN hospital_buffer h;Q5 — The 3 nearest parks to each school
1SELECT s.osm_id AS school_id,
2 s.name AS school_name,
3 p.osm_id AS park_id,
4 p.name AS park_name,
5 ROUND(p.metres::numeric, 1) AS metres
6FROM planet_osm_point s
7CROSS JOIN LATERAL (
8 SELECT osm_id, name,
9 ST_Distance(s.way, pp.way) AS metres,
10 pp.way
11 FROM planet_osm_polygon pp
12 WHERE leisure = 'park'
13 ORDER BY s.way <-> pp.way
14 LIMIT 3
15) p
16WHERE s.amenity = 'school'
17ORDER BY s.osm_id, metres;Step 4 — Export one result to GeoPackage
1ogr2ogr -f GPKG cafes_per_nbh.gpkg \
2 PG:"host=localhost dbname=gisdb user=postgres password=gis123" \
3 -sql "
4 SELECT p.name AS neighborhood,
5 COUNT(c.osm_id) AS n_cafes,
6 p.way AS geom
7 FROM planet_osm_polygon p
8 LEFT JOIN planet_osm_point c
9 ON ST_Within(c.way, p.way)
10 AND c.amenity = 'cafe'
11 WHERE p.boundary = 'administrative'
12 AND p.admin_level = '10'
13 GROUP BY p.osm_id, p.name, p.way
14 "Open the GeoPackage in QGIS and style by n_cafes — instant choropleth. If you want a quick browser-based review, export the result to GeoJSON and upload it to Atlas; style by n_cafes, add a popup for the neighbourhood name and count, then share the map with a teammate.
Step 5 — Reflection
Record in your notes:
- Count and query time for each of the 5 questions.
- What did you change about query structure to get acceptable performance?
- Which spatial indexes existed by default; which did you add?
Troubleshooting
osm2pgsqlfails — lower the cache:--cache 1000 --number-processes 2. Or use smaller Geofabrik extracts.- Slow queries — check
EXPLAIN ANALYZE. Missing index is the usual culprit. - CRS warnings —
planet_osm_*tables use EPSG:3857 (Web Mercator) by default; cast to geography for distance:ST_Distance(way::geography, ...)or reproject before measurement. permission denied— Docker volume permissions; trysudoor a named volume.
Self-check exercises
1. Q3 performs slowly. Before adding anything, what single query would tell you why?
EXPLAIN ANALYZE <query>; — it reveals whether the planner is using the GiST index. Look for "Index Scan" on the geometry predicate vs "Seq Scan". A seq scan suggests a missing index, a stale ANALYZE, or a non-index-eligible predicate.
2. Why wrap ST_Buffer in a CTE (or subquery) rather than nest it inline in the main predicate?
Nesting ST_Buffer inside a predicate like ST_Intersects(a.way, ST_Buffer(b.way, 100)) prevents the planner from using b's spatial index — it can't know the buffer's bounding box until it computes it. A CTE materialises the buffered geometry once, so its bounding box is available for index scans.
3. Your data is in EPSG:3857 Web Mercator. A distance measurement returns 500 somethings. What are they, and what should you do?
500 Web Mercator metres, which near the equator equals ~500 m but near 60° latitude equals ~250 true metres. Either reproject to a local projected CRS (UTM) or cast to ::geography for great-circle metres. For one-off queries, ST_Distance(way::geography, way2::geography) is the easiest fix.
Summary
- PostGIS in Docker takes minutes to set up; it replaces much of a commercial GIS stack.
osm2pgsql+ a Geofabrik extract gives you real, queryable spatial data.- The five analyst questions demonstrate patterns — points-in-polygon, proximity, nearest-k, areal overlap — that generalise to almost every real project.
Further reading
- postgis.net/workshops/postgis-intro/ — the authoritative free tutorial.
- osm2pgsql documentation.
EXPLAIN ANALYZEvisualisers (depesz.com).planet_osm_*schema reference.
Module 8: Attribute Data & Databases
Answer these quick multiple-choice questions to check your understanding before moving on.