CoursesGIS Basics — A Complete Introduction8.6 Lab — PostGIS Queries End to End
Module 8: Attribute Data & DatabasesHands-on Lab

8.6 Lab — PostGIS Queries End to End

Spin up PostGIS in Docker, load real data, and answer analyst-style questions in SQL.

Lesson 43 of 100·55 min read

Key takeaways

  • You can stand up a working PostGIS environment in under ten minutes.
  • Even analyst-style questions become one-liners once the schema is in place.
  • The same SQL patterns scale from 10 features to 10 million with proper indexing.

Introduction

This lab puts Module 8's material into action. You'll run PostGIS locally, load OpenStreetMap data for a city, and answer five realistic analyst questions entirely in SQL.

Prerequisites

  • Docker installed (or a local PostGIS install).
  • ~10 GB free disk space.
  • psql command-line client (comes with PostgreSQL / Postgres.app / brew install libpq).
  • About an hour.

Step 1 — Spin up PostGIS

Shell
1docker run --name gisdb \
2  -e POSTGRES_PASSWORD=gis123 \
3  -p 5432:5432 \
4  -v gis_data:/var/lib/postgresql/data \
5  -d postgis/postgis:16-3.4
6[object Object]
7

Inside psql:

SQL
1CREATE DATABASE gisdb;
2\c gisdb
3CREATE EXTENSION postgis;
4CREATE EXTENSION postgis_topology;  -- optional

Step 2 — Load OSM data for a city

Download a small Geofabrik extract (your home city or a capital) from download.geofabrik.de. We'll use Copenhagen (denmark-latest.osm.pbf) as the example.

Shell
1# Convert and load with osm2pgsql (install it first)
2osm2pgsql -d gisdb -U postgres -H localhost \
3  --hstore --slim denmark-latest.osm.pbf

You now have tables planet_osm_point, planet_osm_line, planet_osm_polygon, planet_osm_roads.

Create indexes if they weren't auto-created:

SQL
1CREATE INDEX IF NOT EXISTS idx_osm_point_geom  ON planet_osm_point  USING GIST (way);
2CREATE INDEX IF NOT EXISTS idx_osm_line_geom   ON planet_osm_line   USING GIST (way);
3CREATE INDEX IF NOT EXISTS idx_osm_poly_geom   ON planet_osm_polygon USING GIST (way);

Step 3 — Your five analyst questions

Q1 — How many cafés are in the city?

SQL
1SELECT COUNT(*) AS n_cafes
2FROM planet_osm_point
3WHERE amenity = 'cafe';

Q2 — Which 5 neighbourhoods have the most cafés?

SQL
1SELECT p.name AS neighborhood,
2       COUNT(c.osm_id) AS n_cafes
3FROM planet_osm_polygon p
4LEFT JOIN planet_osm_point c
5  ON ST_Within(c.way, p.way)
6 AND c.amenity = 'cafe'
7WHERE p.boundary = 'administrative'
8  AND p.admin_level = '10'
9GROUP BY p.name
10ORDER BY n_cafes DESC
11LIMIT 5;

Q3 — How many bus stops are within 300 m of every school?

SQL
1SELECT s.osm_id,
2       s.name AS school,
3       COUNT(b.osm_id) AS bus_stops_nearby
4FROM planet_osm_point s
5LEFT JOIN planet_osm_point b
6  ON ST_DWithin(s.way, b.way, 300)
7 AND b.highway = 'bus_stop'
8WHERE s.amenity = 'school'
9GROUP BY s.osm_id, s.name
10ORDER BY bus_stops_nearby DESC;

Q4 — What fraction of major roads is within 100 m of a hospital?

SQL
1WITH hospital_buffer AS (
2  SELECT ST_Union(ST_Buffer(way, 100)) AS geom
3  FROM planet_osm_point
4  WHERE amenity = 'hospital'
5),
6major_roads AS (
7  SELECT way, ST_Length(way) AS len
8  FROM planet_osm_roads
9  WHERE highway IN ('motorway','trunk','primary','secondary')
10)
11SELECT
12  SUM(ST_Length(ST_Intersection(m.way, h.geom))) /
13  NULLIF(SUM(m.len), 0) AS fraction
14FROM major_roads m
15CROSS JOIN hospital_buffer h;

Q5 — The 3 nearest parks to each school

SQL
1SELECT s.osm_id AS school_id,
2       s.name   AS school_name,
3       p.osm_id AS park_id,
4       p.name   AS park_name,
5       ROUND(p.metres::numeric, 1) AS metres
6FROM planet_osm_point s
7CROSS JOIN LATERAL (
8  SELECT osm_id, name,
9         ST_Distance(s.way, pp.way) AS metres,
10         pp.way
11  FROM planet_osm_polygon pp
12  WHERE leisure = 'park'
13  ORDER BY s.way <-> pp.way
14  LIMIT 3
15) p
16WHERE s.amenity = 'school'
17ORDER BY s.osm_id, metres;

Step 4 — Export one result to GeoPackage

Shell
1ogr2ogr -f GPKG cafes_per_nbh.gpkg \
2  PG:"host=localhost dbname=gisdb user=postgres password=gis123" \
3  -sql "
4    SELECT p.name AS neighborhood,
5           COUNT(c.osm_id) AS n_cafes,
6           p.way AS geom
7    FROM planet_osm_polygon p
8    LEFT JOIN planet_osm_point c
9      ON ST_Within(c.way, p.way)
10     AND c.amenity = 'cafe'
11    WHERE p.boundary = 'administrative'
12      AND p.admin_level = '10'
13    GROUP BY p.osm_id, p.name, p.way
14  "

Open the GeoPackage in QGIS and style by n_cafes — instant choropleth. If you want a quick browser-based review, export the result to GeoJSON and upload it to Atlas; style by n_cafes, add a popup for the neighbourhood name and count, then share the map with a teammate.

Step 5 — Reflection

Record in your notes:

  • Count and query time for each of the 5 questions.
  • What did you change about query structure to get acceptable performance?
  • Which spatial indexes existed by default; which did you add?

Troubleshooting

  • osm2pgsql fails — lower the cache: --cache 1000 --number-processes 2. Or use smaller Geofabrik extracts.
  • Slow queries — check EXPLAIN ANALYZE. Missing index is the usual culprit.
  • CRS warningsplanet_osm_* tables use EPSG:3857 (Web Mercator) by default; cast to geography for distance: ST_Distance(way::geography, ...) or reproject before measurement.
  • permission denied — Docker volume permissions; try sudo or a named volume.

Self-check exercises

1. Q3 performs slowly. Before adding anything, what single query would tell you why?

EXPLAIN ANALYZE <query>; — it reveals whether the planner is using the GiST index. Look for "Index Scan" on the geometry predicate vs "Seq Scan". A seq scan suggests a missing index, a stale ANALYZE, or a non-index-eligible predicate.

2. Why wrap ST_Buffer in a CTE (or subquery) rather than nest it inline in the main predicate?

Nesting ST_Buffer inside a predicate like ST_Intersects(a.way, ST_Buffer(b.way, 100)) prevents the planner from using b's spatial index — it can't know the buffer's bounding box until it computes it. A CTE materialises the buffered geometry once, so its bounding box is available for index scans.

3. Your data is in EPSG:3857 Web Mercator. A distance measurement returns 500 somethings. What are they, and what should you do?

500 Web Mercator metres, which near the equator equals ~500 m but near 60° latitude equals ~250 true metres. Either reproject to a local projected CRS (UTM) or cast to ::geography for great-circle metres. For one-off queries, ST_Distance(way::geography, way2::geography) is the easiest fix.

Summary

  • PostGIS in Docker takes minutes to set up; it replaces much of a commercial GIS stack.
  • osm2pgsql + a Geofabrik extract gives you real, queryable spatial data.
  • The five analyst questions demonstrate patterns — points-in-polygon, proximity, nearest-k, areal overlap — that generalise to almost every real project.

Further reading

  • postgis.net/workshops/postgis-intro/ — the authoritative free tutorial.
  • osm2pgsql documentation.
  • EXPLAIN ANALYZE visualisers (depesz.com).
  • planet_osm_* schema reference.
Module test

Module 8: Attribute Data & Databases

Answer these quick multiple-choice questions to check your understanding before moving on.

1. What does an attribute table store?
2. Why are spatial indexes important?
3. A spatial join is used to do what?