8.4 PostGIS — A Working Introduction
Installing, configuring, and running PostGIS for real spatial analysis.
Key takeaways
- PostGIS turns PostgreSQL into an industrial-strength spatial database.
- Geometry vs geography types, SRIDs, and indexes are the foundations of effective use.
- Most spatial analyses you'll ever run can be expressed in PostGIS SQL.
Introduction
Module 8.2 explained what databases are and why PostGIS matters. This lesson gets practical — installing PostgreSQL + PostGIS, loading data, and running the queries that solve real problems.
Install
Docker (easiest)
1docker run --name gisdb \
2 -e POSTGRES_PASSWORD=gis123 \
3 -p 5432:5432 \
4 -d postgis/postgis:16-3.4macOS (Postgres.app)
Download Postgres.app — includes PostGIS. Run CREATE EXTENSION postgis; on a new database.
Linux
1sudo apt install postgresql postgresql-contrib postgis
2sudo -u postgres psql -c 'CREATE DATABASE gisdb;'
3sudo -u postgres psql -d gisdb -c 'CREATE EXTENSION postgis;'Cloud
Pick AWS RDS, Google Cloud SQL, or Supabase — all have PostGIS-enabled PostgreSQL.
Your first spatial table
1CREATE TABLE stations (
2 id SERIAL PRIMARY KEY,
3 name TEXT NOT NULL,
4 geom GEOMETRY(Point, 4326) NOT NULL
5);The GEOMETRY(Point, 4326) type constraint does two jobs: it restricts this column to 2D points in SRID 4326 (WGS84), and it tells PostGIS to validate inputs.
Loading data
From a shapefile
shp2pgsql -s 4326 parks.shp parks | psql -d gisdbOr with ogr2ogr:
1ogr2ogr -f PostgreSQL \
2 PG:"host=localhost user=postgres dbname=gisdb password=gis123" \
3 parks.shp -nln parks -lco GEOMETRY_NAME=geomFrom GeoPackage
ogr2ogr -f PostgreSQL PG:"..." data.gpkg -nln schema.table_nameFrom Python
1import geopandas as gpd
2from sqlalchemy import create_engineGeometry vs Geography
PostGIS has two types:
- Geometry — flat cartesian coordinates. Fast; assumes a projected CRS or that you'll manage curvature yourself.
- Geography — spherical coordinates; all operations use great-circle math on WGS84.
Guidelines:
- Local / city / country analysis in a projected CRS →
Geometry. - Global scale analysis in lat/lon →
Geography(or reproject data and useGeometry). - Mixed casting is OK:
ST_Distance(geom::geography, geom2::geography)for one-off metric distance.
Core spatial functions
| Function | What it does |
|---|---|
ST_Buffer(geom, d) | Buffer by distance d. |
ST_Union(geom) | Dissolve multiple geometries. |
ST_Intersects(a, b) | Any point in common. |
ST_Contains(a, b) | a contains b. |
ST_Within(a, b) | a within b. |
ST_DWithin(a, b, d) | a within distance d of b. |
ST_Distance(a, b) | Distance between a and b. |
ST_Intersection(a, b) | Intersection geometry. |
ST_Difference(a, b) | Parts of a not in b. |
ST_Centroid(geom) | Centroid. |
ST_Area(geom) / ST_Length(geom) | Measurements. |
ST_Transform(geom, srid) | Reproject. |
ST_Simplify(geom, tol) | Simplify. |
ST_MakeValid(geom) | Fix invalid geometries. |
Indexing
Always index your geometry columns:
CREATE INDEX idx_table_geom ON table USING GIST (geom);Run VACUUM ANALYZE table; after bulk loads to rebuild statistics.
For frequent attribute filters, add B-tree indexes on the columns.
Querying efficiently
Tips:
- Use
ST_DWithininstead ofST_Distance < d;ST_DWithinis indexable,ST_Distanceis not. - Use
<->(distance-ordered KNN) for nearest-neighbour ordering. - Pre-reproject data once; don't call
ST_Transforminside a loop. - Materialise intermediate results with CTEs or temp tables for repeated sub-queries.
Topology extension
CREATE EXTENSION postgis_topology; enables a topology engine — shared edges, validated polygons, topology-aware editing. Heavier than needed for most users but powerful when consistency matters.
Raster extension
CREATE EXTENSION postgis_raster; adds raster-in-database capabilities. Can store DEMs, imagery, and run ST_Value, ST_Clip, etc. Performance is acceptable for small rasters; for large imagery, prefer filesystem COGs with references in the database.
Useful patterns
Spatial join with aggregation
1SELECT c.code, c.name,
2 COUNT(p.id) AS n_permits,
3 SUM(p.value) AS total_value
4FROM counties c
5LEFT JOIN building_permits p ON ST_Within(p.geom, c.geom)
6GROUP BY c.code, c.name;Buffers and overlay
1WITH river_buffer AS (
2 SELECT ST_Buffer(geom::geography, 200)::geometry AS geom
3 FROM rivers
4)
5SELECT b.id
6FROM buildings b
7JOIN river_buffer r ON ST_Intersects(b.geom, r.geom);Nearest N neighbours
1SELECT p.id, h.name, ST_Distance(p.geom::geography, h.geom::geography) AS m
2FROM patients p
3CROSS JOIN LATERAL (
4 SELECT name, geom
5 FROM hospitals
6 ORDER BY p.geom <-> geom
7 LIMIT 3
8) h;Exporting
1ogr2ogr -f GPKG out.gpkg \
2 PG:"host=localhost dbname=gisdb user=postgres password=gis123" \
3 -sql "SELECT * FROM parks WHERE area_ha > 10"One command dumps a query result to a GeoPackage.
Self-check exercises
1. When should you use Geography vs Geometry?
Use Geography for global-scale queries in lat/lon where you need metre-accurate distances without reprojecting. Use Geometry for local or projected work where speed matters and you've picked an appropriate CRS. Mixed casting (geom::geography) is fine for one-off metric distance calculations on otherwise Geometry data.
2. Your query uses `WHERE ST_Distance(a.geom, b.geom) < 500`. Why's this slow?
ST_Distance computes distance for every row pair — the optimizer can't use a spatial index to prune candidates. Rewrite with WHERE ST_DWithin(a.geom, b.geom, 500), which is index-optimised: it uses the bounding-box spatial index to find candidates first, then does the precise distance check only on those.
3. After bulk-loading 10 M features, queries are sluggish. What two maintenance commands should you run?
CREATE INDEX on the geometry column (GiST) and any attribute columns you filter on, then VACUUM ANALYZE table to rebuild statistics so the planner chooses indexes correctly. These are frequently overlooked steps that give order-of-magnitude speedups.
Summary
- PostGIS: PostgreSQL + spatial types, functions, indexes.
- Geometry for local/projected work; Geography for global lat/lon.
- Always create GiST indexes; use ST_DWithin for distance filters.
- One SQL ecosystem covers most operations you'll meet in the rest of this course.
Further reading
- PostGIS Manual — authoritative reference.
- PostGIS in Action, 3rd edition.
- Boundless Open Geo Stack materials.
- postgis.net/workshops — free tutorials.