CoursesGIS Basics — A Complete Introduction8.4 PostGIS — A Working Introduction
Module 8: Attribute Data & Databases

8.4 PostGIS — A Working Introduction

Installing, configuring, and running PostGIS for real spatial analysis.

Lesson 41 of 100·22 min read

Key takeaways

  • PostGIS turns PostgreSQL into an industrial-strength spatial database.
  • Geometry vs geography types, SRIDs, and indexes are the foundations of effective use.
  • Most spatial analyses you'll ever run can be expressed in PostGIS SQL.

Introduction

Module 8.2 explained what databases are and why PostGIS matters. This lesson gets practical — installing PostgreSQL + PostGIS, loading data, and running the queries that solve real problems.

Install

Docker (easiest)

Shell
1docker run --name gisdb \
2  -e POSTGRES_PASSWORD=gis123 \
3  -p 5432:5432 \
4  -d postgis/postgis:16-3.4

macOS (Postgres.app)

Download Postgres.app — includes PostGIS. Run CREATE EXTENSION postgis; on a new database.

Linux

Shell
1sudo apt install postgresql postgresql-contrib postgis
2sudo -u postgres psql -c 'CREATE DATABASE gisdb;'
3sudo -u postgres psql -d gisdb -c 'CREATE EXTENSION postgis;'

Cloud

Pick AWS RDS, Google Cloud SQL, or Supabase — all have PostGIS-enabled PostgreSQL.

Your first spatial table

SQL
1CREATE TABLE stations (
2  id SERIAL PRIMARY KEY,
3  name TEXT NOT NULL,
4  geom GEOMETRY(Point, 4326) NOT NULL
5);

The GEOMETRY(Point, 4326) type constraint does two jobs: it restricts this column to 2D points in SRID 4326 (WGS84), and it tells PostGIS to validate inputs.

Loading data

From a shapefile

Shell
shp2pgsql -s 4326 parks.shp parks | psql -d gisdb

Or with ogr2ogr:

Shell
1ogr2ogr -f PostgreSQL \
2  PG:"host=localhost user=postgres dbname=gisdb password=gis123" \
3  parks.shp -nln parks -lco GEOMETRY_NAME=geom

From GeoPackage

Shell
ogr2ogr -f PostgreSQL PG:"..." data.gpkg -nln schema.table_name

From Python

Python
1import geopandas as gpd
2from sqlalchemy import create_engine

Geometry vs Geography

PostGIS has two types:

  • Geometry — flat cartesian coordinates. Fast; assumes a projected CRS or that you'll manage curvature yourself.
  • Geography — spherical coordinates; all operations use great-circle math on WGS84.

Guidelines:

  • Local / city / country analysis in a projected CRS → Geometry.
  • Global scale analysis in lat/lon → Geography (or reproject data and use Geometry).
  • Mixed casting is OK: ST_Distance(geom::geography, geom2::geography) for one-off metric distance.

Core spatial functions

FunctionWhat it does
ST_Buffer(geom, d)Buffer by distance d.
ST_Union(geom)Dissolve multiple geometries.
ST_Intersects(a, b)Any point in common.
ST_Contains(a, b)a contains b.
ST_Within(a, b)a within b.
ST_DWithin(a, b, d)a within distance d of b.
ST_Distance(a, b)Distance between a and b.
ST_Intersection(a, b)Intersection geometry.
ST_Difference(a, b)Parts of a not in b.
ST_Centroid(geom)Centroid.
ST_Area(geom) / ST_Length(geom)Measurements.
ST_Transform(geom, srid)Reproject.
ST_Simplify(geom, tol)Simplify.
ST_MakeValid(geom)Fix invalid geometries.

Indexing

Always index your geometry columns:

SQL
CREATE INDEX idx_table_geom ON table USING GIST (geom);

Run VACUUM ANALYZE table; after bulk loads to rebuild statistics.

For frequent attribute filters, add B-tree indexes on the columns.

Querying efficiently

Tips:

  • Use ST_DWithin instead of ST_Distance < d; ST_DWithin is indexable, ST_Distance is not.
  • Use <-> (distance-ordered KNN) for nearest-neighbour ordering.
  • Pre-reproject data once; don't call ST_Transform inside a loop.
  • Materialise intermediate results with CTEs or temp tables for repeated sub-queries.

Topology extension

CREATE EXTENSION postgis_topology; enables a topology engine — shared edges, validated polygons, topology-aware editing. Heavier than needed for most users but powerful when consistency matters.

Raster extension

CREATE EXTENSION postgis_raster; adds raster-in-database capabilities. Can store DEMs, imagery, and run ST_Value, ST_Clip, etc. Performance is acceptable for small rasters; for large imagery, prefer filesystem COGs with references in the database.

Useful patterns

Spatial join with aggregation

SQL
1SELECT c.code, c.name,
2       COUNT(p.id) AS n_permits,
3       SUM(p.value) AS total_value
4FROM counties c
5LEFT JOIN building_permits p ON ST_Within(p.geom, c.geom)
6GROUP BY c.code, c.name;

Buffers and overlay

SQL
1WITH river_buffer AS (
2  SELECT ST_Buffer(geom::geography, 200)::geometry AS geom
3  FROM rivers
4)
5SELECT b.id
6FROM buildings b
7JOIN river_buffer r ON ST_Intersects(b.geom, r.geom);

Nearest N neighbours

SQL
1SELECT p.id, h.name, ST_Distance(p.geom::geography, h.geom::geography) AS m
2FROM patients p
3CROSS JOIN LATERAL (
4  SELECT name, geom
5  FROM hospitals
6  ORDER BY p.geom <-> geom
7  LIMIT 3
8) h;

Exporting

Shell
1ogr2ogr -f GPKG out.gpkg \
2  PG:"host=localhost dbname=gisdb user=postgres password=gis123" \
3  -sql "SELECT * FROM parks WHERE area_ha > 10"

One command dumps a query result to a GeoPackage.

Self-check exercises

1. When should you use Geography vs Geometry?

Use Geography for global-scale queries in lat/lon where you need metre-accurate distances without reprojecting. Use Geometry for local or projected work where speed matters and you've picked an appropriate CRS. Mixed casting (geom::geography) is fine for one-off metric distance calculations on otherwise Geometry data.

2. Your query uses `WHERE ST_Distance(a.geom, b.geom) < 500`. Why's this slow?

ST_Distance computes distance for every row pair — the optimizer can't use a spatial index to prune candidates. Rewrite with WHERE ST_DWithin(a.geom, b.geom, 500), which is index-optimised: it uses the bounding-box spatial index to find candidates first, then does the precise distance check only on those.

3. After bulk-loading 10 M features, queries are sluggish. What two maintenance commands should you run?

CREATE INDEX on the geometry column (GiST) and any attribute columns you filter on, then VACUUM ANALYZE table to rebuild statistics so the planner chooses indexes correctly. These are frequently overlooked steps that give order-of-magnitude speedups.

Summary

  • PostGIS: PostgreSQL + spatial types, functions, indexes.
  • Geometry for local/projected work; Geography for global lat/lon.
  • Always create GiST indexes; use ST_DWithin for distance filters.
  • One SQL ecosystem covers most operations you'll meet in the rest of this course.

Further reading

  • PostGIS Manual — authoritative reference.
  • PostGIS in Action, 3rd edition.
  • Boundless Open Geo Stack materials.
  • postgis.net/workshops — free tutorials.