18.1 Python for GIS — GeoPandas and Shapely
The de facto standard stack for scripting vector GIS in Python.
Key takeaways
- GeoPandas extends pandas with a geometry column, bringing GIS into the Python data stack.
- Shapely handles the underlying geometry operations; pyproj handles projections.
- A few tens of lines of Python can replicate most desktop GIS workflows.
Introduction
Python has become the lingua franca of data science; with GeoPandas and friends, it's also the most flexible GIS environment short of a full commercial stack. This lesson introduces the core libraries and shows the patterns you'll use daily.
The stack
- Shapely — 2D geometry operations (buffers, intersections, distances).
- Fiona / pyogrio — read and write vector files.
- GeoPandas — pandas DataFrame + Shapely geometry column + spatial operations.
- pyproj — coordinate transformations.
- Rtree — spatial indexing.
- Rasterio / Xarray — rasters (Module 18.2).
- Folium / contextily — quick visualisation.
Install:
pip install geopandas rasterio shapely pyproj contextilyA first GeoPandas session
1import geopandas as gpd
2[object Object]
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10Ten lines, most of what you need.
Geometry operations
1# Distance
2distance = gdf.geometry.distance(other_point)
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10Working with rasters — quick preview
1import rasterio
2from rasterio.plot import showModule 18.2 goes deeper on rasters.
Reading/writing formats
1gpd.read_file('data.shp')
2gpd.read_file('data.geojson')
3gpd.read_file('data.gpkg', layer='parks')
4gpd.read_parquet('data.parquet')pyogrio (a faster alternative to Fiona) ships with modern GeoPandas.
Coordinate transformations
1gdf_utm = gdf.to_crs(epsg=26910)
2gdf_geographic = gdf.to_crs('EPSG:4326')Spatial indexing
GeoPandas uses Rtree under the hood:
1gdf.sindex.query(geom, predicate='intersects')
2# Returns indices of features intersecting geom; fast on large dataPlotting and context
import contextily as ctxContextily adds base maps behind your plots — quick QA.
A full mini-workflow
1import geopandas as gpd
2[object Object]
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10[object Object]
11Twelve lines of code replace what used to be a multi-step GUI workflow.
Notebook workflow
Jupyter notebooks are the ideal medium:
- Inline map display.
- Reproducibility.
- Easy iteration.
- Share-friendly.
Use ipyleaflet or Folium for interactive maps inside the notebook.
Testing spatial code
Unit tests with pytest on simple polygons:
1from shapely.geometry import Polygon
2[object Object]
3Treating your spatial workflows as testable code eliminates most classes of silent bugs.
Self-check exercises
1. Why does GeoPandas depend on both Shapely and pyproj?
Shapely provides 2D geometric operations (intersections, buffers, etc.) on flat coordinates. pyproj handles CRS transformations — converting between geographic and projected systems. GeoPandas orchestrates both: it reprojects via pyproj when you call .to_crs(), then uses Shapely for all subsequent geometry math. The clean separation is a virtue.
2. Which operation would you NOT run in EPSG:4326?
Area / distance / buffer operations. In 4326 the unit is degrees, which are angular and latitude-dependent; results are meaningless (or in degrees squared for area). Always reproject to a projected metric CRS (UTM, state plane) before area / distance / buffer. Or use the geography type in PostGIS for geodesic calculations.
3. Your sjoin returns an unexpected number of rows. What should you check?
(1) Both layers in the same CRS. (2) how parameter — inner drops non-matches, left preserves them. (3) predicate — within vs intersects differ subtly (boundary cases). (4) Duplicates in the right table cause one-to-many joins. Print a few rows from each side and trace the expected match manually.
Summary
- GeoPandas + Shapely + pyproj form the Python GIS core.
- Read, filter, reproject, spatial join, dissolve — all one-liners.
- Notebooks + contextily for exploration; pytest for production code.
- Replaces most desktop GIS workflows with reproducible scripts.
Further reading
- GeoPandas documentation (geopandas.org).
- Shapely documentation (shapely.readthedocs.io).
- Geographic Data Science with Python (Sergio Rey et al.) — open textbook.
- PyData / FOSS4G conference talks on GeoPandas.