CoursesGIS Basics — A Complete Introduction18.1 Python for GIS — GeoPandas and Shapely
Module 18: Automation & Programming

18.1 Python for GIS — GeoPandas and Shapely

The de facto standard stack for scripting vector GIS in Python.

Lesson 87 of 100·20 min read

Key takeaways

  • GeoPandas extends pandas with a geometry column, bringing GIS into the Python data stack.
  • Shapely handles the underlying geometry operations; pyproj handles projections.
  • A few tens of lines of Python can replicate most desktop GIS workflows.

Introduction

Python has become the lingua franca of data science; with GeoPandas and friends, it's also the most flexible GIS environment short of a full commercial stack. This lesson introduces the core libraries and shows the patterns you'll use daily.

The stack

  • Shapely — 2D geometry operations (buffers, intersections, distances).
  • Fiona / pyogrio — read and write vector files.
  • GeoPandas — pandas DataFrame + Shapely geometry column + spatial operations.
  • pyproj — coordinate transformations.
  • Rtree — spatial indexing.
  • Rasterio / Xarray — rasters (Module 18.2).
  • Folium / contextily — quick visualisation.

Install:

Shell
pip install geopandas rasterio shapely pyproj contextily

A first GeoPandas session

Python
1import geopandas as gpd
2[object Object]
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10

Ten lines, most of what you need.

Geometry operations

Python
1# Distance
2distance = gdf.geometry.distance(other_point)
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10

Working with rasters — quick preview

Python
1import rasterio
2from rasterio.plot import show

Module 18.2 goes deeper on rasters.

Reading/writing formats

Python
1gpd.read_file('data.shp')
2gpd.read_file('data.geojson')
3gpd.read_file('data.gpkg', layer='parks')
4gpd.read_parquet('data.parquet')

pyogrio (a faster alternative to Fiona) ships with modern GeoPandas.

Coordinate transformations

Python
1gdf_utm = gdf.to_crs(epsg=26910)
2gdf_geographic = gdf.to_crs('EPSG:4326')

Spatial indexing

GeoPandas uses Rtree under the hood:

Python
1gdf.sindex.query(geom, predicate='intersects')
2# Returns indices of features intersecting geom; fast on large data

Plotting and context

Python
import contextily as ctx

Contextily adds base maps behind your plots — quick QA.

A full mini-workflow

Python
1import geopandas as gpd
2[object Object]
3[object Object]
4[object Object]
5[object Object]
6[object Object]
7[object Object]
8[object Object]
9[object Object]
10[object Object]
11

Twelve lines of code replace what used to be a multi-step GUI workflow.

Notebook workflow

Jupyter notebooks are the ideal medium:

  • Inline map display.
  • Reproducibility.
  • Easy iteration.
  • Share-friendly.

Use ipyleaflet or Folium for interactive maps inside the notebook.

Testing spatial code

Unit tests with pytest on simple polygons:

Python
1from shapely.geometry import Polygon
2[object Object]
3

Treating your spatial workflows as testable code eliminates most classes of silent bugs.

Self-check exercises

1. Why does GeoPandas depend on both Shapely and pyproj?

Shapely provides 2D geometric operations (intersections, buffers, etc.) on flat coordinates. pyproj handles CRS transformations — converting between geographic and projected systems. GeoPandas orchestrates both: it reprojects via pyproj when you call .to_crs(), then uses Shapely for all subsequent geometry math. The clean separation is a virtue.

2. Which operation would you NOT run in EPSG:4326?

Area / distance / buffer operations. In 4326 the unit is degrees, which are angular and latitude-dependent; results are meaningless (or in degrees squared for area). Always reproject to a projected metric CRS (UTM, state plane) before area / distance / buffer. Or use the geography type in PostGIS for geodesic calculations.

3. Your sjoin returns an unexpected number of rows. What should you check?

(1) Both layers in the same CRS. (2) how parameter — inner drops non-matches, left preserves them. (3) predicatewithin vs intersects differ subtly (boundary cases). (4) Duplicates in the right table cause one-to-many joins. Print a few rows from each side and trace the expected match manually.

Summary

  • GeoPandas + Shapely + pyproj form the Python GIS core.
  • Read, filter, reproject, spatial join, dissolve — all one-liners.
  • Notebooks + contextily for exploration; pytest for production code.
  • Replaces most desktop GIS workflows with reproducible scripts.

Further reading

  • GeoPandas documentation (geopandas.org).
  • Shapely documentation (shapely.readthedocs.io).
  • Geographic Data Science with Python (Sergio Rey et al.) — open textbook.
  • PyData / FOSS4G conference talks on GeoPandas.