Module 3: Spatial Data Models

3.1 The Vector Data Model

Points, lines, polygons, and multi-geometries — how vector GIS represents the world.

Lesson 10 of 100·22 min read

Key takeaways

  • The vector data model represents features as points, lines, and polygons with associated attributes.
  • Geometries can be simple or composite (multi-geometries, collections).
  • Well-designed vector data makes topology, queries, and cartography straightforward.

Introduction

There are two main ways to represent the spatial world digitally: vector and raster. Vectors describe discrete features (buildings, roads, lakes) with mathematical geometry. Rasters describe continuous surfaces (elevation, temperature, imagery) with grids of cells. Most GIS analyses combine both, but each has its own rules, strengths, and pitfalls. This lesson is the vector deep dive.

The three atomic geometries

Every vector feature is built from three primitives:

Point

A location defined by an (x, y) pair (or (x, y, z) for 3D). Examples: a traffic light, a sample station, a delivery address.

JSON
{ "type": "Point", "coordinates": [-122.4194, 37.7749] }

LineString

An ordered sequence of points connected by straight segments. Examples: a road centreline, a river reach, a flight path.

JSON
{
  "type": "LineString",
  "coordinates": [[-122.42, 37.77], [-122.41, 37.78], [-122.40, 37.77]]
}

Polygon

An ordered, closed sequence of points enclosing an area — optionally with holes (interior rings). Examples: a building footprint, an administrative boundary, a lake.

JSON
{
  "type": "Polygon",
  "coordinates": [
    [[0,0], [10,0], [10,10], [0,10], [0,0]],
    [[3,3], [3,6], [6,6], [6,3], [3,3]]
  ]
}

The first ring is the exterior, wound counter-clockwise (CCW); the second is an interior hole, wound clockwise (CW). The first and last vertex of each ring must be identical, because rings are closed.
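
As a rough illustration of closure and holes, the sketch below checks that every ring is closed and computes a planar area as exterior minus holes. The helper names (`is_closed`, `ring_area`, `polygon_area`) are made up for this example, and the shoelace formula assumes planar coordinates, not longitude/latitude.

```python
def is_closed(ring):
    """True if the ring has at least 4 positions and first == last."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_area(ring):
    """Unsigned planar area of a closed ring (shoelace formula)."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice) / 2.0


def polygon_area(rings):
    """Area of the exterior ring minus the area of every hole."""
    if not all(is_closed(r) for r in rings):
        raise ValueError("every ring must be closed")
    exterior, *holes = rings
    return ring_area(exterior) - sum(ring_area(h) for h in holes)


polygon = {
    "type": "Polygon",
    "coordinates": [
        [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],  # exterior ring
        [[3, 3], [3, 6], [6, 6], [6, 3], [3, 3]],      # interior hole
    ],
}
area = polygon_area(polygon["coordinates"])
print(area)  # 91.0 (100 for the square minus 9 for the hole)
```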

Ring orientation

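RFC 7946 (GeoJSON) specifies the right-hand rule:

  • Exterior rings: counter-clockwise (CCW).
  • Interior rings (holes): clockwise (CW).

Other formats differ: shapefiles wind exterior rings clockwise, and most OGC Simple Features implementations (PostGIS, for example) accept either winding order.

Software that depends on winding order (for example, renderers that draw on a sphere, or code that relies on signed area) can read a mis-oriented ring as the complement of the intended polygon, turning it inside-out. RFC 7946 asks parsers not to reject mis-oriented polygons, so do not assume that incoming data is wound correctly.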
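
One way to test and fix winding order is the sign of the shoelace sum. The sketch below assumes planar coordinates with x increasing to the right and y increasing upward; `signed_area`, `is_ccw`, and `rewind` are illustrative names, not part of any library.

```python
def signed_area(ring):
    """Shoelace sum: positive for counter-clockwise rings, negative for clockwise."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return twice / 2.0


def is_ccw(ring):
    return signed_area(ring) > 0


def rewind(rings):
    """Return rings with the exterior CCW and every hole CW."""
    fixed = []
    for i, ring in enumerate(rings):
        want_ccw = (i == 0)  # the first ring is the exterior
        fixed.append(ring if is_ccw(ring) == want_ccw else ring[::-1])
    return fixed


exterior = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]
hole = [[3, 3], [3, 6], [6, 6], [6, 3], [3, 3]]
print(is_ccw(exterior), is_ccw(hole))  # True False
```

Reversing a ring keeps it closed, because the first and last vertex are identical.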

Multi-geometries

Some features consist of disjoint pieces. The Hawaiian Islands are geographically one entity but physically many polygons; the Nile–Blue Nile system forms one conceptual river with many branches.

  • MultiPoint — a set of points (e.g., a population of sampled trees).
  • MultiLineString — a set of LineStrings treated as one feature (e.g., a river and its branches).
  • MultiPolygon — a set of polygons (e.g., Indonesia, the Philippines).
  • GeometryCollection — a heterogeneous bag; use sparingly as many operations don't support it.

JSON
{
  "type": "MultiPolygon",
  "coordinates": [
    [[[0,0],[1,0],[1,1],[0,1],[0,0]]],
    [[[5,5],[6,5],[6,6],[5,6],[5,5]]]
  ]
}
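
A MultiPolygon simply adds one more level of nesting: a list of polygons, each of which is a list of rings. As a small sketch (the function names `explode` and `bbox` are chosen for this example), the parts can be split into individual Polygons or summarised by one bounding box:

```python
def explode(multipolygon):
    """Split a MultiPolygon into a list of single Polygon geometries."""
    return [
        {"type": "Polygon", "coordinates": rings}
        for rings in multipolygon["coordinates"]
    ]


def bbox(multipolygon):
    """Bounding box (min_x, min_y, max_x, max_y) over all parts."""
    xs, ys = [], []
    for rings in multipolygon["coordinates"]:
        for x, y in rings[0]:  # the exterior ring bounds the whole part
            xs.append(x)
            ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))


islands = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
        [[[5, 5], [6, 5], [6, 6], [5, 6], [5, 5]]],
    ],
}
parts = explode(islands)
print(len(parts), bbox(islands))  # 2 (0, 0, 6, 6)
```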

Features and attributes

A feature is a geometry + attributes. Attributes live in a table-like structure tied to the geometry:

parcel_id  owner     use          area_m2  geometry
42         J. Doe    Residential  620      POLYGON((...))
43         Acme Ltd  Commercial   2340     POLYGON((...))

A FeatureCollection is a set of features held in one container: a GeoJSON file stores one explicitly, and a shapefile layer or a PostGIS table plays the same role.
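
To make the geometry-plus-attributes pairing concrete, here is a minimal sketch that builds a GeoJSON FeatureCollection with the standard library and filters it by attribute. The helper `make_feature` and the parcel coordinates are invented for illustration; only the attribute values echo the parcel table.

```python
import json


def make_feature(geometry, **properties):
    """Pair a geometry with its attributes as a GeoJSON Feature."""
    return {"type": "Feature", "geometry": geometry, "properties": properties}


parcels = {
    "type": "FeatureCollection",
    "features": [
        make_feature(
            # Placeholder 20 x 31 rectangle (620 square units).
            {"type": "Polygon",
             "coordinates": [[[0, 0], [20, 0], [20, 31], [0, 31], [0, 0]]]},
            parcel_id=42, owner="J. Doe", use="Residential", area_m2=620,
        ),
        make_feature(
            # Placeholder 60 x 39 rectangle (2340 square units).
            {"type": "Polygon",
             "coordinates": [[[30, 0], [90, 0], [90, 39], [30, 39], [30, 0]]]},
            parcel_id=43, owner="Acme Ltd", use="Commercial", area_m2=2340,
        ),
    ],
}

# Attribute query: which parcels are residential?
residential = [
    f["properties"]["parcel_id"]
    for f in parcels["features"]
    if f["properties"]["use"] == "Residential"
]
print(residential)  # [42]

text = json.dumps(parcels)  # serialise to GeoJSON text
```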

Coordinate reference systems

Vectors are just numbers until you know which CRS they're in. Always store the CRS alongside the geometry. GeoJSON (RFC 7946) fixes the CRS to WGS84 with coordinates in longitude, latitude order; shapefiles carry the CRS in a .prj sidecar file; PostGIS stores an SRID with every geometry and can constrain it per column.
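
To see why the CRS matters, the sketch below converts one WGS84 longitude/latitude pair to spherical Web Mercator (EPSG:3857) using the standard closed-form formulas. The function name is made up for this example, and in real work a projection library such as pyproj would normally do this; the point is only that the same location yields very different numbers in different CRSs.

```python
import math

# Sphere radius used by Web Mercator (the WGS84 semi-major axis, in metres).
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Forward spherical Mercator projection; valid away from the poles."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# The Point from earlier in the lesson: degrees in, metres out.
x, y = lonlat_to_web_mercator(-122.4194, 37.7749)
print(round(x), round(y))
```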

More on CRSs in Module 4.

Validity

A polygon is valid when:

  • Every ring is closed (first == last vertex).
  • No ring self-intersects.
  • Inner rings lie inside the outer ring, touching it at most at single points.
  • Inner rings do not overlap or cross each other.

Invalid polygons cause unpredictable behaviour in spatial operations. Use ST_IsValid and ST_MakeValid (PostGIS) or shapely.validation.make_valid (Python) to check and repair.
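
To show what such a check involves, here is a deliberately simplified ring checker in plain Python. It is not how PostGIS or Shapely validate geometries: it only detects missing closure, too few positions, and segments that properly cross, and it ignores collinear overlaps and touching vertices. All names are illustrative.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0, or -1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def _properly_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 cross at a point interior to both."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)


def ring_problems(ring):
    """Return a list of human-readable problems found in one ring."""
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 positions")
    if ring[0] != ring[-1]:
        problems.append("not closed")
    segments = list(zip(ring, ring[1:]))
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if _properly_cross(*segments[i], *segments[j]):
                problems.append(f"segments {i} and {j} cross")
    return problems


square = [[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]
bowtie = [[0, 0], [2, 2], [2, 0], [0, 2], [0, 0]]  # self-intersects at (1, 1)
print(ring_problems(square))  # []
print(ring_problems(bowtie))  # ['segments 0 and 2 cross']
```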

Precision and simplification

A vertex can be stored as double-precision float (15–17 significant digits) or truncated to the precision your source supports (e.g., 1 cm for a survey-grade GNSS trace). Excess precision is dead weight — it inflates file sizes and can cause "near-equal" geometries to fail equality tests.
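
A small sketch of the equality problem: two coordinate lists that differ only by floating-point noise compare unequal until both are rounded to a stated precision. The helper `round_coords` is hypothetical; seven decimal places of a degree is roughly 1 cm at the equator.

```python
def round_coords(coords, ndigits):
    """Recursively round every number in a nested coordinate list."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


a = [[0.1 + 0.2, 37.7749000001]]  # carries floating-point noise
b = [[0.3, 37.7749]]

print(a == b)                                    # False
print(round_coords(a, 7) == round_coords(b, 7))  # True
```

Rounding is a blunt tool: two values that straddle a rounding boundary can still end up different, so tolerance-based comparison is often the safer test.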

Use Douglas–Peucker or Visvalingam to simplify geometries for coarser scales (see 2.2).
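
For intuition, here is a compact recursive sketch of Douglas–Peucker in plain Python. It assumes planar coordinates and a tolerance in the same units, and it measures distance to the chord segment, a common variant of the perpendicular distance to the infinite line. Production libraries offer tuned implementations.

```python
import math


def _point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify each half independently.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
print(douglas_peucker(line, 0.1))  # [(0, 0), (2, 0), (3, 3), (4, 0)]
```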

Serialisation formats

The same vector feature can be serialised in many formats. You'll meet them across the course:

Format      Flavour          Notes
WKT         Text             Well-Known Text; human-readable, ISO standard.
WKB         Binary           Well-Known Binary; compact binary counterpart of WKT, read and written by PostGIS (which also offers an extended variant, EWKB).
GeoJSON     Text / JSON      Web-friendly; WGS84 only under RFC 7946.
Shapefile   Binary           Legacy but ubiquitous.
GeoPackage  Binary (SQLite)  Modern replacement for shapefile.
FlatGeobuf  Binary           Cloud-optimised, streamable.
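
To illustrate that these formats wrap the same data model, the sketch below writes a GeoJSON-style geometry as WKT and encodes a Point as little-endian WKB using only the standard library. The function names are invented here, only three geometry types are handled, and real projects would normally rely on a library such as Shapely.

```python
import struct


def _pos(p):
    return " ".join(str(v) for v in p)


def _ring(coords):
    return "(" + ", ".join(_pos(p) for p in coords) + ")"


def to_wkt(geom):
    """Write a GeoJSON-style Point, LineString, or Polygon as WKT."""
    kind, coords = geom["type"], geom["coordinates"]
    if kind == "Point":
        return f"POINT ({_pos(coords)})"
    if kind == "LineString":
        return f"LINESTRING {_ring(coords)}"
    if kind == "Polygon":
        return "POLYGON (" + ", ".join(_ring(r) for r in coords) + ")"
    raise ValueError(f"unsupported geometry type: {kind}")


def point_to_wkb(x, y):
    """Little-endian WKB: byte-order flag 1, geometry type 1 (Point), two doubles."""
    return struct.pack("<BIdd", 1, 1, x, y)


point = {"type": "Point", "coordinates": [-122.4194, 37.7749]}
print(to_wkt(point))  # POINT (-122.4194 37.7749)

wkb = point_to_wkb(*point["coordinates"])
print(len(wkb), wkb[:5].hex())  # 21 0101000000
```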

Module 5 covers formats in detail.

Attributes and schema

The attribute schema is as important as the geometry. Consider a roads dataset with surface, speed_limit, oneway, lanes, maintenance_authority. Every analysis that uses roads depends on the schema — miss an attribute or confuse its type (string vs integer) and your analysis breaks.

Best practices:

  • Use explicit column types.
  • Document values and units (area_m2, not just area).
  • Avoid free-text where enumerated values exist.
  • Use a unique, stable identifier for every feature.
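
These practices can be sketched as a typed record for the roads example. Everything specific below is an assumption made for illustration: the `road_id` field, the enumerated surface values, the km/h unit, and the accepted spellings of `oneway`.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    # Hypothetical enumerated values; a real dataset defines its own.
    PAVED = "paved"
    GRAVEL = "gravel"
    DIRT = "dirt"


@dataclass(frozen=True)
class RoadAttributes:
    road_id: str            # unique, stable identifier
    surface: Surface        # enumerated, not free text
    speed_limit_kmh: int    # unit stated in the name
    oneway: bool
    lanes: int
    maintenance_authority: str


def parse_road(row):
    """Convert a raw attribute row (strings) into explicit types."""
    return RoadAttributes(
        road_id=str(row["road_id"]),
        surface=Surface(row["surface"]),  # raises ValueError on unknown values
        speed_limit_kmh=int(row["speed_limit"]),
        oneway=str(row["oneway"]).lower() in ("yes", "true", "1"),
        lanes=int(row["lanes"]),
        maintenance_authority=str(row["maintenance_authority"]),
    )


raw = {"road_id": "R-001", "surface": "paved", "speed_limit": "50",
       "oneway": "yes", "lanes": "2", "maintenance_authority": "City"}
road = parse_road(raw)
print(road.speed_limit_kmh + 10)  # 60: an integer, not the string "50"
```

Parsing at the boundary means a type confusion or an unexpected category fails loudly when the data is loaded, not halfway through an analysis.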

A tiny worked example

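Given a feature collection of parks and schools, compute each school's distance to the nearest park. A sketch with GeoPandas (the file names and the projected CRS are placeholders):

Python
import geopandas as gpd

# Read both layers and reproject to a metric CRS (EPSG:32610 is a placeholder).
schools = gpd.read_file("schools.geojson").to_crs(epsg=32610)
parks = gpd.read_file("parks.geojson").to_crs(epsg=32610)
# Attach the nearest park to each school; distances are in CRS units (metres here).
nearest = gpd.sjoin_nearest(schools, parks, distance_col="dist_to_park_m")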

Three lines do the work: read, reproject, nearest-join. The rest of the course is about the operations hidden behind that simplicity.

Self-check exercises

1. Why must the first and last vertex of a polygon ring be identical?

Polygons are defined as closed boundaries. An unclosed ring is ambiguous — software can't tell whether the missing segment is implicit or the geometry is broken. OGC and GeoJSON both mandate explicit closure.

2. What's the difference between a Polygon and a MultiPolygon?

A Polygon is a single area (possibly with holes). A MultiPolygon is a collection of one or more Polygons treated as a single feature (e.g., Indonesia). Many operations accept either; some (e.g., ST_Buffer on a degenerate input) treat them differently.

3. Your analysis fails because of invalid geometries. What two functions can you use to diagnose and fix them?

ST_IsValid (returns boolean + reason via ST_IsValidReason) tells you whether a geometry is valid. ST_MakeValid (or shapely.make_valid) repairs invalid geometries, typically splitting self-intersections or re-orienting rings. Always inspect the result; some "fixes" change the intended shape.

Summary

  • Vector data is points, lines, polygons + multi-variants + attributes.
  • Ring orientation, closure, and validity matter — invalid geometries break downstream ops.
  • Features always live in a CRS; store and communicate it.
  • Serialisation formats vary; the data model underneath is the same.

Further reading

  • OGC Simple Features Access specification.
  • GeoJSON RFC 7946.
  • PostGIS in Action, 3rd edition — thorough treatment of vector operations.
  • Shapely documentation (the de facto Python library for planar geometry, built on GEOS).