ST_ClusterWithinWin
What is ST_ClusterWithinWin?
ST_ClusterWithinWin is a PostGIS window function that returns a cluster id for each input geometry based on single-linkage clustering by distance. Any two geometries within the supplied distance are in the same cluster, and clusters merge transitively.
ST_ClusterWithinWin(geometry winset geom, float8 distance) OVER () → integer
It is the window-function companion to the aggregate ST_ClusterWithin, returning a per-row cluster id instead of collections.
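To make the transitive merging concrete, here is a minimal sketch using four inline points rather than a real table. The CTE name pts and the coordinates are made up for illustration, and the query assumes a PostGIS version that ships ST_ClusterWithinWin.

```sql
-- Four points on a line. With a threshold of 1.5, points 1 and 2 are linked,
-- points 2 and 3 are linked, so 1, 2 and 3 end up in one cluster even though
-- points 1 and 3 are 2 units apart. Point 4 is far from all of them.
WITH pts(id, geom) AS (
  VALUES
    (1, ST_Point(0, 0)),
    (2, ST_Point(1, 0)),
    (3, ST_Point(2, 0)),
    (4, ST_Point(10, 0))
)
SELECT id,
       ST_ClusterWithinWin(geom, 1.5) OVER () AS cluster_id
FROM pts
ORDER BY id;
```

Rows 1 to 3 should share one cluster id and row 4 should get a different one. The specific id values are not something to rely on.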
When would you use ST_ClusterWithinWin?
Use ST_ClusterWithinWin when you need each input row to carry a cluster id, for example for joins, GROUP BY aggregations, or filtering. It is preferred over the aggregate when you want the cluster id alongside other attributes on the same row.
SELECT id, name,
       ST_ClusterWithinWin(geom, 100) OVER () AS cluster_id
FROM stores;
FAQs
When should I use the window form vs the aggregate?
Use the window form (ST_ClusterWithinWin) when you need each input row labelled with a cluster id for further per-row operations. Use the aggregate (ST_ClusterWithin) when you want the clusters themselves returned as an array of GeometryCollections, one per cluster.
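The two forms can be compared side by side. This is a sketch against the stores table from the earlier example. The 100-unit threshold and the output column names are illustrative assumptions.

```sql
-- Window form: label every row, then aggregate per cluster as needed.
SELECT cluster_id,
       count(*)         AS store_count,
       ST_Collect(geom) AS cluster_geom
FROM (
  SELECT geom,
         ST_ClusterWithinWin(geom, 100) OVER () AS cluster_id
  FROM stores
) AS labelled
GROUP BY cluster_id;

-- Aggregate form: ST_ClusterWithin returns geometry[], so unnest it
-- to get one GeometryCollection per cluster.
SELECT unnest(ST_ClusterWithin(geom, 100)) AS cluster_geom
FROM stores;
```

The window form keeps access to every other column of the row, which is why it is the easier starting point when you need counts, names, or joins per cluster.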
What units is the distance in?
The distance is in the units of the input geometry's spatial reference system: typically metres (sometimes feet) for projected CRSs, and degrees for geographic ones. For metre-based thresholds on EPSG:4326 data, reproject to a suitable projected CRS first.
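One way to reproject inline is sketched below. It assumes geom is stored with SRID 4326 and that the data falls in UTM zone 33N (EPSG:32633). Both are assumptions for illustration, so substitute the projected CRS that fits your area.

```sql
-- Transform to a metre-based CRS so that 100 means 100 metres.
-- EPSG:32633 (UTM zone 33N) is an assumed choice, not a general default.
SELECT id, name,
       ST_ClusterWithinWin(ST_Transform(geom, 32633), 100) OVER () AS cluster_id
FROM stores;
```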
How is it different from ST_ClusterDBSCAN?
ST_ClusterWithinWin is equivalent to DBSCAN with minpoints = 1: every group of connected features, no matter how small, becomes a cluster, including isolated single features. With minpoints > 1, ST_ClusterDBSCAN leaves features in sparse areas unclustered as noise (a NULL cluster id); use it when you want to reject outliers.
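A quick way to see the difference is to run both functions over the same rows. This sketch reuses the stores table. The threshold of 100 and minpoints of 3 are arbitrary illustrative values.

```sql
-- linkage_id: every row gets a cluster id, even isolated ones.
-- dbscan_id:  rows in groups too sparse to meet minpoints come back as NULL.
SELECT id, name,
       ST_ClusterWithinWin(geom, 100) OVER ()                     AS linkage_id,
       ST_ClusterDBSCAN(geom, eps := 100, minpoints := 3) OVER () AS dbscan_id
FROM stores;
```

Filtering on dbscan_id IS NULL then lists the rows DBSCAN treats as noise.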
Does it handle very large datasets efficiently?
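Generally yes. PostGIS builds a spatial index over the geometries in each window partition and uses it to find neighbours within the distance, rather than comparing every pair. Each partition is clustered as a whole, so use PARTITION BY to scope clustering within regions or categories and keep each window small.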
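Partitioned clustering might look like the following sketch. The region column is hypothetical and stands in for whatever attribute you scope clusters by.

```sql
-- Cluster stores independently within each region (hypothetical column).
-- Geometries in different regions are never placed in the same cluster.
SELECT id, name, region,
       ST_ClusterWithinWin(geom, 100) OVER (PARTITION BY region) AS cluster_id
FROM stores;
```

Because clustering runs separately per partition, treat the pair (region, cluster_id) as the cluster key in downstream queries rather than cluster_id alone.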