GDAL Vector Processing

ogrmerge

What is ogrmerge?

ogrmerge.py combines several vector datasets into one output. By default it copies each input layer to a separate output layer; with -single it concatenates all features into a single layer, optionally recording the source of each feature in an extra field. It complements ogr2ogr -append and is more convenient when you have many inputs and want a single invocation.

Shell
ogrmerge.py -o <output_datasource> [options] <src_1> [<src_2>...]

Commonly used options:

  • -o <output> — output datasource
  • -f <format> — output OGR driver (GPKG, ESRI Shapefile, …)
  • -single — merge all inputs into one layer
  • -nln <template> — output layer name; may include {DS_NAME}, {LAYER_NAME}, {DS_BASENAME} placeholders
  • -src_layer_field_name <name> — add a field holding the source layer name
  • -src_layer_field_content <template> — content for that field
  • -s_srs <CRS> / -t_srs <CRS> / -a_srs <CRS> — source/target/assign CRS
  • -update — open existing output datasource for write
  • -overwrite_ds — replace output datasource
  • -overwrite_layer / -append — layer-level behaviour
  • -field_strategy <Union|Intersection|FirstLayer> — schema reconciliation

When would you use ogrmerge?

Use ogrmerge.py to consolidate many similar vector deliveries into one dataset. Typical jobs: merging per-state Shapefiles into one national GeoPackage with a source column so you can still trace each feature back (ogrmerge.py -o usa.gpkg -f GPKG -single -src_layer_field_name source states/*.shp), combining daily observation CSVs into one cumulative layer, or flattening a folder of tile-by-tile deliveries into a single PostGIS table.
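The per-state merge above relies on the shell expanding states/*.shp. When the job is driven from a script instead, the same command can be assembled explicitly. The following is a minimal sketch: the helper names find_shapefiles and merge_command are hypothetical, the paths are placeholders, and it only builds and prints the command rather than running it.

```python
import glob
import os
import shlex


def find_shapefiles(src_dir):
    """Return the .shp files in src_dir, sorted so the merge order is reproducible."""
    return sorted(glob.glob(os.path.join(src_dir, "*.shp")))


def merge_command(sources, output="usa.gpkg"):
    """Build the ogrmerge.py argument list for a single-layer GeoPackage merge."""
    if not sources:
        raise ValueError("at least one source dataset is required")
    return [
        "ogrmerge.py",
        "-o", output,
        "-f", "GPKG",
        "-single",                          # one output layer for all inputs
        "-src_layer_field_name", "source",  # extra field to trace each feature
        *sources,
    ]


# Placeholder inputs; in practice use find_shapefiles("states").
example = merge_command(["states/alabama.shp", "states/alaska.shp"])
print(shlex.join(example))
```

To execute the result, pass the list to subprocess.run(..., check=True) on a machine where ogrmerge.py is on the PATH.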

With -single, -field_strategy controls how differing input schemas are reconciled. Union, the default, is the right choice when input schemas differ but you want all columns represented (a field missing from an input is NULL on that input's features). Intersection keeps only fields common to every input, which is safer when schemas drift unpredictably. When every input has an identical schema and the same geometry type, -single with the default strategy just works.

FAQs

ogrmerge vs ogr2ogr -append — which should I use?

ogr2ogr -append adds one source dataset to the output per invocation. That is perfectly fine for one append but awkward when you have many files, because you have to script a loop around it. ogrmerge.py is designed for the many-inputs case: it takes all sources in one command, builds the output layer schema from the inputs, and supports source-tracking columns. For 2–3 inputs either works; for 50+ files, ogrmerge.py is much cleaner.
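To make the difference concrete, the sketch below generates both variants for the same inputs: one ogr2ogr call per file versus a single ogrmerge.py call. The tile file names, the output name, and the layer name merged are placeholders chosen for illustration, and the commands are only printed, not executed.

```python
import shlex


def append_commands(sources, output, layer="merged"):
    """One ogr2ogr call per input: the first creates the layer, the rest append to it."""
    commands = []
    for index, src in enumerate(sources):
        cmd = ["ogr2ogr"]
        if index == 0:
            cmd += ["-f", "GPKG"]          # first call creates the output
        else:
            cmd += ["-update", "-append"]  # later calls add features to it
        cmd += ["-nln", layer, output, src]
        commands.append(cmd)
    return commands


def merge_command(sources, output, layer="merged"):
    """The equivalent single ogrmerge.py call."""
    return ["ogrmerge.py", "-o", output, "-f", "GPKG",
            "-single", "-nln", layer, *sources]


# Hypothetical inputs.
tiles = [f"tiles/tile_{n:02d}.shp" for n in range(1, 4)]

for cmd in append_commands(tiles, "all.gpkg"):
    print(shlex.join(cmd))
print(shlex.join(merge_command(tiles, "all.gpkg")))
```

The loop grows by one command per input, while the ogrmerge.py call only grows by one argument.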

How do I retain which file each feature came from?

With -single, add -src_layer_field_name source -src_layer_field_content "{DS_BASENAME}" and every feature gets a source attribute recording the base name of its originating file. Both options only take effect in -single mode. This is critical for QA when merging many deliveries.
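As a rough illustration of what ends up in that field, the sketch below mimics the placeholder substitution for one source layer. It is a simplified model, not GDAL's code: expand_template is a hypothetical helper, the paths are made up, and it assumes {DS_BASENAME} is the file name with its directory and extension removed.

```python
import os


def expand_template(template, ds_name, layer_name):
    """Approximate the placeholder substitution for one source layer."""
    # Assumption: {DS_BASENAME} drops both the directory and the extension.
    basename = os.path.splitext(os.path.basename(ds_name))[0]
    return (template
            .replace("{DS_NAME}", ds_name)
            .replace("{DS_BASENAME}", basename)
            .replace("{LAYER_NAME}", layer_name))


# Hypothetical source files.
print(expand_template("{DS_BASENAME}", "deliveries/2024-03/parcels_north.shp", "parcels_north"))
# prints: parcels_north
print(expand_template("{DS_BASENAME}:{LAYER_NAME}", "data/roads.gpkg", "primary"))
# prints: roads:primary
```

For multi-layer sources such as GeoPackages, combining {DS_BASENAME} with {LAYER_NAME} keeps layers from the same file distinguishable.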

How does -field_strategy handle schema mismatches?

-field_strategy applies only with -single. Union (the default) creates a superset schema with all fields from all inputs (NULL where a field is absent). Intersection keeps only fields present in every input. FirstLayer uses the first input's schema and drops any field not present there. Pick Union when completeness matters more, and Intersection or FirstLayer when a consistent schema matters more.
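The three strategies can be modelled on plain lists of field names. This is a simplified sketch of the behaviour described above, not ogrmerge's implementation: target_fields and conform are hypothetical helpers, the field order it produces is an assumption, and field type reconciliation is ignored.

```python
def target_fields(layers, strategy="Union"):
    """layers: one list of field names per input, in merge order."""
    if strategy == "FirstLayer":
        return list(layers[0])
    if strategy == "Intersection":
        # Keep fields of the first input that every other input also has.
        return [f for f in layers[0] if all(f in other for other in layers[1:])]
    if strategy == "Union":
        seen = []
        for fields in layers:
            for f in fields:
                if f not in seen:
                    seen.append(f)
        return seen
    raise ValueError(f"unknown strategy: {strategy}")


def conform(feature, fields):
    """Project one feature onto the target schema; absent fields become None (NULL)."""
    return {f: feature.get(f) for f in fields}


# Two hypothetical inputs whose schemas differ in one column each.
layers = [["id", "name", "pop"], ["id", "name", "area"]]

print(target_fields(layers, "Union"))         # ['id', 'name', 'pop', 'area']
print(target_fields(layers, "Intersection"))  # ['id', 'name']
print(target_fields(layers, "FirstLayer"))    # ['id', 'name', 'pop']

feature = {"id": 7, "name": "Lakeside", "area": 1.5}
print(conform(feature, target_fields(layers, "Union")))
# {'id': 7, 'name': 'Lakeside', 'pop': None, 'area': 1.5}
```

Note that under FirstLayer the area value of the second input is dropped, while under Union it is kept and pop is NULL for that feature.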

Can ogrmerge reproject on the fly?

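Yes. -t_srs reprojects all sources to a common CRS during the merge, which is what you want when inputs arrive in mixed CRSes. -s_srs overrides the source CRS and applies to every input, so use it only when all inputs share a CRS that is missing or wrong in their metadata. If a single source lacks CRS metadata, assign its CRS beforehand (for example with ogr2ogr -a_srs) and merge the corrected copy.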
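One way to handle mixed inputs where a single file has no CRS metadata is to declare that file's CRS in a preliminary ogr2ogr -a_srs step and then let -t_srs reproject everything. The sketch below builds that command sequence; the helper name, file names, and EPSG codes are illustrative assumptions, and nothing is executed.

```python
import os
import shlex


def reproject_merge_commands(sources, output, target_srs, missing_crs=None):
    """Build the commands for a reprojecting merge.

    missing_crs maps a CRS-less source path to the CRS it should be declared as.
    """
    missing_crs = missing_crs or {}
    commands, inputs = [], []
    for src in sources:
        if src in missing_crs:
            # -a_srs assigns a CRS without reprojecting the coordinates.
            fixed = os.path.splitext(src)[0] + "_crs.gpkg"
            commands.append(["ogr2ogr", "-f", "GPKG",
                             "-a_srs", missing_crs[src], fixed, src])
            inputs.append(fixed)
        else:
            inputs.append(src)
    commands.append(["ogrmerge.py", "-o", output, "-f", "GPKG",
                     "-single", "-t_srs", target_srs, *inputs])
    return commands


# Hypothetical inputs: two with valid CRS metadata, one without.
plan = reproject_merge_commands(
    ["a_utm.shp", "b_wgs84.shp", "c_nocrs.shp"],
    "merged.gpkg",
    "EPSG:4326",
    missing_crs={"c_nocrs.shp": "EPSG:32633"},
)
for cmd in plan:
    print(shlex.join(cmd))
```

Keeping the CRS fix as a separate step leaves the inputs that already carry correct metadata untouched.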