PostGIS: spatial SQL fundamentals
PostGIS turns PostgreSQL into a full GIS engine. Spatial data types, ST_ functions, GIST indexes — the toolkit for serious geospatial backends.
If you were a state agency tracking which streams empty into which bays in Hawaiʻi, how would you store that data — a spreadsheet, or a database?
Spreadsheets work until they don't. PostGIS is what real geospatial teams use when 'works on my laptop' isn't good enough. This week, you'll learn the queries that power production GIS — the same kind that monitor stream-flow alerts statewide.
Learning objectives
- Install PostGIS and load a shapefile
- Write SELECT queries using ST_Within, ST_Distance, ST_Intersects
- Create a spatial index (GIST) and explain its impact
- Build a query that finds all launches within 100 km of a coastline
Primer
PostGIS is the spatial extension to PostgreSQL. It turns the world's most-loved relational database into a full-featured GIS engine: spatial data types, hundreds of ST_* functions, GIST indexes for sub-millisecond spatial queries, and the ability to combine all of that with the regular relational and JSON capabilities you already use Postgres for.
For any serious space-GIS backend — including LaunchDetect's production pipeline — PostGIS is the default. This week is your fluency primer.
Installation
The easiest path is Docker:
docker run -d --name pg-spatial \
-e POSTGRES_PASSWORD=academy \
-p 5432:5432 \
postgis/postgis:16-3.4
Then enable PostGIS in your database:
CREATE EXTENSION IF NOT EXISTS postgis;
SELECT PostGIS_Version(); -- confirms installation
Spatial data types
PostGIS adds three core types: GEOMETRY (planar), GEOGRAPHY (geodesic), and the array variants. The decision between geometry and geography is the most important one in your schema:
- GEOMETRY — fast, works in any projected coordinate system. Use for local-scale analysis (within a UTM zone, within a city).
- GEOGRAPHY — slower, but distances and areas are correct on the WGS84 ellipsoid regardless of where on Earth. Use for global-scale analysis.
Define a launch detections table:
CREATE TABLE detections (
id bigserial PRIMARY KEY,
detected_at timestamptz NOT NULL,
vehicle text,
brightness_k real,
-- WGS84 geographic; SRID 4326 = lat/lon on WGS84
position geometry(Point, 4326) NOT NULL
);
CREATE INDEX detections_position_gix ON detections USING GIST (position);
Loading data
Load a GeoJSON FeatureCollection via ogr2ogr (from GDAL):
ogr2ogr -f PostgreSQL "PG:host=localhost user=postgres dbname=academy" \
detections.geojson -nln detections -append
Or via Python with geopandas:
import geopandas as gpd
from sqlalchemy import create_engine
gdf = gpd.read_file('detections.geojson')
engine = create_engine('postgresql://postgres:academy@localhost:5432/academy')
gdf.to_postgis('detections', engine, if_exists='replace', index=False)
The ST_* family
The functions you'll use 90% of the time:
ST_Within(a, b)— is a entirely inside b?ST_Intersects(a, b)— do a and b share any point?ST_Distance(a, b)— distance between a and b (units depend on SRID; geography type returns meters)ST_Buffer(geom, dist)— expand a geometry by dist (planar; for geodesic, cast to geography)ST_DWithin(a, b, dist)— true if a and b are within dist of each other (uses index efficiently)ST_Transform(geom, srid)— reproject between coordinate systems
GIST indexes: the secret to speed
Without a spatial index, every spatial query is a full table scan. With a GIST index, PostgreSQL uses an R-tree to prune to candidate features in O(log n) time. Always create a GIST index on the geometry column. Always.
The lab
You'll load the Natural Earth global coastline polygons table and the LaunchDetect launches table into PostGIS, then write the spatial SQL query that finds every detection within 100 km of any coastline, sorted by distance. This single query is the core of LaunchDetect's coastal-spaceport heuristic for ranking detection confidence (a thermal hotspot 500 km from any coast is more likely a wildfire than a launch).
Connecting to Hawaiʻi: PostGIS and the Commission on Water Resource Management
The State of Hawaiʻi's Commission on Water Resource Management tracks every stream gauge, every aquifer, every well permit. Behind it is a PostGIS-style spatial database that lets staff ask questions like 'which stream gauges show critical low-flow conditions right now AND are upstream of a community water intake?' That is exactly the kind of query you'll write this week. The query pattern is universal — replace 'stream gauge' with 'launch detection' and you have LaunchDetect's production database.
Hands-on lab: Find every launch within 100 km of a coastline
Load a global coastlines table and a launch-detection points table into PostGIS. Write the spatial SQL query that returns all launches within 100 km of any coastline, sorted by distance.
Quiz — click an answer to check it
No grade, no shame. Tap any option; you'll see if it's right plus the answer if not. The point is to notice what you already know and what's still settling.
- JSON support
- Spatial data types and functions
- Time series
- Web hosting
- Always meters
- Always degrees
- The units of the input geometry's SRID
- Kilometers
- Numeric columns
- Spatial columns
- Text search
- Sequence generation
- a is fully inside b
- a touches b
- a equals b
- a is north of b
- WGS84 lat/lon
- Web Mercator
- UTM Zone 26
- OSGB36
Reflection
Take five minutes with this. Write your answer somewhere. Carry it into next week.