Memory Limits for Large Raster Data

Large rasters exhaust DuckDB’s heap only after they are flattened into pixel tables. This page isolates that spill threshold: it is the raster-specific deep dive behind in-memory vs disk storage, and it addresses when a rasterized scan should stay resident and when it should stream to a temporary directory.

Root-Cause Analysis: Heap Exhaustion in Raster Pipelines

DuckDB’s spatial extension has no raster type and no raster reader — ST_Read is a GDAL/OGR vector reader, not a raster one. The memory problem therefore appears one step later: rasters must be converted to a tabular form (pixel points, tiles, or zonal statistics), and that tabular product can be enormous. A single 10,000×10,000 three-band image expands to 300M pixel rows; loaded eagerly into DuckDB without bounds it triggers std::bad_alloc or the OS OOM killer.
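
The row arithmetic above can be checked with a short sketch. The helper names are hypothetical, and the 16 bytes per row is an illustrative assumption (two 8-byte values) rather than a measured DuckDB figure, so treat the size as a rough lower bound.

```python
def estimate_pixel_rows(width: int, height: int, bands: int) -> int:
    """Rows produced when every band value of every pixel becomes one row."""
    return width * height * bands


def estimate_raw_bytes(rows: int, bytes_per_row: int = 16) -> int:
    """Rough uncompressed size; bytes_per_row is an assumed figure."""
    return rows * bytes_per_row


rows = estimate_pixel_rows(10_000, 10_000, 3)
raw_gib = estimate_raw_bytes(rows) / 1024**3
print(f"{rows:,} rows, roughly {raw_gib:.1f} GiB before any geometry overhead")
```

Even under this optimistic per-row assumption the product sits close to the 6GB memory_limit used later on this page, which is why the load has to stream rather than materialize.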

The failure modes for this exact operation cluster into four causes, each with a distinct fix:

Unbounded eager load. A CREATE TABLE ... AS SELECT over the full pixel export materializes every row before any operator can spill, so the engine never gets the chance to route to disk. Fix: stream the conversion through COPY ... TO Parquet under an explicit memory_limit.
Row explosion from band count. Each band multiplies the row count linearly; a careless three-band export triples peak RAM versus reading one band at a time. Fix: convert and load per band, or pivot bands to columns.
Decompression buffer multiplication. High threads plus a large GDAL tile cache holds many in-flight decompression buffers at once during conversion. Fix: cap GDAL_CACHEMAX and lower threads.
CRS-induced coordinate blow-up. A mismatched or missing source CRS produces shifted or out-of-range coordinates that defeat row-group pruning downstream, so every scan reads everything. Fix: validate and normalize the CRS at conversion time, as covered under CRS mapping and transformations.

The cure is the same shape in every case: keep the raster→table conversion outside DuckDB (the GDAL CLI), write compressed Parquet, then read it under explicit memory governance so the engine spills gracefully rather than accumulating the full footprint in the process heap.
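
To make the "pivot bands to columns" fix concrete, here is a toy in-memory illustration of the layout change. It is only a sketch: `pivot_bands` is a hypothetical helper, and at real scale the pivot would happen in the conversion step or in SQL, not in a Python dictionary.

```python
from collections import defaultdict


def pivot_bands(long_rows):
    """Turn (x, y, band, value) rows into one wide row per pixel."""
    wide = defaultdict(dict)
    for x, y, band, value in long_rows:
        wide[(x, y)][f"band{band}"] = value
    return [{"x": x, "y": y, **bands} for (x, y), bands in wide.items()]


# Two pixels, three bands each: six long rows.
long_rows = [
    (0.5, 0.5, 1, 10.0), (0.5, 0.5, 2, 20.0), (0.5, 0.5, 3, 30.0),
    (1.5, 0.5, 1, 11.0), (1.5, 0.5, 2, 21.0), (1.5, 0.5, 3, 31.0),
]
wide_rows = pivot_bands(long_rows)
print(len(long_rows), "long rows ->", len(wide_rows), "wide rows")
```

The wide layout stores each coordinate pair once instead of once per band, so the row count scales with pixels rather than with pixels times bands.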

Deterministic Configuration

Bound DuckDB’s memory and route intermediate results to a high-throughput NVMe-backed temporary directory. This is the minimal session needed for the patterns below; the general guardrail rationale lives in the parent in-memory vs disk storage reference.

-- Minimal session for memory-bounded raster pixel-table ingestion.
SET memory_limit = '6GB';          -- Hard ceiling: the engine switches to external
                                   -- spilling near this bound instead of OOM-killing.
SET temp_directory = '/mnt/duckdb_io/temp'; -- Spills land here; must be NVMe or scratch
                                   -- I/O becomes the latency floor for the whole load.
SET preserve_insertion_order = false; -- Pixel order is irrelevant; dropping it frees the
                                   -- writer to flush row groups without buffering.
SET threads = 4;                   -- Each pixel-scan pipeline holds its own buffers, so
                                   -- fewer threads LOWER peak RAM on a wide pixel table.

Configure GDAL’s cache through environment variables for the preprocessing step — these affect the GDAL CLI, not DuckDB:

export GDAL_CACHEMAX=512                 # Cap tile cache (MB) so conversion streams
                                         # instead of buffering the whole raster.
export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR  # Skip recursive dir scans on object
                                               # storage; avoids thousands of stat calls.
export VSI_CACHE=TRUE                     # Cache remote reads to cut redundant range GETs.
export VSI_CACHE_SIZE=536870912           # 512 MB VSI cache ceiling, bounding remote RAM.

These cap GDAL’s tile cache during conversion, disable recursive directory scans on remote object storage, and keep the translation streaming. DuckDB then reads the resulting Parquet and spills intermediate blocks to temp_directory rather than accumulating them in RAM.
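
When the conversion is driven from Python instead of a shell, the same variables can be passed to the GDAL process explicitly. This is a minimal sketch: `gdal_env` and `run_gdal` are hypothetical helper names, and it assumes the GDAL command-line tools are on the PATH when `run_gdal` is actually called.

```python
import os
import subprocess


def gdal_env(cache_mb: int = 512, vsi_cache_mb: int = 512) -> dict:
    """Copy the current environment and add the GDAL limits from this section."""
    env = dict(os.environ)
    env.update({
        "GDAL_CACHEMAX": str(cache_mb),              # tile cache, in MB
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
        "VSI_CACHE": "TRUE",
        "VSI_CACHE_SIZE": str(vsi_cache_mb * 1024 * 1024),  # bytes
    })
    return env


def run_gdal(args: list, **limits) -> None:
    """Run one GDAL CLI command under the bounded environment."""
    subprocess.run(args, env=gdal_env(**limits), check=True)


env = gdal_env()
print(env["GDAL_CACHEMAX"], env["VSI_CACHE_SIZE"])
```

Passing the environment per call keeps the limits scoped to the conversion subprocess, so they cannot leak into unrelated jobs in the same shell.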

Optimized Execution Pattern

DuckDB cannot open a GeoTIFF directly, so the conversion happens in the GDAL CLI; the difference between an OOM and a stable load is entirely in how the resulting pixels enter the engine. The before/after below isolates that change.

# Conversion step (GDAL CLI, outside DuckDB), shared by both variants.
# 1. Reproject + tile into a Cloud Optimized GeoTIFF (streaming, bounded cache).
gdalwarp -t_srs EPSG:4326 -of COG \
  -co BLOCKSIZE=2048 -co COMPRESS=ZSTD \
  /data/large_raster.tif /data/large_raster_4326.tif

# 2. Emit pixel coordinates + values as comma-separated XYZ (one band).
#    -csv is needed here: gdal2xyz separates fields with spaces by default.
gdal2xyz.py -csv -band 1 /data/large_raster_4326.tif /data/pixels_band1.csv

Before — eager load materializes every pixel in RAM (OOM):

-- Anti-pattern: the whole pixel table is built in memory before anything can spill.
CREATE TABLE raster_pixels AS
SELECT ST_Point(column0, column1) AS geom, column2 AS band_value
FROM read_csv('/data/pixels_band1.csv', header = false,
              columns = {'column0': 'DOUBLE', 'column1': 'DOUBLE', 'column2': 'DOUBLE'});
-- Roughly 100M rows for one band of a 10,000×10,000 image, all GEOMETRY + DOUBLE,
-- resident at once -> std::bad_alloc.

After — stream straight to compressed Parquet under the memory ceiling:

import duckdb

con = duckdb.connect()
con.execute("SET memory_limit = '6GB';")
con.execute("SET temp_directory = '/mnt/duckdb_io/temp';")
con.execute("SET threads = 4;")
con.execute("SET preserve_insertion_order = false;")
con.execute("INSTALL spatial; LOAD spatial;")

# Key change: COPY ... TO streams row groups to disk as they are produced, so the
# resident set stays near a single buffer instead of the full pixel table.
# gdal2xyz output columns: x (longitude), y (latitude), value.
con.execute("""
COPY (
    SELECT
        ST_Point(column0, column1) AS geom,  -- x, y from gdal2xyz
        column2 AS band_value
    FROM read_csv('/data/pixels_band1.csv', header = false,
                  columns = {'column0': 'DOUBLE', 'column1': 'DOUBLE', 'column2': 'DOUBLE'})
) TO '/output/processed_raster.parquet' (FORMAT PARQUET, COMPRESSION ZSTD);
""")
con.close()

The annotated difference: CREATE TABLE AS holds the entire result in the buffer manager and only spills if an operator needs to, whereas COPY ... TO Parquet pushes finished row groups to disk continuously, so peak RAM tracks one row group rather than the whole table. Writing it as a GeoParquet dataset also preserves columnar layout and defers WKB deserialization, so the next read starts memory-friendly. If downstream consumers need vector overlays on the result, load the Parquet output into a DuckDB table and build the R-tree spatial index there after ingestion, never during conversion. An R-tree index attaches to a table, not to a Parquet file.
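
The post-ingestion index step might look like the sketch below. An R-tree index is created on a DuckDB table, so the sketch loads the Parquet output into one first. The table and index names are hypothetical, `con` is assumed to be an open duckdb connection, and it is an assumption that the geom column reads back as GEOMETRY once the spatial extension is loaded.

```python
def rtree_statements(parquet_path: str, table: str = "raster_points") -> list:
    """SQL to load the Parquet output into a table and index its geometry."""
    return [
        "INSTALL spatial;",
        "LOAD spatial;",
        f"CREATE TABLE {table} AS SELECT * FROM read_parquet('{parquet_path}');",
        f"CREATE INDEX {table}_rtree ON {table} USING RTREE (geom);",
    ]


def build_index(con, parquet_path: str, table: str = "raster_points") -> None:
    """Run the statements on an open DuckDB connection (assumed, not created here)."""
    for sql in rtree_statements(parquet_path, table):
        con.execute(sql)


statements = rtree_statements("/output/processed_raster.parquet")
for sql in statements:
    print(sql)
```

Loading a filtered subset (one tile or one band) into the table before indexing keeps the indexed table, and the index build, within the same memory budget.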

Diagnostic Queries & Plan Validation

Validate that the configuration actually applied, then watch memory pressure before running a production pass.

-- Verify active session parameters took effect.
SELECT name, value
FROM duckdb_settings()
WHERE name IN ('memory_limit', 'temp_directory', 'threads');

-- Inspect the pixel-table scan; look for External operators and high-cardinality nodes.
EXPLAIN ANALYZE
SELECT band_value, count(*)
FROM read_parquet('/output/processed_raster.parquet')
GROUP BY band_value;

# Monitor process RSS during execution (Python side).
import psutil, os
print(psutil.Process(os.getpid()).memory_info().rss / 1024**3, "GB")

Read the result against these thresholds:

Scan node consuming >75% of memory_limit: the pixel table is too wide for the budget. Drop threads to 2 and read fewer partitions per query (filter by tile or band).
Spill during the aggregation or sort, visible as temporary files appearing while the query runs: the GROUP BY exceeded memory. Acceptable for a one-off batch pass; a problem if it recurs on every run.
RSS climbing without plateau during a streaming COPY: the writer is buffering. Confirm preserve_insertion_order = false and that the source read is not re-sorting.

A live spill probe complements the plan: SELECT * FROM duckdb_temporary_files(); growing during a read you expected to stream is the signal the working set has breached RAM.
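
The spill probe and the RSS plateau check can be wrapped in two small helpers. This is a sketch under assumptions: `con` is an open duckdb connection, the helper names are hypothetical, and the 5% tolerance and three-sample window are arbitrary illustrative defaults, not thresholds from DuckDB.

```python
def spill_bytes(con) -> int:
    """Total size in bytes of DuckDB's current temporary (spill) files."""
    row = con.execute(
        "SELECT coalesce(sum(size), 0) FROM duckdb_temporary_files()"
    ).fetchone()
    return int(row[0])


def has_plateaued(samples, window: int = 3, tolerance: float = 0.05) -> bool:
    """True when the last `window` samples differ by at most `tolerance` of their max."""
    recent = samples[-window:]
    if len(recent) < window:
        return False
    highest, lowest = max(recent), min(recent)
    return highest == 0 or (highest - lowest) / highest <= tolerance


# RSS samples in GB, as they might be collected with psutil during a COPY.
rss_gb = [1.0, 2.1, 3.0, 3.05, 3.02]
print("plateaued:", has_plateaued(rss_gb))
```

Sampling both values on a timer during a load shows whether memory levels off while spill files grow, or whether RSS keeps climbing toward the limit.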

Geometry Validation & Fallback Routing

Pixel-to-point conversion can emit degenerate geometries — NaN coordinates from nodata cells, or points shifted out of valid range by a CRS error. Guard the geometry before it reaches an index or a join, and route around OOM when a single tile is still too large.

-- Guard: count invalid / out-of-range pixel geometries before downstream use.
SELECT
    count(*) FILTER (WHERE NOT ST_IsValid(geom))                       AS invalid_geom,
    count(*) FILTER (WHERE ST_X(geom) NOT BETWEEN -180 AND 180
                       OR  ST_Y(geom) NOT BETWEEN -90  AND 90)         AS out_of_range
FROM read_parquet('/output/processed_raster.parquet');

-- Drop invalid or out-of-range points on the way through, so only clean geometry is
-- materialized. The WHERE clause already excludes invalid rows, so no repair call is needed.
CREATE OR REPLACE TABLE raster_points AS
SELECT geom, band_value
FROM read_parquet('/output/processed_raster.parquet')
WHERE ST_IsValid(geom)
  AND ST_X(geom) BETWEEN -180 AND 180
  AND ST_Y(geom) BETWEEN -90 AND 90;
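
The same guard can be applied before the pixels ever reach DuckDB, for example while post-processing the XYZ export. A minimal sketch, assuming EPSG:4326 coordinates as in the SQL above; both helper names are hypothetical.

```python
import math


def is_valid_lonlat(x: float, y: float) -> bool:
    """Reject NaN coordinates and anything outside the lon/lat range."""
    if math.isnan(x) or math.isnan(y):
        return False
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0


def split_valid(points):
    """Partition (x, y, value) rows into clean and rejected lists."""
    good, bad = [], []
    for x, y, value in points:
        (good if is_valid_lonlat(x, y) else bad).append((x, y, value))
    return good, bad


sample = [
    (12.5, 41.9, 7.0),            # plausible lon/lat
    (float("nan"), 10.0, 0.0),    # NaN coordinate, e.g. from a nodata cell
    (500000.0, 4649776.0, 3.0),   # projected metres, a sign of a CRS mismatch
]
good, bad = split_valid(sample)
print(len(good), "kept,", len(bad), "rejected")
```

A high rejected count made up of values in the hundreds of thousands usually means the source was never reprojected, which points back to the CRS check rather than to memory tuning.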

When OOM termination persists after the baseline configuration, escalate in order of leverage:

Thread and cache reduction. Lower threads to 2 and set GDAL_CACHEMAX=256. High thread counts multiply decompression buffers linearly during conversion.
Explicit chunked tiling. Convert in tiles so each Parquet partition stays small, then load partition-by-partition rather than all at once:
```
gdal_retile.py -ps 2048 2048 -targetDir /data/tiles /data/large_raster_4326.tif
```
CRS-mismatch diagnosis. Mismatched EPSG codes or missing .aux.xml sidecars produce silently shifted coordinates that also break row-group pruning. Validate the source with gdalinfo -json /data/input.tif before conversion; see CRS mapping and transformations for normalization.
I/O path verification. Confirm temp_directory resides on NVMe with >500 MB/s sequential write throughput. HDD-backed temp paths cause stalls that masquerade as heap exhaustion.
Container isolation. Mount temp_directory as a dedicated volume with noexec,nosuid and chmod 0750 on spill directories — the same isolation discipline used when setting up the DuckDB Spatial CLI for multi-tenant work.
Read COG tiles on demand. For very large mosaics, keep the COG on object storage and convert only the tiles a query needs (gdalwarp -te <bbox> ...), then read_parquet the per-tile output.
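
The chunked-tiling step above comes down to simple window arithmetic, sketched below. `tile_windows` is a hypothetical helper; each tuple it yields has the (xoff, yoff, xsize, ysize) shape that gdal_translate accepts through -srcwin.

```python
def tile_windows(width: int, height: int, tile: int = 2048):
    """Yield (xoff, yoff, xsize, ysize) pixel windows that cover the raster."""
    for yoff in range(0, height, tile):
        for xoff in range(0, width, tile):
            yield xoff, yoff, min(tile, width - xoff), min(tile, height - yoff)


windows = list(tile_windows(10_000, 10_000))
print(len(windows), "tiles; last window:", windows[-1])
```

Edge tiles come out smaller than 2048 pixels, so per-partition row counts vary; sizing the memory budget against a full 2048×2048 tile covers the worst case.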

See also:

Setting up the DuckDB Spatial CLI — the hardened session and storage-isolation baseline these raster pipelines build on.
GeoParquet parsing in DuckDB Spatial — columnar ingestion that keeps the post-conversion reads memory-friendly.
CRS mapping and transformations — fixing the coordinate errors that defeat pruning on pixel tables.


External Reference Standards: GDAL cache and conversion semantics follow the official GDAL configuration options; session-parameter precedence (session SET overrides config-file defaults) follows the DuckDB configuration reference.
