Rasters in Parquet

Query rasters with SQL. Treat rasters as tables. Bring raster data into the lakehouse.

What is RaQuet?

Specification (0.4.0)

RaQuet defines an open specification for storing raster data in Apache Parquet. Each tile becomes a row, and each band becomes a column. It is a standard format with no proprietary extensions.

Read the specification →

Tools

Convert any GDAL-supported raster (GeoTIFF, NetCDF, COG) to RaQuet. Query with DuckDB, visualize in the browser, or load into your data warehouse.

View CLI reference →

Ecosystem

RaQuet is designed to work with the modern analytics stack. Full support for BigQuery, Snowflake, Databricks, DuckDB, and PostgreSQL through CARTO's Analytics Toolbox.

See supported engines →

Why Parquet for Raster Data?

We believe people want to access their raster data like any other type of data: in SQL.

You shouldn’t have to export data and perform vector-raster intersections outside your analytics platform. But today, you can’t just query a raster. Raster data remains locked in GIS- and HPC-oriented formats like GeoTIFF/COG and Zarr — powerful, but largely invisible to SQL engines.

RaQuet builds on the pioneering work of PostGIS Raster, which first demonstrated SQL-based raster analytics. But instead of being tied to PostgreSQL, RaQuet uses Apache Parquet — an open columnar format supported by virtually every modern analytics engine.

Key insight: With RaQuet, raster files become tables in your data warehouse. Instead of treating rasters as opaque files, you can query them, join them with vector data, and govern them — all in the same system.


RaQuet Principles

RaQuet’s goal is to align GIS with the rest of the analytics industry — particularly Open Table Formats like Apache Iceberg and the separation of storage and compute.


RaQuet vs COG vs Zarr

RaQuet isn’t competing with traditional raster formats — it targets a different problem entirely: interoperability in the analytics world.

| | COG (GeoTIFF) | Zarr | RaQuet |
|---|---|---|---|
| Best for | GIS pipelines, visualization | Scientific computing (HPC) | Analytics / lakehouse / SQL |
| Ecosystem | GDAL, QGIS, rasterio | Xarray, Dask, Pangeo | DuckDB, BigQuery, Snowflake, Spark |
| Strength | Window reads, tiling, overviews | Chunked arrays, parallel compute | SQL queries, joins with vector data |
| Limitation | Not queryable in SQL | Requires specialized runtimes | Designed for tiles, not window reads |

RaQuet works out of the box in most analytics systems — and often provides comparable or better performance than pipelines involving export/import steps.


Sample Data

Try these example RaQuet files — query them directly from cloud storage:

Dataset Source Source Size RaQuet Size URL
World Elevation AAIGrid 3.2 GB 805 MB world_elevation.parquet
World Solar PVOUT AAIGrid 2.8 GB 255 MB world_solar_pvout.parquet
CFSR SST NetCDF 854 MB 75 MB cfsr_sst.parquet
TCI (Sentinel-2) GeoTIFF 224 MB 256 MB TCI.parquet
Spain Solar GHI GeoTIFF 15 MB spain_solar_ghi.parquet

Data sources: Global Solar Atlas, Copernicus Sentinel-2, CFSR Reanalysis
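
To make the size comparison concrete, the short Python sketch below computes the source-to-RaQuet size ratio for each dataset that lists both sizes. The numbers are copied from the table above; treating 1 GB as 1000 MB is an assumption, and Spain Solar GHI is left out because only one size is listed for it.

```python
# Sizes in MB, copied from the sample data table.
# Assumption: 1 GB is treated as 1000 MB.
SAMPLES = {
    "World Elevation": (3200.0, 805.0),
    "World Solar PVOUT": (2800.0, 255.0),
    "CFSR SST": (854.0, 75.0),
    "TCI (Sentinel-2)": (224.0, 256.0),
}


def size_ratio(source_mb, raquet_mb):
    """Return how many times larger the source file is than the RaQuet file."""
    return source_mb / raquet_mb


ratios = {name: size_ratio(src, dst) for name, (src, dst) in SAMPLES.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

A ratio above 1 means the RaQuet file is smaller than its source. The Sentinel-2 TCI sample comes out slightly below 1, so in that case the RaQuet file is a little larger than the source GeoTIFF.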


Example Queries

Get Elevation at Madrid

-- Load the RaQuet extension
LOAD raquet;

-- Point of interest: Madrid (longitude, latitude)
WITH point AS (
    SELECT 'POINT(-3.7038 40.4168)'::GEOMETRY AS geom
)
SELECT
    ST_RasterValue(block, band_1, point.geom, metadata) AS elevation_meters
FROM read_raquet('https://storage.googleapis.com/raquet_demo_data/world_elevation.parquet')
CROSS JOIN point
-- Keep only the tiles that intersect the point
WHERE ST_RasterIntersects(block, point.geom);

Sum Solar Potential in a Region

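-- Load the RaQuet extension
LOAD raquet;

-- Region of interest: a 1 x 1 degree box (longitude -4 to -3, latitude 40 to 41)
WITH area AS (
    SELECT ST_GeomFromText('POLYGON((-4 40, -3 40, -3 41, -4 41, -4 40))') AS geom
)
SELECT
    -- Add up the per-tile sums of band_1
    SUM(ST_RasterSummaryStat(block, band_1, 'sum', metadata)) AS total_pvout
FROM read_raquet('https://storage.googleapis.com/raquet_demo_data/world_solar_pvout.parquet')
CROSS JOIN area
-- Keep only the tiles that intersect the region
WHERE ST_RasterIntersects(block, area.geom);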
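
The query above follows a filter-then-aggregate pattern: keep the tiles that intersect the region, then add up a per-tile statistic. The pure-Python sketch below illustrates that pattern only. The `Tile` record, its fields, and the values are hypothetical, and a bounding-box overlap test stands in for `ST_RasterIntersects`; it is not how the extension is implemented, and it does not clip tiles to the polygon.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    # Hypothetical tile record: bounding box in degrees plus a precomputed band sum.
    west: float
    south: float
    east: float
    north: float
    band_sum: float


def bbox_intersects(tile, west, south, east, north):
    """Return True when the tile's bounding box overlaps the query box."""
    return (
        tile.west < east
        and tile.east > west
        and tile.south < north
        and tile.north > south
    )


def sum_over_region(tiles, west, south, east, north):
    """Sum the per-tile band sums of every tile that overlaps the query box."""
    return sum(
        t.band_sum for t in tiles if bbox_intersects(t, west, south, east, north)
    )


# Made-up tiles: two overlap the query box, one is far away.
tiles = [
    Tile(-4.5, 39.5, -3.5, 40.5, 10.0),
    Tile(-3.5, 39.5, -2.5, 40.5, 20.0),
    Tile(10.0, 50.0, 11.0, 51.0, 99.0),
]

# Same box as the SQL polygon: longitude -4 to -3, latitude 40 to 41.
total = sum_over_region(tiles, -4.0, 40.0, -3.0, 41.0)
print(total)
```

Because whole tiles are summed, pixels that fall outside the region but inside an intersecting tile still contribute in this sketch.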

Time-Series Analysis

-- Load the RaQuet extension
LOAD raquet;

-- Average the per-tile mean of band_1 for each year
SELECT
    YEAR(time_ts) AS year,
    AVG(ST_RasterSummaryStat(block, band_1, 'mean', metadata)) AS avg_sst
FROM read_raquet('https://storage.googleapis.com/raquet_demo_data/cfsr_sst.parquet')
GROUP BY YEAR(time_ts)
ORDER BY year;
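
The aggregation in this query can be mirrored in a few lines of plain Python: group the per-tile means by year, then average each group. The rows below are made-up values for illustration, not data from the CFSR file.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical (timestamp, per-tile mean) pairs standing in for
# (time_ts, ST_RasterSummaryStat(..., 'mean', ...)) in the SQL query.
rows = [
    (datetime(1980, 1, 1), 14.0),
    (datetime(1980, 7, 1), 16.0),
    (datetime(1981, 1, 1), 15.0),
    (datetime(1981, 7, 1), 18.0),
]


def yearly_average(rows):
    """Group per-tile means by calendar year and average each group."""
    by_year = defaultdict(list)
    for timestamp, tile_mean in rows:
        by_year[timestamp.year].append(tile_mean)
    # Sort by year to match ORDER BY year.
    return {year: mean(values) for year, values in sorted(by_year.items())}


result = yearly_average(rows)
print(result)
```

Averaging per-tile means gives every tile the same weight. If tiles hold different numbers of valid pixels, a pixel-weighted mean would give a different answer.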

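Key functions (as used in the queries above):

read_raquet(url): reads a RaQuet file as a table
ST_RasterIntersects(block, geom): filters to the tiles that intersect a geometry
ST_RasterValue(block, band, geom, metadata): returns the band value at a point
ST_RasterSummaryStat(block, band, stat, metadata): returns a per-tile statistic such as 'sum' or 'mean'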


How It Works

Think of RaQuet as storing a raster where each tile is a row and each band is a column.

RaQuet converts raster tiles to Parquet table rows
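
The tile-per-row, band-per-column idea can be sketched in plain Python. This is an illustration of the layout only: the `tile_x` and `tile_y` keys and the nested-list pixel blocks are hypothetical, and the actual column names and pixel encoding are defined by the specification.

```python
def raster_to_rows(bands, tile_size):
    """Split equally sized 2-D bands into square tiles, one dict per tile.

    `bands` maps a band name to a list of pixel rows. Each output dict holds
    the tile position plus one entry per band with that tile's pixels.
    """
    first_band = next(iter(bands.values()))
    height = len(first_band)
    width = len(first_band[0])

    rows = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            row = {"tile_x": left // tile_size, "tile_y": top // tile_size}
            for name, grid in bands.items():
                # Cut the same window out of every band.
                row[name] = [
                    line[left:left + tile_size]
                    for line in grid[top:top + tile_size]
                ]
            rows.append(row)
    return rows


# A tiny 4 x 4 raster with two bands, cut into 2 x 2 tiles.
band_1 = [[r * 4 + c for c in range(4)] for r in range(4)]
band_2 = [[-value for value in line] for line in band_1]

rows = raster_to_rows({"band_1": band_1, "band_2": band_2}, tile_size=2)
for row in rows:
    print(row)
```

The 4 x 4 input yields four rows, each carrying a `band_1` and a `band_2` block for the same tile, which is the shape a columnar engine can then filter and aggregate.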

RaQuet requires Web Mercator (EPSG:3857) projection to leverage QUADBIN spatial indexing for efficient filtering.
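
To show why a fixed Web Mercator tile grid makes spatial filtering cheap, the sketch below maps a longitude/latitude pair to a tile column and row at a given zoom using the standard Web Mercator tiling formula, then interleaves the bits of the two into a single integer. The interleaving is a simplified stand-in for a quadtree cell index and is not the actual QUADBIN encoding; the function names are hypothetical.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) Web Mercator tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the grid edge or beyond the Mercator
    # latitude limit still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def interleave_bits(x, y, zoom):
    """Interleave the bits of x and y (x on even bits, y on odd bits).

    Simplified quadtree-style index for illustration only,
    not the QUADBIN bit layout.
    """
    code = 0
    for i in range(zoom):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# Madrid, the point used in the elevation query.
x, y = lonlat_to_tile(-3.7038, 40.4168, 10)
cell = interleave_bits(x, y, 10)
print(x, y, cell)
```

With an index like this, finding the tile for a point becomes an integer comparison on one column, which is the kind of predicate a Parquet reader can evaluate efficiently.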


Getting Started

# Install
pip install raquet-io

# Convert a raster to RaQuet
raquet-io convert raster input.tif output.parquet

# Validate the output
raquet-io validate output.parquet

# Inspect metadata
raquet-io inspect output.parquet

Try the Viewer →

CLI Reference →
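
For more than one input file, the same commands can be scripted. The Python sketch below builds the `convert raster` and `validate` command lines shown above for a list of rasters and, by default, only returns them without running anything. The helper names, the `out` directory, and the input paths are hypothetical; the sketch uses only the CLI syntax from the commands above.

```python
import subprocess
from pathlib import Path


def build_commands(source, out_dir):
    """Return the convert and validate commands for one input raster."""
    source = Path(source)
    target = Path(out_dir) / (source.stem + ".parquet")
    return [
        ["raquet-io", "convert", "raster", str(source), str(target)],
        ["raquet-io", "validate", str(target)],
    ]


def convert_all(sources, out_dir, dry_run=True):
    """Build the commands for every source; run them unless dry_run is set."""
    commands = [cmd for src in sources for cmd in build_commands(src, out_dir)]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing command.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: list what would be executed for two hypothetical inputs.
planned = convert_all(["data/dem.tif", "data/slope.tif"], "out")
for cmd in planned:
    print(" ".join(cmd))
```

Passing `dry_run=False` runs each command in order, which assumes `raquet-io` is installed and on the PATH.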

Roadmap: Apache Iceberg Integration

Status: Active development — not yet generally available.

We’re working on registering RaQuet datasets as Apache Iceberg tables — enabling rasters to be discovered and queried alongside vector data in your lakehouse.

GeoParquet brought vector data into the lakehouse. RaQuet does the same for raster. Iceberg unifies them under a single governance layer.

Follow progress on GitHub →


Changelog

v0.4.0 (Experimental)

v0.3.0

v0.2.0


Acknowledgments

Special thanks to Even Rouault for his invaluable feedback on the RaQuet specification. His deep expertise in geospatial formats and GDAL has helped shape RaQuet into a more robust and well-documented standard.