caicancai/pg-parser-rs

pg-parser-rs

PostgreSQL-flavored SQL parser based on tree-sitter, with a Rust AST layer.

Features

  • Exposes the SQL grammar as a tree-sitter Language
  • Provides a simple parse helper and a PgParser wrapper
  • Builds a PG-flavored AST for SELECT/INSERT/UPDATE/DELETE

Install

cargo add pg-parser-rs

Usage (Rust)

let mut parser = tree_sitter::Parser::new();
parser.set_language(&pg_parser_rs::language())?;
let tree = parser.parse("SELECT 1;", None).unwrap();

Or use the wrapper for a simpler API:

let mut parser = pg_parser_rs::PgParser::new()?;
let tree = parser.parse("SELECT 1;").unwrap();

API (AST + Diagnostics)

Parse into typed AST:

let statements = pg_parser_rs::parse_statements("SELECT 1;");

Parse and collect diagnostics (syntax + unsupported):

let result = pg_parser_rs::parse_statements_with_diagnostics("SELECT FROM t");
for err in result.errors {
    println!("{:?} {:?}", err.kind, err.span);
}
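
When reporting these diagnostics to a user, a line and column are often easier to read than a raw byte offset. The following is a minimal sketch, assuming only that a byte offset into the original SQL string is at hand; `line_col` is a hypothetical helper, not part of this crate's API.

```rust
/// Hypothetical helper (not provided by pg-parser-rs): converts a byte offset
/// into a 1-based (line, column) pair. The column counts characters, not bytes.
/// Returns `None` if the offset is out of range or not on a char boundary.
pub fn line_col(sql: &str, byte_offset: usize) -> Option<(usize, usize)> {
    // `get` fails gracefully instead of panicking on a bad offset.
    let prefix = sql.get(..byte_offset)?;
    // Every newline before the offset starts a new line.
    let line = prefix.matches('\n').count() + 1;
    // The current line begins right after the last newline, or at byte 0.
    let line_start = prefix.rfind('\n').map_or(0, |i| i + 1);
    let column = prefix[line_start..].chars().count() + 1;
    Some((line, column))
}
```

A caller could feed the start offset of an error span into `line_col` and print the result next to the error kind.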

Parse a single query with diagnostics:

let result = pg_parser_rs::parse_query_with_diagnostics("SELECT * FROM t");
let query = result.query;
let errors = result.errors;

API (Visitor + Analysis)

This crate includes a minimal AST visitor and a few analysis helpers for lineage, rewrite, and optimizer-style tooling.

Extract column references across a statement (SELECT, WHERE, JOIN, GROUP BY, HAVING, ORDER BY, etc.):

use pg_parser_rs::{extract_column_refs, parse_statements, Statement};

let sql = "SELECT t.a, upper(b) FROM t JOIN u ON u.id = t.id WHERE b > 1";

// Take the first statement that parsed as a query.
let stmt = parse_statements(sql)
    .into_iter()
    .find(|s| matches!(s, Statement::Query(_)))
    .expect("input should contain a query statement");

// Collect every column reference found in the statement.
let cols = extract_column_refs(&stmt);

for c in cols {
    println!("{}", c.to_sql_string());
}
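
To illustrate the idea behind a visitor-based helper such as the one above, here is a self-contained sketch over a toy expression tree. `ToyExpr`, `ToyVisitor`, and `ColumnCollector` are hypothetical names invented for this example; the crate's actual visitor trait and AST types may be shaped differently.

```rust
/// Toy expression tree for illustration only; NOT the crate's `Expr` type.
#[derive(Debug)]
pub enum ToyExpr {
    Column { table: Option<String>, name: String },
    Literal(i64),
    Call { func: String, args: Vec<ToyExpr> },
    Binary { left: Box<ToyExpr>, op: String, right: Box<ToyExpr> },
}

/// Hypothetical visitor: the default `visit_expr` walks the whole tree,
/// so implementors override only the hooks they care about.
pub trait ToyVisitor {
    fn visit_column(&mut self, _table: Option<&str>, _name: &str) {}

    fn visit_expr(&mut self, expr: &ToyExpr) {
        match expr {
            ToyExpr::Column { table, name } => self.visit_column(table.as_deref(), name),
            ToyExpr::Literal(_) => {}
            ToyExpr::Call { args, .. } => {
                for arg in args {
                    self.visit_expr(arg);
                }
            }
            ToyExpr::Binary { left, right, .. } => {
                self.visit_expr(left);
                self.visit_expr(right);
            }
        }
    }
}

/// Collects column references in visit order, rendered as `table.name` or `name`.
#[derive(Default)]
pub struct ColumnCollector {
    pub columns: Vec<String>,
}

impl ToyVisitor for ColumnCollector {
    fn visit_column(&mut self, table: Option<&str>, name: &str) {
        let rendered = match table {
            Some(t) => format!("{t}.{name}"),
            None => name.to_string(),
        };
        self.columns.push(rendered);
    }
}

/// Convenience wrapper: run the collector over one expression.
pub fn collect_columns(expr: &ToyExpr) -> Vec<String> {
    let mut collector = ColumnCollector::default();
    collector.visit_expr(expr);
    collector.columns
}
```

Keeping the traversal in the trait's default method means a new analysis only has to implement the hook it needs, which is the usual motivation for a visitor layer.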

Spans

Most AST nodes carry a Span with byte/line offsets into the original SQL string. For synthetic or default nodes that don’t map to a concrete source range, Span::UNKNOWN is used.
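
As a rough sketch of how a span with an unknown sentinel can be used to recover source text, consider the stand-in type below. The struct layout, the field names `start_byte` and `end_byte`, and the `snippet` helper are assumptions made for this example; consult the crate docs for the real `Span` definition.

```rust
/// Hypothetical stand-in for the crate's `Span`; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start_byte: usize,
    pub end_byte: usize,
}

impl Span {
    /// Sentinel for nodes with no concrete source range
    /// (mirrors the idea of `Span::UNKNOWN`, not its actual value).
    pub const UNKNOWN: Span = Span {
        start_byte: usize::MAX,
        end_byte: usize::MAX,
    };

    pub fn is_unknown(&self) -> bool {
        *self == Span::UNKNOWN
    }
}

/// Returns the SQL text covered by `span`, or `None` when the span is
/// unknown, out of range, or does not fall on character boundaries.
pub fn snippet(sql: &str, span: Span) -> Option<&str> {
    if span.is_unknown() {
        return None;
    }
    sql.get(span.start_byte..span.end_byte)
}
```

Checking for the sentinel before slicing keeps synthetic nodes from producing misleading excerpts in error messages or rewrites.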

Node types JSON (for tooling):

let json = pg_parser_rs::NODE_TYPES;

Architecture

  • Parsing: tree-sitter builds a concrete syntax tree (CST) from SQL input.
  • AST: ast_builder walks the CST and produces a typed AST in src/ast.rs.
  • Scope: SELECT/CTE/query expressions plus DML (INSERT/UPDATE/DELETE); unsupported syntax maps to Statement::Unknown.
  • Extensibility: #[non_exhaustive] on key enums and Unknown variants allow incremental support.
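
The extensibility point can be sketched as follows. `ToyStatement` is a hypothetical mirror written for this example, not the crate's `Statement`; it shows why downstream code that matches on a `#[non_exhaustive]` enum keeps a wildcard arm alongside the `Unknown` case.

```rust
/// Hypothetical mirror of a statement enum; the real `Statement` may differ.
#[non_exhaustive]
#[derive(Debug)]
pub enum ToyStatement {
    Query(String),
    Insert(String),
    /// Catch-all for syntax the AST layer does not model yet.
    Unknown(String),
}

/// Classifies a statement while staying forward compatible.
pub fn describe(stmt: &ToyStatement) -> &'static str {
    match stmt {
        ToyStatement::Query(_) => "query",
        ToyStatement::Unknown(_) => "unsupported",
        // Covers `Insert` here and, in a downstream crate, any variant
        // added later. `#[non_exhaustive]` makes this arm mandatory
        // outside the defining crate.
        _ => "other statement",
    }
}
```

With this shape, adding a new variant to the enum is not a breaking change for callers, since their matches already handle variants they do not recognize.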

Example: traverse and extract AST nodes

use pg_parser_rs::{parse_statements, Expr, SelectItem, Statement};

fn main() {
    let sql = "SELECT a, b + 1 AS c FROM t WHERE b > 10 ORDER BY c DESC";
    let statements = parse_statements(sql);

    for stmt in statements {
        if let Statement::Query(query) = stmt {
            if let pg_parser_rs::SetExpr::Select(select) = query.body {
                for item in select.projection {
                    match item {
                        SelectItem::UnnamedExpr(expr) | SelectItem::ExprWithAlias { expr, .. } => {
                            if let Expr::Identifier(ident) = expr {
                                println!("select column: {}", ident.value);
                            }
                        }
                        _ => {}
                    }
                }
            }
        }
    }
}

Notes

The grammar lives under grammar/ and is compiled via build.rs. Generated sources (grammar/src/parser.c, grammar/src/node-types.json, grammar/src/grammar.json) are not checked into the repo.

Build requirements

This crate runs tree-sitter generate during build. Install the CLI:

cargo install tree-sitter-cli

You can also point TREE_SITTER_CLI to a custom binary.
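
One plausible way a build script could honor that override is sketched below. This is an assumption about the lookup logic, not a copy of this crate's build.rs; `resolve_cli` and `cli_from_env` are hypothetical names, and the fallback to `tree-sitter` on PATH is assumed.

```rust
use std::ffi::OsString;

/// Hypothetical helper: an explicit, non-empty override wins; otherwise
/// fall back to `tree-sitter` and let the OS resolve it via PATH.
pub fn resolve_cli(override_path: Option<OsString>) -> OsString {
    match override_path {
        Some(path) if !path.is_empty() => path,
        _ => OsString::from("tree-sitter"),
    }
}

/// Reads the `TREE_SITTER_CLI` environment variable and resolves the binary.
pub fn cli_from_env() -> OsString {
    resolve_cli(std::env::var_os("TREE_SITTER_CLI"))
}
```

The result could then be passed to `std::process::Command::new` to invoke the generate step.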

Development flow

After changing grammar/grammar.js or grammar/src/scanner.cc, regenerate:

tree-sitter generate

Then sync generated sources into generated/ for builds and packaging:

mkdir -p generated/tree_sitter
cp grammar/src/parser.c generated/
cp grammar/src/scanner.cc generated/
cp grammar/src/node-types.json generated/
cp grammar/src/grammar.json generated/
cp grammar/src/tree_sitter/*.h generated/tree_sitter/

License

Apache-2.0 (see LICENSE).
