PostgreSQL-flavored SQL parser based on tree-sitter, with a Rust AST layer.
- Exposes the SQL Language for tree-sitter
- Provides a simple parse helper and a PgParser wrapper
- Builds a PG-flavored AST for SELECT/INSERT/UPDATE/DELETE

cargo add pg-parser-rs

let mut parser = tree_sitter::Parser::new();
parser.set_language(&pg_parser_rs::language())?;
let tree = parser.parse("SELECT 1;", None).unwrap();

Or use the wrapper for a simpler API:

let mut parser = pg_parser_rs::PgParser::new()?;
let tree = parser.parse("SELECT 1;").unwrap();

Parse into typed AST:

let statements = pg_parser_rs::parse_statements("SELECT 1;");

Parse and collect diagnostics (syntax + unsupported):

let result = pg_parser_rs::parse_statements_with_diagnostics("SELECT FROM t");
for err in result.errors {
    println!("{:?} {:?}", err.kind, err.span);
}

Parse a single query with diagnostics:

let result = pg_parser_rs::parse_query_with_diagnostics("SELECT * FROM t");
let query = result.query;
let errors = result.errors;

This crate includes a minimal AST visitor and a few “analysis helpers” that are useful for lineage/rewrite/optimizer-style tooling.

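To illustrate the visitor idea, here is a minimal, std-only sketch of a visitor that collects column references from a toy expression tree. The `Expr` enum, the `Visitor` trait, `ColumnCollector`, and `collect_columns` are all hypothetical stand-ins for illustration; the crate's actual visitor and AST types are likely richer and may be shaped differently.

```rust
/// Toy expression tree (hypothetical; much smaller than a real SQL AST).
#[derive(Debug)]
pub enum Expr {
    Column { table: Option<String>, name: String },
    Literal(i64),
    Binary { left: Box<Expr>, op: String, right: Box<Expr> },
    Function { name: String, args: Vec<Expr> },
}

/// Visitor with a default depth-first traversal.
/// Implementors override only the hooks they care about.
pub trait Visitor {
    /// Hook called for every column reference; does nothing by default.
    fn visit_column(&mut self, _table: Option<&str>, _name: &str) {}

    /// Walks an expression, visiting children left to right.
    fn visit_expr(&mut self, expr: &Expr) {
        match expr {
            Expr::Column { table, name } => self.visit_column(table.as_deref(), name),
            Expr::Literal(_) => {}
            Expr::Binary { left, right, .. } => {
                self.visit_expr(left);
                self.visit_expr(right);
            }
            Expr::Function { args, .. } => {
                for arg in args {
                    self.visit_expr(arg);
                }
            }
        }
    }
}

/// Collects column references in traversal order, rendered as `table.name` or `name`.
#[derive(Default)]
pub struct ColumnCollector {
    pub columns: Vec<String>,
}

impl Visitor for ColumnCollector {
    fn visit_column(&mut self, table: Option<&str>, name: &str) {
        let rendered = match table {
            Some(t) => format!("{t}.{name}"),
            None => name.to_string(),
        };
        self.columns.push(rendered);
    }
}

/// Convenience wrapper: runs a `ColumnCollector` over one expression.
pub fn collect_columns(expr: &Expr) -> Vec<String> {
    let mut collector = ColumnCollector::default();
    collector.visit_expr(expr);
    collector.columns
}
```

Keeping the traversal in the trait's default method means an analysis only has to override the hook it needs, which is one common way to structure lineage-style helpers.
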
Extract column references across a statement (SELECT/WHERE/JOIN/GROUP BY/HAVING/ORDER BY/etc):
use pg_parser_rs::{extract_column_refs, parse_statements, Statement};
let sql = "SELECT t.a, upper(b) FROM t JOIN u ON u.id = t.id WHERE b > 1";
let stmt = parse_statements(sql).into_iter().find(|s| matches!(s, Statement::Query(_))).unwrap();
let cols = extract_column_refs(&stmt);
for c in cols {
    println!("{}", c.to_sql_string());
}

Most AST nodes carry a Span with byte/line offsets into the original SQL string. For synthetic
or default nodes that don’t map to a concrete source range, Span::UNKNOWN is used.
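
As a rough illustration of how a span with an "unknown" sentinel can be used to recover source text, here is a std-only sketch. The field names (`start_byte`, `end_byte`, `start_line`, `end_line`) and the `is_unknown` and `slice` helpers are assumptions made for this example; the crate's real Span may expose different fields and methods.

```rust
/// Hypothetical stand-in for a source span; not the crate's actual definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start_byte: usize,
    pub end_byte: usize,
    /// 1-based line numbers; 0 is reserved for the unknown sentinel.
    pub start_line: usize,
    pub end_line: usize,
}

impl Span {
    /// Sentinel for synthetic nodes that have no concrete source range.
    pub const UNKNOWN: Span = Span {
        start_byte: 0,
        end_byte: 0,
        start_line: 0,
        end_line: 0,
    };

    /// True if this span is the sentinel value.
    pub fn is_unknown(&self) -> bool {
        *self == Span::UNKNOWN
    }

    /// Returns the source text covered by the span, or `None` if the span is
    /// unknown, out of range, or not aligned to UTF-8 character boundaries.
    pub fn slice<'a>(&self, source: &'a str) -> Option<&'a str> {
        if self.is_unknown() {
            return None;
        }
        source.get(self.start_byte..self.end_byte)
    }
}
```

Using `str::get` instead of direct indexing avoids a panic when a span does not match the string it is applied to, for example after the SQL text has been edited.
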
Node types JSON (for tooling):
let json = pg_parser_rs::NODE_TYPES;

- Parsing: tree-sitter builds a concrete syntax tree (CST) from SQL input.
- AST: ast_builder walks the CST and produces a typed AST in src/ast.rs.
- Scope: SELECT/CTE/query expressions plus DML (INSERT/UPDATE/DELETE); unsupported syntax maps to Statement::Unknown.
- Extensibility: #[non_exhaustive] on key enums and Unknown variants allow incremental support.

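
The extensibility point above can be sketched with a small, std-only example. The `Statement` enum here is a simplified, hypothetical version (each variant just carries the raw SQL text), and `describe` is an illustrative helper, not part of the crate.

```rust
/// Hypothetical, simplified statement enum; not the crate's actual definition.
#[non_exhaustive]
#[derive(Debug)]
pub enum Statement {
    Query(String),
    Insert(String),
    /// Catch-all for syntax the AST builder does not model yet.
    Unknown(String),
}

/// Describes a statement. Downstream crates must include a wildcard arm when
/// matching a `#[non_exhaustive]` enum, so adding a variant later is not a
/// breaking change. Inside the defining crate the arm is unreachable, hence
/// the `allow` below.
#[allow(unreachable_patterns)]
pub fn describe(stmt: &Statement) -> String {
    match stmt {
        Statement::Query(sql) => format!("query: {sql}"),
        Statement::Insert(sql) => format!("insert: {sql}"),
        Statement::Unknown(raw) => format!("unsupported: {raw}"),
        _ => "unrecognized variant".to_string(),
    }
}
```

Together, the two mechanisms cover different cases: the Unknown variant handles input the parser cannot model today, while the wildcard arm handles variants that may be added in a later release.
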
use pg_parser_rs::{parse_statements, Expr, SelectItem, Statement};

fn main() {
    let sql = "SELECT a, b + 1 AS c FROM t WHERE b > 10 ORDER BY c DESC";
    let statements = parse_statements(sql);
    for stmt in statements {
        if let Statement::Query(query) = stmt {
            if let pg_parser_rs::SetExpr::Select(select) = query.body {
                for item in select.projection {
                    match item {
                        SelectItem::UnnamedExpr(expr) | SelectItem::ExprWithAlias { expr, .. } => {
                            if let Expr::Identifier(ident) = expr {
                                println!("select column: {}", ident.value);
                            }
                        }
                        _ => {}
                    }
                }
            }
        }
    }
}

The grammar lives under grammar/ and is compiled via build.rs.
Generated sources (grammar/src/parser.c, grammar/src/node-types.json,
grammar/src/grammar.json) are not checked into the repo.
This crate runs tree-sitter generate during build. Install the CLI:
cargo install tree-sitter-cli

You can also point TREE_SITTER_CLI to a custom binary.

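For illustration, a build script could resolve the CLI roughly as follows. This is a sketch under assumptions, not the crate's actual build.rs: the helper names `resolve_cli` and `generate_command` are hypothetical, and falling back to a `tree-sitter` binary on PATH when the variable is unset or blank is an assumed behavior.

```rust
use std::env;
use std::process::Command;

/// Picks the CLI binary: an explicit, non-blank override wins;
/// otherwise fall back to `tree-sitter` on PATH (assumed default).
pub fn resolve_cli(override_path: Option<String>) -> String {
    override_path
        .filter(|p| !p.trim().is_empty())
        .unwrap_or_else(|| "tree-sitter".to_string())
}

/// Builds, but does not run, a `generate` invocation rooted in the grammar directory.
pub fn generate_command(grammar_dir: &str) -> Command {
    let cli = resolve_cli(env::var("TREE_SITTER_CLI").ok());
    let mut cmd = Command::new(cli);
    cmd.arg("generate").current_dir(grammar_dir);
    cmd
}
```

A caller would then run `generate_command("grammar").status()` and fail the build if the exit status is not successful.
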
After changing grammar/grammar.js or grammar/src/scanner.cc, regenerate:
tree-sitter generate

Then sync generated sources into generated/ for builds and packaging:

mkdir -p generated/tree_sitter
cp grammar/src/parser.c generated/
cp grammar/src/scanner.cc generated/
cp grammar/src/node-types.json generated/
cp grammar/src/grammar.json generated/
cp grammar/src/tree_sitter/*.h generated/tree_sitter/

Apache-2.0 (see LICENSE).