Julia bindings for tree-sitter — "An incremental parsing system for programming tools."
This package is registered in the Julia General registry and can be installed using:
pkg> add TreeSitter
Additionally, you need to install the language parser(s) you want to use:
pkg> add tree_sitter_julia_jll tree_sitter_c_jll
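In scripts or CI jobs, where the pkg> REPL mode is not available, the same installation can be done through the Pkg API. A minimal sketch, assuming network access to the General registry:

```julia
using Pkg

# Equivalent to `pkg> add TreeSitter tree_sitter_julia_jll tree_sitter_c_jll`.
Pkg.add(["TreeSitter", "tree_sitter_julia_jll", "tree_sitter_c_jll"])
```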
Breaking change in v0.2: Language parsers are no longer bundled with TreeSitter.jl. You must now:
- Install the specific language JLL packages you need
- Import them explicitly in your code
- Pass the JLL module to the parser constructor
using TreeSitter
parser = Parser(:julia) # Deprecated - will show warning

using TreeSitter, tree_sitter_julia_jll
parser = Parser(tree_sitter_julia_jll)

The symbol-based API still works but is deprecated and will be removed in a future version.
julia> using TreeSitter, tree_sitter_c_jll
julia> c = Parser(tree_sitter_c_jll)
Parser(Language(:c))
julia> ast = parse(c, "int x;")
(translation_unit (declaration type: (primitive_type) declarator: (identifier)))
julia> using tree_sitter_json_jll
julia> json = Parser(tree_sitter_json_jll)
Parser(Language(:json))
julia> ast = parse(json, "{\"key\": [1, 2]}")
(document (object (pair key: (string (string_content)) value: (array (number) (number)))))
julia> traverse(ast) do node, enter
           if enter
               @show node
           end
       end
node = (document (object (pair key: (string (string_content)) value: (array (number) (number)))))
node = (object (pair key: (string (string_content)) value: (array (number) (number))))
node = ("{")
node = (pair key: (string (string_content)) value: (array (number) (number)))
node = (string (string_content))
node = ("\"")
node = (string_content)
node = ("\"")
node = (":")
node = (array (number) (number))
node = ("[")
node = (number)
node = (",")
node = (number)
node = ("]")
node = ("}")
julia> using tree_sitter_julia_jll
julia> julia_parser = Parser(tree_sitter_julia_jll)
Parser(Language(:julia))
julia> ast = parse(julia_parser, "f(x)")
(source_file (call_expression (identifier) (argument_list (identifier))))
julia> traverse(ast, named_children) do node, enter
           if !enter
               @show node
           end
       end
node = (identifier)
node = (identifier)
node = (argument_list (identifier))
node = (call_expression (identifier) (argument_list (identifier)))
node = (source_file (call_expression (identifier) (argument_list (identifier))))
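As the two examples suggest, the callback is invoked once with enter == true on the way down and once with enter == false on the way back up, so it can also carry state such as the current nesting depth. A minimal sketch that uses only the calls shown above; the Ref counters are illustrative and not part of the package:

```julia
using TreeSitter, tree_sitter_json_jll

json = Parser(tree_sitter_json_jll)
ast = parse(json, "{\"key\": [1, 2]}")

visited   = Ref(0)  # nodes seen on entry (pre-order)
depth     = Ref(0)  # current nesting level
max_depth = Ref(0)  # deepest level reached so far

traverse(ast) do node, enter
    if enter
        visited[] += 1
        depth[] += 1
        max_depth[] = max(max_depth[], depth[])
    else
        depth[] -= 1  # leaving the node: step back up one level
    end
end

println("visited ", visited[], " nodes, max depth ", max_depth[])
```

For this JSON input the entry count should match the 16 nodes printed by the first traversal example, and depth should be back at zero once the traversal finishes.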
TreeSitter.jl supports any tree-sitter language parser packaged as a JLL. The following are available:
| Language | JLL Package |
|---|---|
| Bash | tree_sitter_bash_jll |
| C | tree_sitter_c_jll |
| C++ | tree_sitter_cpp_jll |
| Go | tree_sitter_go_jll |
| HTML | tree_sitter_html_jll |
| Java | tree_sitter_java_jll |
| JavaScript | tree_sitter_javascript_jll |
| JSON | tree_sitter_json_jll |
| Julia | tree_sitter_julia_jll |
| PHP | tree_sitter_php_jll |
| Python | tree_sitter_python_jll |
| Ruby | tree_sitter_ruby_jll |
| Rust | tree_sitter_rust_jll |
| TypeScript | tree_sitter_typescript_jll |
Install only the languages you need:
pkg> add tree_sitter_julia_jll tree_sitter_python_jll
Additional languages can be added by writing new JLL packages that wrap the
upstream parsers; see Yggdrasil
for details.
Some language packages provide multiple parser variants. For example, tree_sitter_php_jll provides both php (with HTML support) and php_only (pure PHP) parsers.
Discover available parsers:
julia> using TreeSitter, tree_sitter_php_jll
julia> list_parsers(tree_sitter_php_jll)
2-element Vector{Symbol}:
:php
:php_only

Use a specific parser variant:
julia> # Default parser (php with HTML support)
julia> p1 = Parser(tree_sitter_php_jll)
Parser(Language(:php))
julia> # PHP-only variant
julia> p2 = Parser(tree_sitter_php_jll, :php_only)
Parser(Language(:php_only))

The same variant parameter works for Language and Query constructors:
julia> lang = Language(tree_sitter_php_jll, :php_only)
Language(:php_only)
julia> query = Query(tree_sitter_php_jll, "(identifier) @id", :php_only)
Query(Language(:php_only))

For grammars not yet packaged as JLLs, load parsers directly from local tree-sitter grammar repositories:
# Clone and build the grammar
# $ git clone https://github.com/tree-sitter/tree-sitter-python
# $ cd tree-sitter-python && tree-sitter build
using TreeSitter
parser = Parser("/path/to/tree-sitter-python")
tree = parse(parser, "def foo(): pass")

Requirements:
- Repository must contain tree-sitter.json (tree-sitter v0.21+ format)
- Shared library must be built (tree-sitter build or make)
Multi-grammar repositories:
# tree-sitter-php has both :php and :php_only variants
parser = Parser("/path/to/tree-sitter-php", :php_only)

Query files from the repository's queries/ directory are automatically loaded.
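The requirements listed above can be checked before constructing a parser from a local path. The sketch below uses only the Julia standard library. The helper name check_grammar_repo is hypothetical, and it assumes the built shared library sits in the repository root with the platform's usual extension, which may not hold for every grammar or build method:

```julia
using Libdl  # stdlib; Libdl.dlext is "so", "dylib", or "dll" depending on platform

# Hypothetical helper, not part of TreeSitter.jl: return a list of the
# documented requirements that a local grammar checkout appears to be missing.
function check_grammar_repo(repo::AbstractString)
    isdir(repo) || return ["directory does not exist: $repo"]
    problems = String[]
    isfile(joinpath(repo, "tree-sitter.json")) ||
        push!(problems, "missing tree-sitter.json")
    # Assumption: the built library is placed directly in the repository root.
    has_lib = any(f -> endswith(f, "." * Libdl.dlext), readdir(repo))
    has_lib || push!(problems, "no .$(Libdl.dlext) file found; run tree-sitter build or make")
    return problems
end

# The path is a placeholder, as in the examples above.
problems = check_grammar_repo("/path/to/tree-sitter-python")
isempty(problems) || @warn "Grammar repository is not ready" problems
```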
TreeSitter.jl supports tree-sitter query predicates for filtering matches and attaching metadata to patterns.
String Comparison:
- #eq? - String equality: (#eq? @var "foo")
- #not-eq? - String inequality: (#not-eq? @method "constructor")
- #any-of? - Multi-value equality: (#any-of? @type "int" "void" "char")
Pattern Matching:
- #match? - Regex match: (#match? @lowercase "^[a-z]+$")
- #not-match? - Negated regex: (#not-match? @public "^_")
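The string and pattern predicates above compare the source text of a captured node against a literal or a regex. As a rough mental model in plain Julia (the function names are hypothetical, and this is not how TreeSitter.jl implements them):

```julia
# `text` stands in for the source text covered by a single captured node.
pred_eq(text, value)         = text == value            # models #eq?
pred_not_eq(text, value)     = text != value            # models #not-eq?
pred_any_of(text, values...) = text in values           # models #any-of?
pred_match(text, regex)      = occursin(regex, text)    # models #match?
pred_not_match(text, regex)  = !occursin(regex, text)   # models #not-match?

pred_any_of("void", "int", "void", "char")  # true: "void" is one of the values
pred_match("foo", r"^[a-z]+$")              # true: all lowercase letters
pred_not_match("_private", r"^_")           # false: the text starts with "_"
```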
Node Properties:
- #is? - Property assertion: (#is? @node "named")
- #is-not? - Negated property: (#is-not? @node "extra")
Only built-in properties are checked: named, missing, extra
Tree Structure:
- #has-ancestor? - Ancestor check: (#has-ancestor? @indexer index_expression)
Quantified Predicates:
For patterns with quantified captures (e.g., (comment)+ @comments), these predicates check if the condition holds for ANY of the captured nodes:
- #any-eq? - ANY capture equals value: (#any-eq? @comments "// TODO")
- #any-not-eq? - ANY capture not equal: (#any-not-eq? @ids "reserved")
- #any-match? - ANY capture matches regex: (#any-match? @comments "TODO")
- #any-not-match? - ANY capture doesn't match: (#any-not-match? @lines "^\\s*$")
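The ANY semantics can be pictured with Julia's any over the texts of the captured nodes. This is only a plain-Julia model with hypothetical function names, not the package's implementation. Note that the negated forms succeed when at least one capture fails the test, not when all of them do:

```julia
# `texts` stands in for the source text of each node captured by a
# quantified capture such as `(comment)+ @comments`.
any_eq(texts, value)        = any(t -> t == value, texts)           # models #any-eq?
any_not_eq(texts, value)    = any(t -> t != value, texts)           # models #any-not-eq?
any_match(texts, regex)     = any(t -> occursin(regex, t), texts)   # models #any-match?
any_not_match(texts, regex) = any(t -> !occursin(regex, t), texts)  # models #any-not-match?

comments = ["// setup", "// TODO: handle errors", "// done"]

any_match(comments, r"TODO")      # true: the second comment contains "TODO"
any_eq(comments, "// TODO")       # false: no comment is exactly "// TODO"
any_not_match(comments, r"TODO")  # true: "// setup" does not contain "TODO"
```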
Example usage:
# Match comment blocks where at least one comment contains "TODO"
q = query```
((comment)+ @comments
(#any-match? @comments "TODO"))
```julia