Problem
It seems like tree-sitter highlighting crashes when highlighting a large Zig file, both on GitHub and in my editor. Based on the results below I don't think this is specific to Zig, but highlighting does seem a bit slow and I couldn't reproduce the crash with other languages, so the Zig highlighting queries could be the issue.
I reduced the file to about 1200 lines for the reproduction, but removing anything more makes the segfault go away. I'm not very familiar with the code, but debugging in LLDB suggests that the pointer right in ts_query_cursor__compare_captures is NULL, perhaps because of some boundary case with NONE/UINT32_MAX?
leave node. depth:19, type:,
enter node. depth:19, type:}, field:(null), row:1214 state_count:28, finished_state_count:65529
start state. pattern:68, step:404
discard state. pattern:41, step:154
capture node. type:}, pattern:68, capture_id:41, capture_count:2
advance state. pattern:68, step:405
keep state. pattern: 3, start_depth: 1, step_index: 9, capture_count: 4294967295
keep state. pattern: 4, start_depth: 1, step_index: 12, capture_count: 4294967295
keep state. pattern: 7, start_depth: 1, step_index: 23, capture_count: 4294967295
Process 1444832 stopped
* thread #2, name = 'tests::highligh', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x10)
frame #0: 0x0000555555fc79d2 tree_sitter_cli-b38d4d9217d5d765`ts_query_cursor__compare_captures(self=0x00007ffff0028620, left_state=0x00007ffff05a1f50, right_state=0x00007ffff05a1f60, left_contains_right=0x00007ffff7bfc680, right_contains_left=0x00007ffff7bfc688) at query.c:3379:41
3376 if (j < right_captures->size) {
3377 TSQueryCapture *left = array_get(left_captures, i);
3378 TSQueryCapture *right = array_get(right_captures, j);
-> 3379 if (left->node.id == right->node.id && left->index == right->index) {
3380 i++;
3381 j++;
3382 } else {
(lldb) bt
* thread #2, name = 'tests::highligh', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x10)
* frame #0: 0x0000555555fc79d2 tree_sitter_cli-b38d4d9217d5d765`ts_query_cursor__compare_captures(self=0x00007ffff0028620, left_state=0x00007ffff05a1f50, right_state=0x00007ffff05a1f60, left_contains_right=0x00007ffff7bfc680, right_contains_left=0x00007ffff7bfc688) at query.c:3379:41
frame #1: 0x0000555555fca696 tree_sitter_cli-b38d4d9217d5d765`ts_query_cursor__advance(self=0x00007ffff0028620, stop_on_definite_step=true) at query.c:4111:13
frame #2: 0x0000555555fcb42b tree_sitter_cli-b38d4d9217d5d765`ts_query_cursor_next_capture(self=0x00007ffff0028620, match=0x00007ffff7bfc910, capture_index=0x00007ffff7bfc90c) at query.c:4371:8
frame #3: 0x0000555555f3c3cc tree_sitter_cli-b38d4d9217d5d765`_$LT$tree_sitter_highlight.._QueryCaptures$LT$T$C$I$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::next::h904a06a53b1ce6d7(self=0x00007ffff002ae30) at highlight.rs:238:20
frame #4: 0x0000555555f318c7 tree_sitter_cli-b38d4d9217d5d765`core::iter::adapters::peekable::Peekable$LT$I$GT$::peek::_$u7b$$u7b$closure$u7d$$u7d$::ha0f166f2c787d5d2 at peekable.rs:218:48
frame #5: 0x0000555555f3a2f4 tree_sitter_cli-b38d4d9217d5d765`core::option::Option$LT$T$GT$::get_or_insert_with::he592c958601f9fcb(self=0x00007ffff002aec0, f={closure_env#0}<tree_sitter_highlight::_QueryCaptures<&[u8], &[u8]>> @ 0x00007ffff7bfcad0) at option.rs:1753:26
frame #6: 0x0000555555f31831 tree_sitter_cli-b38d4d9217d5d765`core::iter::adapters::peekable::Peekable$LT$I$GT$::peek::he1dcd5e3f3b74d3d(self=0x00007ffff002ae30) at peekable.rs:218:21
frame #7: 0x00005555556d383d tree_sitter_cli-b38d4d9217d5d765`_$LT$tree_sitter_highlight..HighlightIter$LT$F$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::next::hfe2c2ded4953c2ec(self=0x00007ffff7bfdf60) at highlight.rs:1018:80
frame #8: 0x000055555580087e tree_sitter_cli-b38d4d9217d5d765`tree_sitter_cli::tests::highlight_test::to_token_vector::hcdbce8225080df3d(src=(data_ptr = "\nconst Select = struct {\n const TempSpec = struct {\n fn create(spec: TempSpec, s: *const Select) InnerError!struct { Temp, bool } {\n const cg = s.cg;\n const pt = cg.pt;\n return switch (spec.kind) {\n .imm => |imm| .{ try cg.tempInit(spec.type, .{ .immediate = @bitCast(@as(i64, imm)) }), true },\n .cc => |cc| .{ try cg.tempInit(spec.type, .{ .eflags = cc }), true },\n .ptest_cc => |cc_spec| {\n const ccs: [2]Condition = .{ .z, .c };\n const cc = ccs[cc_spec.all ^ @intFromBool(cc_spec.ref.valueOf(s).register_mask.info.inverted)];\n return .{ try cg.tempInit(spec.type, .{ .eflags = switch (cc_spec.not) {\n false => cc,\n true => cc.negate(),\n } }), true };\n },\n .ref => |ref| .{ ref.tempOf(s), false },\n .reg => |reg| .{ try cg.tempInit(spec.type, .{ .register = reg }), t"..., length = 90135), language_config=0x0000555556750328) at highlight_test.rs:1994:18
frame #9: 0x000055555580290e tree_sitter_cli-b38d4d9217d5d765`tree_sitter_cli::tests::highlight_test::test_highlighting_zig::h4fd1dbc397843169 at highlight_test.rs:1340:5
frame #10: 0x0000555555732657 tree_sitter_cli-b38d4d9217d5d765`tree_sitter_cli::tests::highlight_test::test_highlighting_zig::_$u7b$$u7b$closure$u7d$$u7d$::hf4b1d888115b0ec5((null)=0x00007ffff7bfe436) at highlight_test.rs:119:27
frame #11: 0x00005555556a1ec6 tree_sitter_cli-b38d4d9217d5d765`core::ops::function::FnOnce::call_once::hf6cdc9c987015f33((null)={closure_env#0} @ 0x00007ffff7bfe436, (null)=<unavailable>) at function.rs:250:5
frame #12: 0x0000555555967d0b tree_sitter_cli-b38d4d9217d5d765`test::__rust_begin_short_backtrace::h38ffe622e17a164d + 11
frame #13: 0x0000555555984c22 tree_sitter_cli-b38d4d9217d5d765`test::types::RunnableTest::run::h84bd14d098b23d4e + 50
frame #14: 0x000055555596727d tree_sitter_cli-b38d4d9217d5d765`test::run_test::_$u7b$$u7b$closure$u7d$$u7d$::he62792e50b2840a6 + 1917
frame #15: 0x000055555595deb4 tree_sitter_cli-b38d4d9217d5d765`std::sys::backtrace::__rust_begin_short_backtrace::h0badc65700f2a8a4 + 148
frame #16: 0x000055555598ab28 tree_sitter_cli-b38d4d9217d5d765`core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h8302f868ed379776 + 168
frame #17: 0x00005555562f46a7 tree_sitter_cli-b38d4d9217d5d765`std::sys::pal::unix::thread::Thread::new::thread_start::h7afb2fc4e6f2f483 + 55
frame #18: 0x00007ffff7c9a97a libc.so.6`start_thread + 682
frame #19: 0x00007ffff7d22d2c libc.so.6`__clone3 + 44
(lldb) p right_captures->size
(const uint32_t) 4294967295
(lldb) p right_captures->contents
(TSQueryCapture *const) NULL
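The fault address 0x10 in the stop reason is consistent with reading node.id through a NULL TSQueryCapture pointer. Here is a minimal sketch of that arithmetic. The Example* structs are local mirrors of TSNode and TSQueryCapture as I understand them from tree_sitter/api.h, and the helper names are hypothetical, so treat this as an illustration rather than the library's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of TSNode (assumed layout, copied for illustration). */
typedef struct {
  uint32_t context[4]; /* 16 bytes of node context */
  const void *id;      /* the field read at query.c:3379 */
  const void *tree;    /* a const TSTree * in the real header */
} ExampleNode;

/* Local mirror of TSQueryCapture (assumed layout). */
typedef struct {
  ExampleNode node;
  uint32_t index;
} ExampleQueryCapture;

/* Byte offset of node.id inside a capture. With a NULL base pointer,
 * this offset is the address the load would touch. */
static inline size_t example_node_id_offset(void) {
  return offsetof(ExampleQueryCapture, node) + offsetof(ExampleNode, id);
}

/* Hypothetical self-check: the offset matches the reported fault address. */
static inline void example_check_fault_address(void) {
  assert(example_node_id_offset() == 0x10);
}
```

If the mirrored layout is right, a capture pointer of NULL plus this offset gives exactly 0x10, which fits right_captures->contents being NULL with j at 0.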
I noticed that either changing max_capture_list_count back to NONE or changing essentially all of the uint16s to uint32s fixes the crash, but I'm not sure whether either of these is the correct fix.
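To illustrate the kind of boundary case I suspect, here is a small sketch of how a 32-bit sentinel stops comparing equal once it is stored in a 16-bit field. Everything here (EXAMPLE_NONE, ExampleState, the helper functions) is hypothetical and is not taken from query.c; I have not confirmed that this is what actually happens in tree-sitter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "no value" sentinel, assumed to be UINT32_MAX. */
#define EXAMPLE_NONE UINT32_MAX

/* Hypothetical state holding the same id in a narrow and a wide field. */
typedef struct {
  uint16_t narrow_id;
  uint32_t wide_id;
} ExampleState;

/* Store the sentinel in both fields; the 16-bit one truncates to 65535. */
static inline ExampleState example_state_with_none(void) {
  ExampleState state;
  state.narrow_id = (uint16_t)EXAMPLE_NONE;
  state.wide_id = EXAMPLE_NONE;
  return state;
}

/* After truncation this is never true: 65535 != 4294967295. */
static inline bool example_narrow_is_none(const ExampleState *state) {
  return state->narrow_id == EXAMPLE_NONE;
}

/* The 32-bit field still round-trips the sentinel. */
static inline bool example_wide_is_none(const ExampleState *state) {
  return state->wide_id == EXAMPLE_NONE;
}

/* Hypothetical self-check of the behavior described above. */
static inline void example_demo(void) {
  ExampleState state = example_state_with_none();
  assert(state.narrow_id == 65535);
  assert(!example_narrow_is_none(&state));
  assert(example_wide_is_none(&state));
}
```

If something like this is going on, a check against the sentinel would be skipped for the narrow field and stale or empty data could be used, which might explain why widening the fields makes the crash go away.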
Steps to reproduce
git clone --depth 1 https://github.com/tree-sitter-grammars/tree-sitter-zig
cd tree-sitter-zig/
tree-sitter generate
wget https://gist.githubusercontent.com/rmehri01/73e9b6422603e18bbf47743925481088/raw/9d20dd5c39ac2430e461668ed022b240e3d7f6de/foo.zig
tree-sitter highlight foo.zig
Output:
fish: Job 1, 'tree-sitter highlight foo.zig' terminated by signal SIGSEGV (Address boundary error)
Expected behavior
tree-sitter highlight should finish highlighting the file without crashing.
Tree-sitter version (tree-sitter --version)
tree-sitter 0.26.0 (605e580)
Operating system/version
Linux 6.17.4 x86_64