fix(lib): address strict aliasing violations with Array type#5242
fix(lib): address strict aliasing violations with Array type#5242WillLillis merged 2 commits intotree-sitter:masterfrom
Array type#5242Conversation
40d79d3 to
e9b1de7
Compare
|
I believe the CI failures are now due to this line in tree-sitter-php's scanner: https://github.com/tree-sitter/tree-sitter-php/blob/master/common/scanner.h#L468 (Here's the snippet since GH won't render it inline) array_delete(&array_pop(&scanner->heredocs).word);Taking the address of a non-lvalue is undefined behavior, and this seems to be it. I've opened a PR to fix the issue here (tree-sitter/tree-sitter-php#293), so once that's merged and a new release is published, we can update the tag in |
|
Once the php scanner is fixed, I'd like to run a crater test with Neovim built against this PR. (I'm sure that's not the only parser with... questionable behavior out there...) |
e9b1de7 to
f452d38
Compare
Set tree-sitter-php fixture to `upstream_test_fixture` branch. Also remove the `--update` flag, as it does nothing.
Altering the `Array` type itself isn't feasible, as this causes unacceptable breakage with existing parsers that depend on it. Instead, pass in individual `Array` fields for to various `_array__*` functions. Any time the `contents` of an array may be modified (`free`d, `realloc`d, etc), return the potentially new address out by value. This prevents any strict aliasing violations as we're no longer writing to a type-casted pointer. Co-authored-by: Nathaniel Wesley Filardo <nwfilardo@gmail.com>
f452d38 to
68ec705
Compare
|
I've added 29281b1, which allows CC @clason I think this is a better solution than disabling the PHP tests, especially given how hard to predict/reason about some of these crashes are. |
|
I am not entirely sure that this will close #4960, though it is surely a huge improvement, since, as I noted over in #4960 (comment), there is also some strict-aliasing violation going on in |
I was planning on extracting the violations in |
|
Sounds like a plan! Thank you! |
|
I've gone through most of the grammars with external scanners in nvim-treesitter to check for further fallout from this change. Besides tree-sitter-php, The following repos crashed when regenerated and tested: |
|
also
All three PRs are now merged, so this is good to go? |
|
Successfully created backport PR for |
|
Belatedly (sorry!), can say that |
That shouldn't be relevant to these changes, but I'm not sure why you'd get that error. Have you |
|
Following up on this, the fixture corpus tests are now passing for me in a ppcle64 ubuntu docker container. I have ran into the same missing symbol issue raised by @nwf above. The problem seems to be here: tree-sitter/crates/loader/src/loader.rs Lines 1298 to 1302 in 4f8983b For some background, on non-Windows platforms we use the system installed https://lists.gnu.org/archive/html/bug-binutils/2021-11/msg00024.html |
Altering the
Arraytype itself isn't feasible, as this causes unacceptable breakage with existing parsers that depend on it. Instead, pass in individualArrayfields for to various_array__*functions.Any time the
contentsof an array may be modified (freed,reallocd, etc), return the potentially new address out by value. This prevents any strict aliasing violations as we're no longer writing to a type-casted pointer.Note that the
array.hincluded in a grammar'ssrc/tree-sitter/directory is updated upon runningtree-sitter generate.This is a rework of #5194, thanks @nwf for taking an initial pass and discussing solutions!