Improve handling of patterns and AST_to_IL translation#483
Merged
dimitris-m merged 9 commits intomainfrom Dec 9, 2025
Merged
Conversation
These parameters are introduced in encodings (Elixir, OCaml) and they are not expected to be found in the source files. However, we were looking for them as if they were normal identifiers, and the rules containing them were never applicable... hence no findings. From now on, it's important to only introduce such special identifiers that begin with the above prefix, so that they are properly excluded.
- Makes `[... $P ...]` work as expected, previously patterns needed to match exactly in length and ellipsis did not work as one would assume. The same applies to tuples. - A lot of AST_to_IL changes: - Ensures that destructuring values in Switch creates new variables like LetPattern, since a lot of encodings won't work for tainting without this modification (Elixir, OCaml, and more which are upcoming...). - Expands the patterns that can be encoded into IL so that tainting works in more cases (PatAs, PatWhen, and more). - Ensures that the custom pattern used in Elixir is processed in a better way, so that a lot of cases can now work (multiple destructuring branches in `fn` now work in tainting rules). - Ensures that `PatAs(pat, x)` introduces as new variable `x` as expected, intuitively.
This is in semgrep-rules, and we don't touch the submodule as a general principle.
- uses PatWhen; - has better support for focus-metavariable that can capture taints in guard conditions.
This is no longer needed (see previous commit).
maciejpirog
reviewed
Dec 9, 2025
| * ignored, since they are not expected to be found in the source file. *) | ||
| let implicit_param = "!!_implicit_param!" | ||
| let implicit_param_id tk = (implicit_param, tk) | ||
| let is_implicit_param s = String.starts_with ~prefix:"!!_implicit_param!" s |
Contributor
There was a problem hiding this comment.
Why can't we use an ID flag hidden for this? The comment in https://github.com/opengrep/opengrep/blob/main/libs/ast_generic/IdFlags.mli#L10 suggests this is exactly its purpose. The way it is done now, we have two mechanisms for the same thing, or am I confusing the use-cases for the hidden flag?
Collaborator
Author
There was a problem hiding this comment.
I see the TODO in Pattern.ml... maybe we can do this on a next PR?
maciejpirog
reviewed
Dec 9, 2025
src/analyzing/AST_to_IL.ml
Outdated
| * this kind of pattern. But I wonder if it would be pointless | ||
| * to bother with it. *) | ||
| (env, lval, pat_stmts @ guard_stmts) | ||
| (* XXX: This is for Elixir lambdas, and it's incomplete... |
Collaborator
Author
There was a problem hiding this comment.
This was my previous attempt :)
maciejpirog
approved these changes
Dec 9, 2025
Contributor
|
Do we know if this is related to #452 ? |
tmeijn
pushed a commit
to tmeijn/dotfiles
that referenced
this pull request
Dec 18, 2025
This MR contains the following updates: | Package | Update | Change | |---|---|---| | [opengrep/opengrep](https://github.com/opengrep/opengrep) | minor | `v1.12.1` -> `v1.13.2` | MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot). **Proposed changes to behavior should be submitted there as MRs.** --- ### Release Notes <details> <summary>opengrep/opengrep (opengrep/opengrep)</summary> ### [`v1.13.2`](https://github.com/opengrep/opengrep/releases/tag/v1.13.2): Opengrep 1.13.2 [Compare Source](opengrep/opengrep@v1.13.1...v1.13.2) #### Improvements - C#: Add matching on function argument modifiers (ref, in, scoped, etc.) by [@​maciejpirog](https://github.com/maciejpirog) in [#​494](opengrep/opengrep#494) - C#: Support extension blocks by [@​maciejpirog](https://github.com/maciejpirog) in [#​496](opengrep/opengrep#496) #### Release process - Validate tag input on release by [@​lae](https://github.com/lae) in [#​493](opengrep/opengrep#493) **Full Changelog**: <opengrep/opengrep@v1.13.1...v1.13.2> ### [`v1.13.1`](https://github.com/opengrep/opengrep/releases/tag/v1.13.1): Opengrep 1.13.1 [Compare Source](opengrep/opengrep@v1.12.1...v1.13.1) #### Improvements - Improve handling of patterns and `AST_to_IL` translation by [@​dimitris-m](https://github.com/dimitris-m) in [#​483](opengrep/opengrep#483) - Improve rust tainting by [@​dimitris-m](https://github.com/dimitris-m) in [#​485](opengrep/opengrep#485) - Dump generic AST to HTML by [@​maciejpirog](https://github.com/maciejpirog) in [#​484](opengrep/opengrep#484) - Modernise C# by [@​maciejpirog](https://github.com/maciejpirog) in [#​487](opengrep/opengrep#487) #### Bug fixes - Fix for kotlin double-annotation bug by [@​maciejpirog](https://github.com/maciejpirog) in [#​480](opengrep/opengrep#480) - Fix PCRE2 test making OSX build fail by [@​dimitris-m](https://github.com/dimitris-m) in [#​486](opengrep/opengrep#486) - Fix: in `LetPattern(pat, e)`, `e` should be visited first by [@​dimitris-m](https://github.com/dimitris-m) in [#​488](opengrep/opengrep#488) #### CI fixes - Force python 3.13 for osx binary workflow by [@​dimitris-m](https://github.com/dimitris-m) in [#​490](opengrep/opengrep#490) ##### Notes - Version 1.13.0 ([#​489](opengrep/opengrep#489)) intentionally skipped due to CI errors, fixed in [#​490](opengrep/opengrep#490). **Full Changelog**: <opengrep/opengrep@v1.12.1...v1.13.1> </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this MR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box --- This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0Mi41Ny4xIiwidXBkYXRlZEluVmVyIjoiNDIuNTcuMSIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiUmVub3ZhdGUgQm90Il19-->
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes an number of issues, and improves matching and tainting:
In switch / match statements containing a list pattern of the shape
[..., $P, ...]matching did not work as expected, specifically...was ignored and it would only match lists of length 3. The same applies for tuples.Patterns in switches and lambdas with destructuring (example: Elixir) did not introduce proper variables, so e.g. the Elixir lambda
fn x -> sink(x)which is internally translated into a switch with cases did not report taint. This is now fixed in the IL translation. Almost all kinds of patterns are now translated and taint propagates through nested destructuring patterns. See 0850936.Fixed a bug that caused stuff like Elixir lambdas and OCaml
functions to not match at all for some rules. See e520986.Improved matching so that e.g. we can create a search rule that matches the identity function or more generally where e.g. a parameter is returned. See cf44d63.
Changed the translation of Elixir lambdas so it uses better supported features, resulting in more fine-grained control of taint in guard conditions (previous translation was failing to distinguish that the last lambda in our test is actually ok, now this is captured).
See new and adapted files under
tests/