Skip to content

Fixed top level HOFS and inlined lambda#513

Merged
maciejpirog merged 2 commits intomainfrom
HOF_fixes
Dec 30, 2025
Merged

Fixed top level HOFS and inlined lambda#513
maciejpirog merged 2 commits intomainfrom
HOF_fixes

Conversation

@corneliuhoffman
Copy link
Contributor

Two issues with HOF taint analysis were not working:

  1. Inline lambda calls: Calling a lambda immediately after definition wasn't tracked
 (lambda x: sink(x))(source())  # Not detected
  1. Top-level HOF callbacks: Lambdas/functions defined at module level and passed to HOFs weren't being analyzed
toplevel_sink = ->(x) { sink(x) }
toplevel_items.each(&toplevel_sink)  # Not detected

Solution

Inline lambda calls (Dataflow_tainting.ml)

  • Detect lambda expressions in callee position and analyze them inline with tainted arguments

Top-level HOF callbacks (Function_call_graph.ml, Visit_function_defs.ml)

  • Add visit_with_parent_path to track parent function context during AST traversal
  • Extract HOF callbacks from top-level code (outside function bodies)
  • Add edges from top-level lambdas to <top_level> node in call graph

Ruby/PHP block parameters (AST_to_IL.ml)

  • Handle &callback syntax (OtherParam("Ref", ...)) as proper parameters for taint tracking

Builtin models (Builtin_models.ml)

  • Add MethodHOF for Ruby to handle .each(&callback) pattern

Code cleanup

  • Simplified constructor implicit edge logic (removed unnecessary exclusion)
  • Removed unsafe fallback matching ANY class method by name
  • Commented out debug dot file generation

Test plan

  • All HOF comprehensive tests pass (Ruby, Scala, Python, JS, etc.)
  • New test: tests/tainting/lambda_inline_call.py
  • New test: tests/tainting/toplevel_hof_callback.py

Corneliu Hoffman added 2 commits December 30, 2025 18:59
A bug was introduced in using AST_toIL.name_of_entity. THat forces some
sid's and naming that messes up the equalities in func_id. this fixes
that
Comment on lines +208 to +209
let class_il = Option.bind !current_class g_name_to_il_name in
let func_il = entity_to_il_name ent in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But AST_to_IL.var_of_name also deals with IdQualified.

Is there no reason to deal with that in g_name_to_il_name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ampretty sure not (we only use it for classes and methods)

(* Visit all Call expressions at top level (outside function bodies).
Callback receives: full call expr, callee expr, and arguments *)
let visit_toplevel_calls (f : G.expr -> G.expr -> G.arguments -> unit) (ast : G.program) : unit =
let v = new toplevel_call_visitor in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to create a new instance on each call?

Why not share the visitor (pull out of function)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ... that answers my question in your PR :)))
My point is that I wanted the function to be kindda pure ... despirte the visitor mess inside. And that way I don't have to worry too much about threards

let fold_toplevel_calls (f : 'acc -> G.expr -> G.expr -> G.arguments -> 'acc)
(init_acc : 'acc) (ast : G.program) : 'acc =
let acc_ref = ref init_acc in
let v = new toplevel_call_visitor in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, visitor instance can be shared.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, can change but did not want to think about threads

Comment on lines +671 to +679
(match e.G.e with
| G.Call (callee, args) ->
let found = extract_hof_callbacks_from_call
~method_hofs ~function_hofs ~user_hofs ~all_funcs ~caller_parent_path
callee args
in
callbacks := found @ !callbacks
| _ -> ());
super#visit_expr env e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving as TODO to extract this into a class, there are some potential concurrency issues with objects + domains.
It's one of the reasons I converted most (all?) such objects into classes and had the instances stateless & shared.

Copy link
Collaborator

@dimitris-m dimitris-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Some small comments you may consider.

We should close #500 once merged.

@maciejpirog maciejpirog merged commit 920e6c2 into main Dec 30, 2025
6 checks passed
@maciejpirog maciejpirog deleted the HOF_fixes branch December 30, 2025 23:05
@maciejpirog maciejpirog mentioned this pull request Dec 31, 2025
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Jan 9, 2026
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [opengrep/opengrep](https://github.com/opengrep/opengrep) | minor | `v1.13.2` → `v1.14.1` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>opengrep/opengrep (opengrep/opengrep)</summary>

### [`v1.14.1`](https://github.com/opengrep/opengrep/releases/tag/v1.14.1): Opengrep 1.14.1

[Compare Source](opengrep/opengrep@v1.14.0...v1.14.1)

#### Improvements

- Clojure translation part II by [@&#8203;dimitris-m](https://github.com/dimitris-m) in [#&#8203;517](opengrep/opengrep#517)
- C#: Allow implicit variables in properties to be taint sources by [@&#8203;maciejpirog](https://github.com/maciejpirog) in [#&#8203;516](opengrep/opengrep#516)
- Add core flags `dump_rule` and `dump_patterns_of_rule` as options in the show command by [@&#8203;maciejpirog](https://github.com/maciejpirog) in [#&#8203;519](opengrep/opengrep#519)

#### Bug fixes

- Fix: pass signature databaseb to lambda analysis, handle method mutation tainting by [@&#8203;corneliuhoffman](https://github.com/corneliuhoffman) in [#&#8203;520](opengrep/opengrep#520)

#### Tech debt

- Fix CHANGELOG.md, OPENGREP.md, remove unused files by [@&#8203;dimitris-m](https://github.com/dimitris-m) in [#&#8203;523](opengrep/opengrep#523)

**Full Changelog**: <opengrep/opengrep@v1.14.0...v1.14.1>

### [`v1.14.0`](https://github.com/opengrep/opengrep/releases/tag/v1.14.0): Opengrep 1.14.0

[Compare Source](opengrep/opengrep@v1.13.2...v1.14.0)

#### Improvements

- Support for higher-order functions in intrafile taint analysis by [@&#8203;corneliuhoffman](https://github.com/corneliuhoffman) in [#&#8203;469](opengrep/opengrep#469) and [#&#8203;513](opengrep/opengrep#513)
- Clojure: Improved support for Clojure (incl. tainting) by [@&#8203;dimitris-m](https://github.com/dimitris-m) in [#&#8203;501](opengrep/opengrep#501)
- Dart: Improved support for Dart by [@&#8203;maciejpirog](https://github.com/maciejpirog) in [#&#8203;508](opengrep/opengrep#508)
- C#: Better handing of extension methods and extension blocks by [@&#8203;maciejpirog](https://github.com/maciejpirog) in [#&#8203;514](opengrep/opengrep#514)

#### Fixes

- Bump cygwin install action by [@&#8203;dimitris-m](https://github.com/dimitris-m) in [#&#8203;503](opengrep/opengrep#503) and [#&#8203;509](opengrep/opengrep#509)

**Full Changelog**: <opengrep/opengrep@v1.13.2...v1.14.0>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0Mi42OS4yIiwidXBkYXRlZEluVmVyIjoiNDIuNjkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiUmVub3ZhdGUgQm90IiwiYXV0b21hdGlvbjpib3QtYXV0aG9yZWQiLCJkZXBlbmRlbmN5LXR5cGU6Om1pbm9yIl19-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants