
dockerfile: fix CRLF and comment-in-continuation parsing #586

Merged
maciejpirog merged 2 commits into opengrep:main from abezdina:da/dockerfile-fix-crlf-and-comment-parsing on Feb 17, 2026

@abezdina

Summary

Fixes two Dockerfile parsing bugs that caused false positives (e.g., false "last user is root" findings):

  • CRLF line endings: Added normalize_line_endings to strip \r before parsing, since tree-sitter's line_continuation doesn't match \r\n.
  • Comments in LABEL continuations: Changed the comment regex from /#.*/ to /#[^\\\n]*(?:\\.[^\\\n]*)*/ so that trailing backslashes aren't swallowed by comments.
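
To illustrate the first bullet, here is a minimal Python sketch of why CRLF input breaks line continuations. It is not the code from this PR: the `LINE_CONTINUATION` pattern is an assumption about roughly what the grammar's `line_continuation` rule looks like, and this `normalize_line_endings` is only a stand-in that removes every `\r`, as the summary describes.

```python
import re

# Assumed shape of the grammar's line_continuation rule: a backslash,
# optional spaces or tabs, then a bare "\n". Not copied from the grammar.
LINE_CONTINUATION = re.compile(r"\\[ \t]*\n")


def normalize_line_endings(text: str) -> str:
    """Illustrative stand-in: drop every carriage return before parsing."""
    return text.replace("\r", "")


# A two-line RUN instruction saved with Windows (CRLF) line endings.
crlf_source = "RUN echo a \\\r\n    && echo b\r\n"

# With CRLF, the "\r" sits between the backslash and the "\n",
# so the continuation pattern finds nothing.
print(LINE_CONTINUATION.search(crlf_source) is None)  # True

# After normalization, the backslash is directly followed by "\n".
print(LINE_CONTINUATION.search(normalize_line_endings(crlf_source)) is not None)  # True
```

When the continuation is not recognized, the instruction ends early and the following lines are parsed as something else, which is one plausible route to the false positives mentioned above.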

Test plan

  • Added parsing test: tests/parsing/dockerfile/label-continuation-comment.dockerfile
  • Added pattern test: tests/patterns/dockerfile/label-continuation-comment.{dockerfile,sgrep}
  • Updated tests/patterns/dockerfile/multiline_comment.dockerfile
  • All existing tests pass, make core builds

@maciejpirog
Contributor

CI returned this:

  Error: fatal: remote error: upload-pack: not our ref a5babc9ee9653a4bd5236f07c404ced41f1a0860
  Error: fatal: Fetched in submodule path 'languages/dockerfile/tree-sitter/semgrep-dockerfile', but it did not contain a5babc9ee9653a4bd5236f07c404ced41f1a0860. Direct fetching of that commit failed.

Probably this commit was not pushed to the dockerfile submodule.

@abezdina can you first open a relevant PR in https://github.com/opengrep/semgrep-dockerfile (merging to opengrep/main)?

@abezdina
Author

abezdina commented Feb 13, 2026

@maciejpirog I opened a PR: opengrep/semgrep-dockerfile#2
Could you please check, when you have time, whether it was done correctly?

@dimitris-m
Collaborator

opengrep/semgrep-dockerfile#2

you need to update the submodule commit to point to your semgrep-dockerfile branch (for now) and then if the checks turn green here, we can proceed.

the process is this: if this PR is approved, we will merge the semgrep-dockerfile branch into opengrep/main, then update the submodule commit once more here to point to opengrep/main, and merge this one.

once CI is green here we will review your PR, thanks!

@dimitris-m
Collaborator

Also, please rebase your commits to remove this part:

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

No problem to use any AI tool of course, but I prefer not to have it as formal "co-author" in our codebase: the PR submitter takes full credit and bears full responsibility for work submitted.

@abezdina force-pushed the da/dockerfile-fix-crlf-and-comment-parsing branch from 980396b to cba20d8 on February 14, 2026 08:39
@abezdina
Author

@dimitris-m you are totally right, my bad. I have addressed the PR comments.

@maciejpirog
Contributor

LGTM after the changes.

I merged the PR in the dockerfile repo. @abezdina can you now update this PR so that the dockerfile submodule points to the latest opengrep/main in the dockerfile repo (commit b149e3a17fb827ecef64e1b6c2b64ae2cd2429e4), and I will merge this PR?

(I understand that the process of updating parsers is a bit annoying, we have making it smoother on our TODO list)

@maciejpirog
Contributor

@abezdina ...and rebase, because there were some pushes to main in between

@abezdina force-pushed the da/dockerfile-fix-crlf-and-comment-parsing branch from cba20d8 to e6f3b88 on February 17, 2026 12:55
@abezdina
Author

@maciejpirog done!

Strip \r characters before passing Dockerfile content to tree-sitter,
since CRLF line endings break the line_continuation regex pattern.

Change the comment regex from /#.*/ to /#[^\\\n]*(?:\\.[^\\\n]*)*/
so that a trailing backslash on a comment line is NOT consumed as part
of the comment token.
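
The difference between the two comment patterns can be checked with a short Python sketch. This is only an approximation: it uses Python's `re` module rather than tree-sitter's lexer, and the sample lines and the `comment_token` helper are made up for illustration.

```python
import re

# The two patterns from the commit message, written as Python raw strings.
OLD_COMMENT = re.compile(r"#.*")
NEW_COMMENT = re.compile(r"#[^\\\n]*(?:\\.[^\\\n]*)*")


def comment_token(pattern: re.Pattern, line: str) -> str:
    """Return the text the pattern would take as the comment token."""
    return pattern.match(line).group(0)


# A comment line inside a multi-line LABEL, ending in a continuation backslash.
continued = "# note inside a LABEL continuation \\\n"
# A comment with a backslash in the middle of the line (hypothetical sample).
inner = "# keep C:\\tmp as-is\n"

# Old pattern: the trailing backslash becomes part of the comment.
print(comment_token(OLD_COMMENT, continued).endswith("\\"))  # True

# New pattern: a backslash is consumed only when another character follows
# it on the same line, so the trailing one is left for line_continuation.
print(comment_token(NEW_COMMENT, continued).endswith("\\"))  # False

# Backslashes that are not at the end of the line are still part of the comment.
print(comment_token(NEW_COMMENT, inner) == comment_token(OLD_COMMENT, inner))  # True
```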
@maciejpirog force-pushed the da/dockerfile-fix-crlf-and-comment-parsing branch from e6f3b88 to 8a390d5 on February 17, 2026 16:00
@maciejpirog merged commit a28cd53 into opengrep:main on Feb 17, 2026
6 checks passed
@maciejpirog maciejpirog mentioned this pull request Feb 17, 2026
@dimitris-m dimitris-m mentioned this pull request Feb 17, 2026
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Feb 19, 2026
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [opengrep/opengrep](https://github.com/opengrep/opengrep) | patch | `v1.16.0` → `v1.16.1` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>opengrep/opengrep (opengrep/opengrep)</summary>

### [`v1.16.1`](https://github.com/opengrep/opengrep/releases/tag/v1.16.1): Opengrep 1.16.1

[Compare Source](opengrep/opengrep@v1.16.0...v1.16.1)

#### Improvements

- Pin Nuitka to 2.8.9 across all build workflows by [@dimitris-m](https://github.com/dimitris-m) in [#594](opengrep/opengrep#594)
- Remove redundant pip and Nuitka dependencies by [@dimitris-m](https://github.com/dimitris-m) in [#573](opengrep/opengrep#573)
- Support split rule/target directories in test subcommand by [@qkaiser](https://github.com/qkaiser) in [#576](opengrep/opengrep#576)

#### Benchmarking

- New benchmarking using hyperfine by [@dimitris-m](https://github.com/dimitris-m) in [#557](opengrep/opengrep#557) and [#579](opengrep/opengrep#579)

#### Bug fixes

- Allow multiple logical operators in metavariable comparison by [@maciejpirog](https://github.com/maciejpirog) in [#590](opengrep/opengrep#590)
- In `--experimental`, don't report git untracked files as skipped with `--use-git-ignore` by [@maciejpirog](https://github.com/maciejpirog) in [#577](opengrep/opengrep#577)
- C#: Add primary constructor arguments to base class by [@maciejpirog](https://github.com/maciejpirog) in [#589](opengrep/opengrep#589)
- Dockerfile: Add missing buildkit constructs by [@maciejpirog](https://github.com/maciejpirog) in [#581](opengrep/opengrep#581)
- Dockerfile: Fix CRLF and comment-in-continuation parsing by [@abezdina](https://github.com/abezdina) in [#586](opengrep/opengrep#586)
- Rust: Fix taint propagation through variable shadowing by [@dimitris-m](https://github.com/dimitris-m) in [#572](opengrep/opengrep#572)
- TS/TSX: Add support for the `satisfies` construct by [@maciejpirog](https://github.com/maciejpirog) in [#592](opengrep/opengrep#592)

#### Installation

- Add Windows install script (pwsh) by [@dimitris-m](https://github.com/dimitris-m) in [#569](opengrep/opengrep#569)
- Ensure that install.ps1 works on ARM by [@dimitris-m](https://github.com/dimitris-m) in [#571](opengrep/opengrep#571)
- Fix: handle unparseable cosign version in install.sh by [@dimitris-m](https://github.com/dimitris-m) in [#580](opengrep/opengrep#580)

#### Documentation

- Improve the README by [@dimitris-m](https://github.com/dimitris-m) in [#570](opengrep/opengrep#570)

#### New Contributors

- [@qkaiser](https://github.com/qkaiser) made their first contribution in [#576](opengrep/opengrep#576)
- [@abezdina](https://github.com/abezdina) made their first contribution in [#586](opengrep/opengrep#586)

**Full Changelog**: <opengrep/opengrep@v1.16.0...v1.16.1>

</details>
