Skip to content

fix(github): add workaround for duplicate comment creation#2457

Merged
chmouel merged 1 commit intotektoncd:mainfrom
theakshaypant:hack-duplicate-comments-creation-e2e
Feb 7, 2026
Merged

fix(github): add workaround for duplicate comment creation#2457
chmouel merged 1 commit intotektoncd:mainfrom
theakshaypant:hack-duplicate-comments-creation-e2e

Conversation

@theakshaypant
Copy link
Copy Markdown
Member

📝 Description of the Change

HACK: Add random sleep and re-check before creating comments. This mitigates duplicate comments seen in E2E tests where two identical comments appear despite only one API call being logged.

👨🏻‍ Linked Jira

N/A

🔗 Linked GitHub Issue

N/A

🚀 Type of Change

  • 🐛 Bug fix (fix:) - HACKY FIX
  • ✨ New feature (feat:)
  • 💥 Breaking change (feat!:, fix!:)
  • 📚 Documentation update (docs:)
  • ⚙️ Chore (chore:)
  • 💅 Refactor (refactor:)
  • 🔧 Enhancement (enhance:)
  • 📦 Dependency update (deps:)

🧪 Testing Strategy

  • Unit tests
  • Integration tests
  • End-to-end tests
  • Manual testing
  • Not Applicable

🤖 AI Assistance

  • I have not used any AI assistance for this PR.
  • I have used AI assistance for this PR.

If you have used AI assistance, please provide the following details:

Which LLM was used?

  • GitHub Copilot
  • ChatGPT (OpenAI)
  • Claude (Anthropic)
  • Cursor
  • Gemini (Google)
  • Other: ____________

Extent of AI Assistance:

  • Documentation and research only
  • Unit tests or E2E tests only
  • Code generation (parts of the code)
  • Full code generation (most of the PR)
  • PR description and comments
  • Commit message(s)

Important

If the majority of the code in this PR was generated by an AI, please add a Co-authored-by trailer to your commit message.
For example:

Co-authored-by: Gemini gemini@google.com
Co-authored-by: ChatGPT noreply@chatgpt.com
Co-authored-by: Claude noreply@anthropic.com
Co-authored-by: Cursor noreply@cursor.com
Co-authored-by: Copilot Copilot@users.noreply.github.com

**💡You can use the script ./hack/add-llm-coauthor.sh to automatically add
these co-author trailers to your commits.

✅ Submitter Checklist

  • 📝 My commit messages are clear, informative, and follow the project's How to write a git commit message guide. The Gitlint linter ensures in CI it's properly validated
  • ✨ I have ensured my commit message prefix (e.g., fix:, feat:) matches the "Type of Change" I selected above.
  • ♽ I have run make test and make lint locally to check for and fix any
    issues. For an efficient workflow, I have considered installing
    pre-commit and running pre-commit install to
    automate these checks.
  • 📖 I have added or updated documentation for any user-facing changes.
  • 🧪 I have added sufficient unit tests for my code changes.
  • 🎁 I have added end-to-end tests where feasible. See README for more details.
  • 🔎 I have addressed any CI test flakiness or provided a clear reason to bypass it.
  • If adding a provider feature, I have filled in the following and updated the provider documentation:
    • GitHub App
    • GitHub Webhook
    • Gitea/Forgejo
    • GitLab
    • Bitbucket Cloud
    • Bitbucket Data Center

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @theakshaypant, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where duplicate comments were occasionally appearing on GitHub Pull Requests, especially during end-to-end tests, despite only a single API call being logged. The solution implements a defensive strategy by introducing a brief, random delay and re-verifying the absence of an existing comment before proceeding with comment creation, thereby reducing the likelihood of race conditions leading to duplicates.

Highlights

  • Duplicate Comment Workaround: Introduced a 'hacky' workaround to prevent duplicate GitHub comments from being created, particularly observed in E2E tests.
  • Comment Re-check Mechanism: Implemented a mechanism that involves a random sleep (0-500ms) and a subsequent re-check for existing comments before creating a new one. This aims to mitigate race conditions where multiple processes might attempt to create the same comment.
  • New Helper Function: Added a new helper function, listAndFindComment, to efficiently list comments on a Pull Request and find one matching a specific marker.
Changelog
  • pkg/provider/github/github.go
    • Imported the math/rand package to support random sleep durations.
    • Added a new private method listAndFindComment to encapsulate the logic for listing PR comments and finding a specific one by a marker.
    • Refactored the CreateComment method to utilize listAndFindComment for initial comment existence checks.
    • Introduced a random sleep of up to 500 milliseconds and a second call to listAndFindComment within CreateComment as a workaround for duplicate comment creation.
Activity
  • The pull request is categorized as a bug fix (fix:).
  • The author utilized AI assistance (Cursor) for generating the PR description and comments.
  • The author confirmed that commit messages are clear and follow project guidelines, and that make test and make lint were run locally.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@pipelines-as-code
Copy link
Copy Markdown

pipelines-as-code bot commented Feb 6, 2026

🔍 PR Lint Feedback

Note: This automated check helps ensure your PR follows our contribution guidelines.

⚠️ Items that need attention:

🎫 Jira reference

Add a Jira reference in the description using one of the following formats:

  • https://issues.redhat.com/browse/SRVKP-<number>

If no SRVKP ticket exists yet, link a GitHub issue instead (e.g., Fixes #123).
Minor housekeeping PRs without Jira coverage can skip this after confirming with reviewers.


ℹ️ Next Steps

  • Review and address the items above
  • Push new commits to update this PR
  • This comment will be automatically updated when issues are resolved
🔧 Admin Tools (click to expand)

Automated Issue/Ticket Creation:

  • /issue-create - Generate a GitHub issue from this PR content using AI
  • /jira-create - Create a SRVKP Jira ticket from this PR content using AI

⚠️ Important: Always review and edit generated content before finalizing tickets/issues.
The AI-generated content should be used as a starting point and may need adjustments.

These commands are available to maintainers and will post the generated content as PR comments for review.

🤖 This feedback was generated automatically by the PR CI system

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a workaround for a race condition causing duplicate comments by adding a random sleep and extracting comment lookup logic into listAndFindComment. While addressing the race condition, the listAndFindComment function has security vulnerabilities due to lack of author verification and functional limitations related to pagination, which can lead to comment hijacking, PR flooding, and API exhaustion. Furthermore, for Go versions older than 1.20, math/rand should be seeded at application startup for effective random sleep.

Comment on lines +768 to +772
for _, comment := range comments {
if re.MatchString(comment.GetBody()) {
return comment, nil
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The listAndFindComment function identifies comments to update solely based on a regex match of a marker string in the comment body. It does not verify that the comment was authored by the Pipelines as Code (PAC) GitHub App itself. An attacker can post a comment containing the marker string, causing PAC to identify that comment as its own and overwrite it, allowing unauthorized modification and potential hijacking of the PAC status display. Furthermore, this function only fetches the first 100 comments. For active PRs, this can lead to failing to find recent comments, causing duplicate comments, PR flooding, and API exhaustion. It is crucial to implement author verification and proper pagination to address these security and functional issues.

func (v *Provider) listAndFindComment(ctx context.Context, event *info.Event, marker string) (*github.IssueComment, error) {
	re := regexp.MustCompile(regexp.QuoteMeta(marker))
	opts := &github.IssueListCommentsOptions{
		ListOptions: github.ListOptions{PerPage: 100},
	}

	for {
		comments, resp, err := wrapAPI(v, "list_comments", func() ([]*github.IssueComment, *github.Response, error) {
			return v.Client().Issues.ListComments(ctx, event.Organization, event.Repository, event.PullRequestNumber, opts)
		})
		if err != nil {
			return nil, err
		}

		for _, comment := range comments {
			if re.MatchString(comment.GetBody()) {
				return comment, nil
			}
		}

		if resp.NextPage == 0 {
			break
		}
		opts.Page = resp.NextPage
	}

	return nil, nil
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point lol, we should probably do this across every place where we do comments...

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Implementation 🛠️

Implementation: Tighten comment selection to only consider comments authored by a bot when running as a GitHub App, and prefer the most recent matching comment among the fetched comments to reduce hijacking risk and avoid updating stale matches.

Suggested change
for _, comment := range comments {
if re.MatchString(comment.GetBody()) {
return comment, nil
}
}
var matched *github.IssueComment
for _, comment := range comments {
user := comment.GetUser()
if user == nil {
continue
}
// If we're running as a GitHub App, only consider bot-authored comments
// to avoid matching/updating attacker-controlled comments containing the marker.
if event.InstallationID != 0 && strings.ToLower(user.GetType()) != "bot" {
continue
}
if re.MatchString(comment.GetBody()) {
// Prefer the most recent match among the fetched comments.
if matched == nil || comment.GetCreatedAt().After(matched.GetCreatedAt()) {
matched = comment
}
}
}
return matched, nil

See review comment here

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's actually a good suggestions qodot

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chmouel qodo suggestion makes sense.
As for the gemini suggestion, shouldn't the edit_comment get a 403 if it tries to edit someone else's comment? Github Docs states that you need write or higher permissions on the repo. That is something we do not give (or recommend) for the app or webhook token, right?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ask

@chmouel qodo suggestion makes sense. As for the gemini suggestion, shouldn't the edit_comment get a 403 if it tries to edit someone else's comment? Github Docs states that you need write or higher permissions on the repo. That is something we do not give (or recommend) for the app or webhook token, right?

Answer:

GitHub won’t return 403 just because the comment was authored by someone else.

The 403 you’re thinking of is about authorization scope, not comment ownership. If the credential you’re using has the ability to write issue comments on that repo, then the PATCH /repos/{owner}/{repo}/issues/comments/{comment_id} (“edit comment”) endpoint generally allows editing any issue/PR comment in that repo, regardless of who created it.

Concretely for PaC:

  • To create PR comments, PaC (GitHub App installation token) typically needs Issues: write (and/or Pull requests: write depending on API usage). With that, it can create and edit issue comments in the repo.
  • Repo “write” is not the only way to get this; a GitHub App permission like Issues: write is sufficient to call these APIs. So even if we “don’t recommend write/admin on the repo”, the app still commonly has the relevant write permission to issue comments.
  • Therefore, if an attacker posts a comment containing the marker, listAndFindComment may select it and PaC may successfully edit it—no 403—leading to the hijacking scenario the bots flagged (overwriting attacker-controlled comment content or causing confusing status display).

If, in some deployments, the token doesn’t have permission to edit comments, then yes, EditComment would fail (likely 403), but that’s not a safe assumption:

  • it would turn the attack into a DoS/flakiness vector (PaC errors trying to edit), and
  • it doesn’t address the main risk for the common configuration where PaC does have Issues: write.

So the qodo suggestion (filter to bot/app-authored comments when running as a GitHub App, and prefer the newest match) is a good mitigation, and gemini’s pagination point is also valid (only checking first 100 comments can miss the intended marker and create duplicates).

@theakshaypant theakshaypant marked this pull request as ready for review February 6, 2026 10:37
// comments. This reduces the window where parallel processes might both
// see no existing comment and both decide to create one.
//nolint:gosec // No need for crypto/rand here, just reducing timing window
time.Sleep(time.Duration(rand.Intn(500)) * time.Millisecond)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should do context aware sleep instead, i.e:

Suggested change
time.Sleep(time.Duration(rand.Intn(500)) * time.Millisecond)
jitter := time.Duration(rand.Intn(500)) * time.Millisecond
timer := time.NewTimer(jitter)
defer timer.Stop()
select {
case <-ctx.Done():
return ctx.Err()
case <-timer.C:
}

HACK: Add random sleep and re-check before creating comments.
This mitigates duplicate comments seen in E2E tests where two
identical comments appear despite only one API call being logged.

Signed-off-by: Akshay Pant <akshay.akshaypant@gmail.com>
Assisted-by: Claude Opus 4.5 (via Cursor)
@theakshaypant theakshaypant force-pushed the hack-duplicate-comments-creation-e2e branch from 8ba6005 to 2743e60 Compare February 7, 2026 03:40
@chmouel chmouel merged commit 017ebeb into tektoncd:main Feb 7, 2026
10 checks passed
@theakshaypant theakshaypant deleted the hack-duplicate-comments-creation-e2e branch February 8, 2026 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants