Bottylicious - automate all the things! #35490

@thaJeztah

@alexellis, @tiborvass and I were recently discussing bots that could help maintain this project. I had some notes lying around (my "bots wish-list"), so I rephrased some of them a bit and dumped them here for inspiration, and possible implementation at some point 😄

I know some of these are provided already by bots "in the wild", so we should investigate if those are options.

Triage helper bots

lock-a-bot

Old pull requests and issues frequently get commented on ("I'm still seeing this" on a 3 year old issue, or "This new feature doesn't work on my 5 year old Docker"). To prevent this, discussions that were closed should be automatically locked after some time.

  • After an issue / PR is closed, leave a comment mentioning that the discussion on the PR will be locked after 14 days(?)
  • The 14 days "grace period" allows for relevant comments to be added (which can be a simple "thank you for implementing this!", or a "This PR broke the build!"). After that, comments are generally just "noise", and not useful.
  • After 14 days, lock the discussion, and leave a comment that if people run into issues with the PR, or have the same issue, they should open a new issue/ticket (because merged/closed PRs/tickets are not actively monitored, and comments get overlooked easily)
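
A rough sketch of the decision the lock-a-bot would make for each issue or PR. The function name, its arguments, and the returned action names are made up for illustration; the 14 days are the "14 days(?)" from the list above. Fetching issues and actually locking them is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(days=14)  # assumption: the "14 days(?)" proposed above


def lock_action(closed_at: Optional[datetime], announced: bool, locked: bool,
                now: datetime) -> str:
    """Return 'none', 'announce', or 'lock' for a single issue or PR."""
    if closed_at is None or locked:
        return "none"  # still open, or already locked
    if now - closed_at >= GRACE_PERIOD:
        return "lock"  # grace period is over: lock and point to "open a new issue"
    if not announced:
        return "announce"  # recently closed: warn that the thread will be locked
    return "none"


now = datetime(2017, 12, 1, tzinfo=timezone.utc)
print(lock_action(now - timedelta(days=1), False, False, now))   # announce
print(lock_action(now - timedelta(days=20), True, False, now))   # lock
```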

re-re-re-rebase-bot

GitHub checks the mergeable state of pull requests, but only does so in "real time" (when viewing the pull request). With a lot of activity in the repository, every merged pull request can lead to other pull requests needing a rebase. Unfortunately, it's not possible to discover that a rebase is needed without manually opening each pull request and watching the "mergeable state".

The re-re-re-rebase-bot automatically checks the mergeable state of each pull request, and:

  • Adds a label "rebase needed" if the pull request is no longer mergeable (this allows filtering pull requests by state)
  • Leaves a comment, asking the contributor to rebase the pull request (possibly limited to pull requests in "status/2-code-review", "status/3-docs-review", and "status/4-merge")
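
The two steps above could look roughly like this. It is only a sketch: the action names are hypothetical, the `mergeable` argument is assumed to be True, False, or None (unknown), and removing the label once the PR is mergeable again is my own addition.

```python
from typing import List, Optional, Set

REBASE_LABEL = "rebase needed"
# Review stages in which the proposal suggests also leaving a comment.
COMMENT_STAGES = {"status/2-code-review", "status/3-docs-review", "status/4-merge"}


def rebase_actions(mergeable: Optional[bool], labels: Set[str]) -> List[str]:
    """Return the (hypothetical) actions to take for one pull request."""
    if mergeable is None:
        return []  # state not known yet: try again on the next run
    if mergeable:
        # Assumption: clean up the label once the contributor has rebased.
        return ["remove-label"] if REBASE_LABEL in labels else []
    if REBASE_LABEL in labels:
        return []  # already flagged: do not comment twice
    actions = ["add-label"]
    if labels & COMMENT_STAGES:
        actions.append("comment")
    return actions


print(rebase_actions(False, {"status/2-code-review"}))  # ['add-label', 'comment']
```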

me-too-me-too-bot

We used to have a bot for this, but I don't think it's enabled anymore. People like to "+1", "+1000", :+1: or "me too" issues. While it's understandable that people may "have the same issue", or want to chime in that they are interested in a proposed feature, those comments:

  • generate a lot of noise (over 3000 subscribers to the repository that receive an e-mail)
  • derail the discussion (long discussions on GitHub are hard to read)
  • don't help resolve issues ("+1" to what? the previous comment? the original issue? do you have more information that would help resolve the issue?)

The me-too-me-too-bot:

  • removes +1 comments
  • collects the GitHub handles of those that left a +1 comment, into a single comment, describing that the people listed left their "+1"
  • in that comment, describes the preferred way to interact on issues (use the "subscribe" button to stay informed, use "emojis" to express your support, and if you have more information that has not been included in the discussion, or have a use-case that explains why the feature is useful: leave a comment with that information)
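
As a sketch of the detection and collection steps: the regular expression below is my assumption of what counts as a "+1" comment (nothing but "+1", "+1000", ":+1:", a thumbs-up, or "me too"), and the comment structure (`user`, `body`) and summary wording are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Assumption: a comment is a "+1" only if it contains nothing else of substance.
PLUS_ONE = re.compile(r"^\s*(\+1\d*|:\+1:|👍|me too)[\s!.]*$", re.IGNORECASE)


def split_plus_ones(comments: List[Dict[str, str]]) -> Tuple[List[Dict[str, str]], List[str]]:
    """Split comments into (comments to keep, handles that left a +1)."""
    keep: List[Dict[str, str]] = []
    handles: List[str] = []
    for comment in comments:
        if PLUS_ONE.match(comment["body"]):
            if comment["user"] not in handles:
                handles.append(comment["user"])
        else:
            keep.append(comment)
    return keep, handles


def summary_comment(handles: List[str]) -> str:
    """Build the single comment that replaces the removed +1 comments."""
    names = ", ".join("@" + handle for handle in handles)
    return (f"{names} left a +1 on this issue. To stay informed, use the subscribe "
            "button; to show support, add a reaction to the description.")
```

A comment such as "+1, also fails on 17.09 with overlay2" is kept, because it carries extra information.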

Review helper bots

ready-to-take-the-next-step-bot

This project uses labels to mark each stage in the review process. Going through those steps requires removing the label for the previous stage, and adding the label for the next. This can be simplified (with the added bonus that it allows setting those labels even from the mobile web-interface 👍)

The ready-to-take-the-next-step-bot allows you to move to the next stage by leaving a comment, e.g.:

Commenting "bot: move this to code review" on a PR that's in "design" review would remove the status/1-design-review label, and add the status/2-code-review label (remove previous stage, add next).

Also see the let-me-merge-that-for-you-bot for moving a pull request to status/4-merge.
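
A minimal sketch of how such a comment could be turned into label changes. The status labels are the ones mentioned in this issue; the accepted phrases ("design review", "code review", ...) and the function name are assumptions.

```python
import re
from typing import Optional, Set

# Stage labels named in this issue; the command phrases are hypothetical.
STAGES = {
    "design review": "status/1-design-review",
    "code review": "status/2-code-review",
    "docs review": "status/3-docs-review",
    "merge": "status/4-merge",
}
COMMAND = re.compile(r"^bot:\s*move this to (.+?)\s*$", re.IGNORECASE)


def next_labels(comment: str, labels: Set[str]) -> Optional[Set[str]]:
    """Return the new label set, or None if the comment is not a valid command."""
    match = COMMAND.match(comment.strip())
    if not match:
        return None
    target = STAGES.get(match.group(1).lower())
    if target is None:
        return None
    # Remove every stage label, keep unrelated labels, then add the target stage.
    return (labels - set(STAGES.values())) | {target}


print(sorted(next_labels("bot: move this to code review",
                         {"status/1-design-review", "area/builder"})))
# ['area/builder', 'status/2-code-review']
```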

tell-us-a-story-bot

Some pull requests have an exemplary description on GitHub, but turn out to have no commit message.

Commit messages are important, because the commit message is what ends up in source control, not what's on GitHub. However, it's easy to overlook that a commit message is missing (or doesn't match the latest code changes, though this may be more difficult to address).

The tell-us-a-story-bot should check the size of the patch, and raise a red flag if the commit message is empty (or "short" compared to the patch size), for example:

  • add a label "no commit message" or "short commit message"
  • print the commit message as a comment (for easier reviewing)
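
One way the "short compared to the patch size" check could work, purely as a sketch. The thresholds (`min_body_chars_per_line`, `small_patch`) are invented defaults, and I'm assuming that "no commit message" means "a subject line but no body", with "Signed-off-by:" trailers not counting as description.

```python
from typing import Optional


def commit_message_label(message: str, changed_lines: int,
                         min_body_chars_per_line: float = 0.5,
                         small_patch: int = 10) -> Optional[str]:
    """Return a label for a commit, or None if the message looks fine."""
    lines = message.strip().splitlines()
    # The first line is the subject; everything after it is the body.
    body = [line for line in lines[1:]
            if line.strip() and not line.startswith("Signed-off-by:")]
    body_chars = sum(len(line.strip()) for line in body)
    if body_chars == 0:
        # Tiny patches (typo fixes) may not need more than a subject line.
        return None if changed_lines <= small_patch else "no commit message"
    if body_chars < changed_lines * min_body_chars_per_line:
        return "short commit message"
    return None
```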

what-the-branch-bot

Pull requests should generally be opened against the default ("master") branch. It's easy to miss that a pull-request was actually opened against a non-standard branch (as reviewer, you're focussed on the diff, not what the diff was opened against).

The what-the-branch-bot helps identify pull requests that were opened against a branch other than the default, by:

  • applying a label ("non-default-branch")
  • (if not present) adding a prefix to the PR's title (e.g. a [vX.Y.Z] / [branch name] prefix for release branches)
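
Sketched as a pure function (the name and signature are hypothetical; the label and the "[branch name]" prefix are the ones suggested above):

```python
from typing import List, Tuple


def branch_actions(base_branch: str, title: str, labels: List[str],
                   default_branch: str = "master") -> Tuple[List[str], str]:
    """Return (labels, title) after flagging a PR against a non-default branch."""
    if base_branch == default_branch:
        return labels, title
    label = "non-default-branch"
    new_labels = labels if label in labels else labels + [label]
    prefix = f"[{base_branch}]"
    # Only add the prefix if the title does not carry it already.
    new_title = title if title.startswith(prefix) else f"{prefix} {title}"
    return new_labels, new_title


print(branch_actions("17.06", "Fix panic on start", []))
# (['non-default-branch'], '[17.06] Fix panic on start')
```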

squishy-squashy-bot

Pull requests are ideally small, and should only address a single change. Overall this means that most pull requests only have a single commit.

While grouping changes into separate commits is encouraged, commits are not intended to preserve the history of the review process ("Fix typo", "Address review comments", "Address more review comments").

Reviewers may be focussed on the changes made in a pull request, and overlook that the PR has multiple commits, or commits are not properly "grouped".

The squishy-squashy-bot helps reviewers by:

  • adding a label if multiple commits were found ("multiple commits"? "check-if-squashing-is-needed"?)
  • making detection configurable (setting a threshold for "number of commits per file", "ratio LOC <--> number of commits"?)

When combining with the let-me-merge-that-for-you-bot, possibly:

  • When giving instructions to merge, ask for confirmation if squashing may be needed ("This pull request consists of 10 commits. Confirm if merging should proceed")
  • Allow reviewers to do so "in one go" ("Bot: merge when green, no squashing needed")
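
A possible detection heuristic, as a sketch only. Matching commit subjects against "review noise" patterns is my own assumption (based on the example subjects above, plus git's "fixup!" / "squash!" prefixes), and so is the `max_commits` default.

```python
import re
from typing import List, Optional

# Assumption: subjects like these usually indicate commits that should be squashed.
REVIEW_NOISE = re.compile(
    r"^(fix typo|address (more )?review comments|fixup! |squash! )", re.IGNORECASE)


def squash_label(subjects: List[str], max_commits: int = 3) -> Optional[str]:
    """Return a label if the commits of a PR look like they may need squashing."""
    noisy = [s for s in subjects if REVIEW_NOISE.match(s.strip())]
    if noisy or len(subjects) > max_commits:
        return "check-if-squashing-is-needed"
    return None


print(squash_label(["Add --foo flag", "Address review comments"]))
# check-if-squashing-is-needed
```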

did-you-miss-me-bot

When reviewing a pull request, reviewers tend to focus on code changes. There are other changes that can be easy to miss (due to how GitHub's UI presents them). For example:

  • check if file-permissions changed, and leave a summary comment that describes which files changed (GitHub only shows a small badge, e.g. 0600 -> 0777).
  • check if possibly unwanted files (.tmp, .bak, Thumbs.db etc) were added. This would be a great feature, as it doesn't require the .gitignore to be cluttered with these
  • check if binary file(s) were added. Binary files are "hidden" in the diff view on GitHub, and easy to miss if there's a large diff
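
All three checks could be driven by plain git output. The sketch below assumes the bot has the output of `git diff --raw` and `git diff --numstat` for the PR as strings; the function name and the returned structure are made up. Note that git itself only records whether a file is executable, so mode changes show up as e.g. 0644 -> 0755.

```python
from typing import Dict, List

UNWANTED_SUFFIXES = (".tmp", ".bak", "Thumbs.db")  # examples from the list above


def review_hints(raw: str, numstat: str) -> Dict[str, List[str]]:
    """Summarize easy-to-miss changes from `git diff --raw` / `--numstat` output."""
    hints: Dict[str, List[str]] = {"mode": [], "unwanted": [], "binary": []}
    for line in raw.splitlines():
        if not line.startswith(":"):
            continue
        # Format: ":<old mode> <new mode> <old sha> <new sha> <status>\t<path>"
        meta, path = line[1:].split("\t", 1)
        old_mode, new_mode, _old_sha, _new_sha, status = meta.split()
        if status == "M" and old_mode != new_mode:
            hints["mode"].append(f"{path}: {old_mode[-4:]} -> {new_mode[-4:]}")
        if status == "A" and path.endswith(UNWANTED_SUFFIXES):
            hints["unwanted"].append(path)
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        if added == "-" and deleted == "-":  # numstat prints "-" for binary files
            hints["binary"].append(path)
    return hints
```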

does-this-make-my-repo-look-big-bot

Images can tell a thousand words. Unfortunately, a single image can easily take up the space of a million words. Once merged, those bytes are in the repository forever, so it's good practice to check for this.

The does-this-make-my-repo-look-big-bot:

  • checks if images that are included in a pull request can be (losslessly) optimized
  • if optimizing would save a lot in size (configurable threshold in "percentage" or "bytes"), leaves a comment with the results
  • ideally, does the actual optimization, and provides links to the optimized versions of the images

Note: I'm wondering whether this is a job for a bot (commenting and providing links to optimized images may be) or for CI. Left this idea here anyway 😄

Vendoring helper bots

let-me-diff-that-for-you-bot

GitHub collapses all external dependencies in the diff viewer. While this is a great thing to allow you to focus on the local code changes separate from those that were pulled in from upstreams, it also makes it impossible (or "very difficult") to verify upstream code changes (which may be relevant).

In addition, even if the diff of external dependencies were visible, without context it's difficult to understand those changes.

The let-me-diff-that-for-you-bot would leave a comment with a diff link for each dependency that was changed.

The diff link enables a reviewer to see all commits that were made in upstream dependencies, which is often more descriptive than just the code changes themselves. It also allows reviewers to make notes about those changes for the changelog, if important issues were fixed, or if the dependency brings changes in behavior.
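
A sketch of how such diff links could be generated. It assumes vendor.conf lines look like "import-path revision [repository]" with "#" comments, and it only handles packages hosted on github.com, using GitHub's "compare/old...new" URLs. The function names are hypothetical.

```python
from typing import Dict, List


def parse_vendor_conf(text: str) -> Dict[str, str]:
    """Map import path -> pinned revision, skipping comments and blank lines."""
    pins: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2:
            pins[fields[0]] = fields[1]
    return pins


def compare_links(old_conf: str, new_conf: str) -> List[str]:
    """Return one GitHub compare link per dependency whose revision changed."""
    old, new = parse_vendor_conf(old_conf), parse_vendor_conf(new_conf)
    links = []
    for pkg in sorted(new):
        if pkg in old and old[pkg] != new[pkg] and pkg.startswith("github.com/"):
            owner_repo = "/".join(pkg.split("/")[1:3])
            links.append(
                f"https://github.com/{owner_repo}/compare/{old[pkg]}...{new[pkg]}")
    return links
```

For example, bumping `github.com/docker/swarmkit` from revision `aaa111` to `ccc333` (made-up revisions) yields `https://github.com/docker/swarmkit/compare/aaa111...ccc333`.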

inception-bot

It's standard practice to "flatten" ("strip") vendored dependencies; this means that only a top-level vendor directory is present in the repository, and every package (including vendored packages) uses the same version of a given dependency.

When "bumping" a dependency, it's important to check if the "upstream" repository also expects other dependencies to be updated (e.g. bumping "moby" in the "docker/cli" repository may require the "swarmkit" dependency to be bumped as well; docker/cli#679). Keeping dependencies "in sync" better guarantees that you're working with the same version of the dependency as the upstream was tested/verified against.

The inception-bot would collect all nested vendor.conf files, compare them with the top-level vendor.conf, and with other nested vendor.conf files to generate a table with "x-commits ahead/behind" for each one:

| package | g/d/swarmkit | g/o/runc | g/c/containerd |
|---|---|---|---|
| golang.org/x/sys | golang/sys@07c1829...95c6576 (50 commits ahead) | golang/sys@07c1829...47bdb83 (9 commits behind) | - |
| github.com/some/package | ... (5 ahead) | - | ... (10 ahead) |
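
The comparison step could start out as simple as the sketch below. It assumes the vendor.conf files have already been parsed into "package -> revision" mappings, and it only finds differing pins. Counting "x commits ahead/behind" needs the dependency's git history (for example via `git rev-list --left-right --count A...B`) and is not shown. The function name is made up.

```python
from typing import Dict, List, Tuple


def pin_mismatches(top_level: Dict[str, str],
                   nested: Dict[str, Dict[str, str]]) -> List[Tuple[str, str, str, str]]:
    """Return (package, upstream, top-level revision, upstream revision) rows
    for every package that an upstream pins to a different revision than we do."""
    rows = []
    for upstream in sorted(nested):
        for pkg, rev in sorted(nested[upstream].items()):
            ours = top_level.get(pkg)
            if ours is not None and ours != rev:
                rows.append((pkg, upstream, ours, rev))
    return rows


# Sample data only; revisions reuse the abbreviated ones from the table.
top = {"golang.org/x/sys": "07c1829"}
nested = {
    "g/d/swarmkit": {"golang.org/x/sys": "95c6576"},
    "g/o/runc": {"golang.org/x/sys": "47bdb83"},
    "g/c/containerd": {"golang.org/x/sys": "07c1829"},
}
for row in pin_mismatches(top, nested):
    print(row)
```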

CI-helper-bots

Bots that help with CI (and merging):

let-me-merge-that-for-you-bot

The let-me-merge-that-for-you-bot merges the pull request after CI goes "green". This bot could also be used to grant reviewers permission to merge pull requests, without having to give them write access to the repository.

Merging can be triggered:

  • By leaving a comment ("bot: merge on green")
  • By applying the "status/4-merge" label
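
Sketched as a single check (hypothetical names throughout). It assumes the bot knows the combined CI state as a string where "success" means green, and keeps its own list of reviewers who may trigger a merge by comment.

```python
from typing import Iterable, Set, Tuple

MERGE_COMMAND = "bot: merge on green"
MERGE_LABEL = "status/4-merge"


def should_merge(comments: Iterable[Tuple[str, str]], labels: Set[str],
                 ci_state: str, reviewers: Set[str]) -> bool:
    """Decide whether to merge; comments are (author, body) pairs."""
    # Only comments from listed reviewers count as a merge request.
    by_comment = any(author in reviewers and body.strip().lower() == MERGE_COMMAND
                     for author, body in comments)
    requested = by_comment or MERGE_LABEL in labels
    return requested and ci_state == "success"


print(should_merge([("alice", "Bot: merge on green")], set(), "pending", {"alice"}))
# False: requested, but CI is not green yet
```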

I-remember-the-days-you-were-still-green-bot

Some pull requests have been open for a while, and although CI status shows a nice and shiny "green", that status may no longer be accurate (given that CI ran days, or even weeks before).

While this problem could be resolved by automatically re-running CI on a schedule, running CI again may not always be necessary (yet), for example if a pull request is still in design review, or is a work in progress.

The I-remember-the-days-you-were-still-green-bot:

  • Checks if CI status is possibly outdated (number of days since, number of commits since?)
  • Leaves a comment that describes this ("I noticed that CI was last run XX days ago. Want me to run CI again?")
  • The bot will automatically update the existing comment as long as CI hasn't run again (perhaps remove, and re-add the comment to make sure it's always the last comment)
  • Reviewers can:
    • Leave a special comment to trigger CI again
    • Leave a special comment to trigger CI and merge (let-me-merge-that-for-you-bot)
    • Click generated links to do this (?)
  • Once CI has run (either through a comment, or other means, like directly restarting CI in Jenkins), the bot will remove the comment to unclutter the discussion.
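
The "possibly outdated" check and the comment text could be sketched like this. The thresholds (7 days, 20 commits) are invented defaults, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_ci_comment(last_run: datetime, now: datetime, commits_since: int,
                     max_age: timedelta = timedelta(days=7),
                     max_commits: int = 20) -> Optional[str]:
    """Return the comment to post (or update), or None if CI is recent enough."""
    age = now - last_run
    if age < max_age and commits_since < max_commits:
        return None  # CI result is still fresh: remove any earlier comment
    return (f"I noticed that CI was last run {age.days} days ago "
            f"({commits_since} commits were merged since). Want me to run CI again?")


now = datetime(2017, 12, 1, tzinfo=timezone.utc)
print(stale_ci_comment(now - timedelta(days=10), now, 3))
```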

Note: ideally, GitHub would allow setting a "stale" threshold for checks, and block merge, but as far as I'm aware it only allows "checks passed" as a constraint.

what-went-wrong-bot

CI sometimes fails (surprise!); sometimes related to the pull request (awesome, it caught a bug!), sometimes because of a bad ("flaky") test.

Discovering what went wrong is time-consuming: you have to open the CI logs and scroll through thousands of lines of output (or search for keywords).

The what-went-wrong-bot would:

  • Summarize CI failures (ideally, formatted with <summary> / <details> for each test that failed)
  • Show a link to the source code of each failing test
  • If many tests failed, skip the per-test summary (as it's likely not informative)
  • Detect general "Jenkins" failures? (just a thought; Jenkins or a Jenkins plugin sometimes fails)
  • Super useful (but likely out of scope for a bit) would be to keep track of failing tests, and automatically mark them as "flaky" (or just statistics of failing tests)
  • Link to issues that keep track of "flaky tests" ("This test is marked as 'flaky' in issue <link>")
  • Allow marking tests as flaky ("mark test X as flaky, or have a clickable link")
  • Git blame links for failing tests? 😇
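
The summarizing part could be sketched as follows. This assumes the CI log contains standard `go test -v` output, where a failed test starts with a "--- FAIL: TestName" line followed by indented output; other test runners would need a different pattern. Function names and the `max_tests` cut-off are made up, and the test output is not HTML-escaped here.

```python
import re
from typing import Dict, List, Optional

FAIL_LINE = re.compile(r"^--- FAIL: (\S+)")


def failed_tests(log: str) -> Dict[str, List[str]]:
    """Map each failed test name to the indented output lines that follow it."""
    failures: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in log.splitlines():
        match = FAIL_LINE.match(line)
        if match:
            current = match.group(1)
            failures[current] = []
        elif current is not None and line.startswith((" ", "\t")):
            failures[current].append(line.strip())
        else:
            current = None
    return failures


def summarize(log: str, max_tests: int = 10) -> str:
    """Format failures as collapsible <details> blocks for a GitHub comment."""
    failures = failed_tests(log)
    if len(failures) > max_tests:
        return f"{len(failures)} tests failed; skipping the per-test summary."
    parts = []
    for name, output in failures.items():
        body = "\n".join(output)
        parts.append(f"<details>\n<summary>{name}</summary>\n\n{body}\n\n</details>")
    return "\n".join(parts)
```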

Other bots

something-something-bot

As mentioned in some of the descriptions above (just thinking out loud): have the bot leave comments with actionable links (click the link to interact).

remind-me-bot

Like Slack! "remind x to review this tomorrow"

  • A picture of a cute animal (not mandatory but encouraged)

solar-tortoise-toy-puzzle-funny-cute-little

Image taken from (spam alert 😂): "dhgate.com"

Labels: area/project, kind/enhancement