Description
Describe the feature or problem you’d like to solve
In testing against larger repositories with > 100,000 PRs, the gh pr status command times out or comes close to it. Looking at the queries with GH_DEBUG=api enabled, the command requests much more data than it displays, including the very verbose and often-slow statusCheckRollup objects.
Proposed solution
A checkRunCountsByState field was recently added to the contexts connection of the StatusCheckRollup GraphQL objects, and it is dramatically faster to resolve. In my experiments, the same queries rewritten to use the new field returned results within 1-2 seconds against the same data. An example query is provided below; the variables can be taken from the debug output of your own GH_DEBUG=api gh pr status run.
Gentler GraphQL query
fragment pr on PullRequest {
  number
  title
  state
  url
  isDraft
  isCrossRepository
  headRefName
  headRepositoryOwner {
    id
    login
    ... on User {
      name
    }
  }
  statusCheckRollup: commits(last: 1) {
    nodes {
      commit {
        statusCheckRollup {
          contexts {
            checkRunCount
            checkRunCountsByState {
              state
              count
            }
            statusContextCount
            statusContextCountsByState {
              state
              count
            }
          }
        }
      }
    }
  }
  baseRef {
    branchProtectionRule {
      requiresStrictStatusChecks
    }
  }
}
fragment prWithReviews on PullRequest {
  ...pr
  reviewDecision
  latestReviews(first: 100) {
    nodes {
      author {
        login
      }
      authorAssociation
      submittedAt
      body
      state
    }
  }
}
query PullRequestStatus($owner: String!, $repo: String!, $viewerQuery: String!, $reviewerQuery: String!, $per_page: Int = 10) {
  repository(owner: $owner, name: $repo) {
    defaultBranchRef {
      name
    }
    pullRequests(first: $per_page, orderBy: {field: CREATED_AT, direction: DESC}) {
      totalCount
      edges {
        node {
          ...prWithReviews
        }
      }
    }
  }
  viewerCreated: search(query: $viewerQuery, type: ISSUE, first: $per_page) {
    totalCount: issueCount
    edges {
      node {
        ...prWithReviews
      }
    }
  }
  reviewRequested: search(query: $reviewerQuery, type: ISSUE, first: $per_page) {
    totalCount: issueCount
    edges {
      node {
        ...pr
      }
    }
  }
}
Additional context
I recognize that the command can return much more of the requested data with the --json flag. It might therefore make sense to use the lighter query only for the command's standard output, and to let the --json flag keep using the current heavier query so that existing workflows don't break.
I also haven't looked through the code to see if there are other queries in which these new fields could be used, nor have I checked to see if any other fields are being requested that aren't used in the standard status output, so those might be additional avenues for optimizing queries.