Skip to content

gh pr status GraphQL query requests more StatusCheckRollup data than required, causing timeouts in large repositories #7421

@rufo

Description

@rufo

Describe the feature or problem you’d like to solve

In testing against larger repositories with > 100,000 PRs, the gh pr status command times out or comes close to it. Looking at the queries with GH_DEBUG=api enabled, the command requests much more data than it displays, including the very verbose and often-slow statusCheckRollup objects.

Proposed solution

Recently a checkRunsByState field was added to the StatusCheckRollup GraphQL objects that is dramatically faster - in experimenting with the same queries, but using the new field returns results dramatically faster against the same data, within 1-2 seconds. An example query is provided below; the variables can be provided from running your own GH_DEBUG=api gh pr status command.

Gentler GraphQL query
fragment pr on PullRequest {
  number
  title
  state
  url
  isDraft
  isCrossRepository
  headRefName
  headRepositoryOwner {
    id
    login
    ... on User {
      name
    }
  }
  statusCheckRollup: commits(last: 1) {
    nodes {
      commit {
        statusCheckRollup {
          contexts {
            checkRunCount
            checkRunCountsByState {
              state
              count
            }
            statusContextCount
            statusContextCountsByState {
              state
              count
            }
          }
        }
      }
    }
  }
  baseRef {
    branchProtectionRule {
      requiresStrictStatusChecks
    }
  }
}

fragment prWithReviews on PullRequest {
  ...pr
  reviewDecision
  latestReviews(first: 100) {
    nodes {
      author {
        login
      }
      authorAssociation
      submittedAt
      body
      state
    }
  }
}

query PullRequestStatus($owner: String!, $repo: String!, $viewerQuery: String!, $reviewerQuery: String!, $per_page: Int = 10) {
  repository(owner: $owner, name: $repo) {
    defaultBranchRef {
      name
    }
    pullRequests(first: $per_page, orderBy: {field: CREATED_AT, direction: DESC}) {
      totalCount
      edges {
        node {
          ...prWithReviews
        }
      }
    }
  }
  viewerCreated: search(query: $viewerQuery, type: ISSUE, first: $per_page) {
    totalCount: issueCount
    edges {
      node {
        ...prWithReviews
      }
    }
  }
  reviewRequested: search(query: $reviewerQuery, type: ISSUE, first: $per_page) {
    totalCount: issueCount
    edges {
      node {
        ...pr
      }
    }
  }
}

Additional context

I do recognize that the command can return much more of the requested data with the --json flag, so I'm not sure if it might make sense to only use a lighter query if running for the command's standard output while allowing the --json flag to use the current heavier query to help prevent current workflows from breaking.

I also haven't looked through the code to see if there are other queries in which these new fields could be used, nor have I checked to see if any other fields are being requested that aren't used in the standard status output, so those might be additional avenues for optimizing queries.

Metadata

Metadata

Assignees

Labels

enhancementa request to improve CLI

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions