resource/github_repository_file: Add path parameter to reduce Github API calls #589
jcudit merged 2 commits into integrations:master
If you have a repository with a high number of commits, you will quickly hit the GitHub API rate limit when trying to get all the commits for the repo. Adding the path parameter limits the returned commits to those that include the file in question, which significantly reduces the number of API calls.
Thanks for this optimization! Seems like a great candidate for /cc https://github.com/terraform-providers/terraform-provider-github/pull/589
jcudit left a comment
v4.0.0 took around 8 minutes to create 100 files in a repo. Destroying failed due to being rate limited for a duration longer than the command timeout.
Apply complete! Resources: 101 added, 0 changed, 0 destroyed.
real 8m44.234s
user 0m4.589s
sys 0m2.522s
Error: GET https://api.github.com/repos/terraformtesting/repository-file-test/branches/main: 403 API rate limit of 5000 still exceeded until 2020-11-12 17:45:30 -0500 EST, not making remote request. [rate reset in 14m41s]
real 30m31.630s
user 0m5.956s
sys 0m3.690s
This new version clocked better on both creates and destroys:
Apply complete! Resources: 101 added, 0 changed, 0 destroyed.
real 4m26.469s
user 0m3.410s
sys 0m1.522s
@cnelissen thanks for this. Welcome any other optimizations you can spot in the future!
If you have a repository with a very high number of commits and you try to add a managed file using this provider, you will hit the GitHub API rate limit before the state for the file can be refreshed. The code currently lists ALL repository commits (https://github.com/terraform-providers/terraform-provider-github/blob/900aa92730cea0f5438063bfd410330749e89be3/github/resource_github_repository_file.go#L342), then iterates through them to check whether each commit has the managed file in its tree, and if so, returns the commit info. Since GitHub limits API calls to 5,000 per hour, any repository with ~5,000 or more commits in its history will not be able to complete even one state refresh before hitting the limit.
Luckily, the GitHub API supports a path parameter (/repos/:owner/:repo/commits?path=path/to/file), which returns only the commits for that file. This significantly reduces the number of API calls the provider needs to make to return the correct information for state.
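
To make the request shape concrete, here is a minimal sketch in Go of how the commits URL changes once the path parameter is added. It uses only the standard library rather than the provider's actual client code; `commitsURL` is a hypothetical helper written for illustration, and the `docs/README.md` file path is made up.

```go
package main

import (
	"fmt"
	"net/url"
)

// commitsURL is a hypothetical helper (not part of the provider).
// It builds the "list commits" endpoint URL for a repository and,
// when filePath is non-empty, adds the path query parameter so that
// only commits touching that file are requested.
func commitsURL(owner, repo, filePath string) string {
	u := url.URL{
		Scheme: "https",
		Host:   "api.github.com",
		Path:   fmt.Sprintf("/repos/%s/%s/commits", owner, repo),
	}
	if filePath != "" {
		q := url.Values{}
		q.Set("path", filePath) // query-escaped, so "/" becomes "%2F"
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	// Without the path parameter: every commit in the repository is listed.
	fmt.Println(commitsURL("terraformtesting", "repository-file-test", ""))
	// With the path parameter: only commits for the one file are listed.
	// The file path here is an assumed example.
	fmt.Println(commitsURL("terraformtesting", "repository-file-test", "docs/README.md"))
}
```

The filter is applied server-side, so the provider no longer has to walk the full commit history and inspect each commit to find the ones relevant to the managed file.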