fix: Github source pagination #11727
Merged
kodiakhq[bot] merged 2 commits into cloudquery:main on Jun 24, 2023
Force-pushed: b008702 to e5d48c4
candiduslynx (Contributor) approved these changes on Jun 24, 2023:
Great find, thanks!
kodiakhq bot pushed a commit that referenced this pull request on Jul 4, 2023:

🤖 I have created a release *beep* *boop*

---

## [6.0.3](plugins-source-github-v6.0.2...plugins-source-github-v6.0.3) (2023-07-04)

### Bug Fixes

* **deps:** Update github.com/apache/arrow/go/v13 digest to 5a06b2e ([#11857](#11857)) ([43c2f5f](43c2f5f))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 0656028 ([#11739](#11739)) ([7a6ad49](7a6ad49))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 1e68c51 ([#11637](#11637)) ([46043bc](46043bc))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 43638cb ([#11672](#11672)) ([3c60bbb](3c60bbb))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 4d76231 ([#11532](#11532)) ([6f04233](6f04233))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 8366a22 ([#11717](#11717)) ([8eeff5b](8eeff5b))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to 95d3199 ([#11708](#11708)) ([03f214f](03f214f))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to b0832be ([#11651](#11651)) ([71e8c29](71e8c29))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to d864719 ([#11611](#11611)) ([557a290](557a290))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to df3b664 ([#11882](#11882)) ([9635b22](9635b22))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to f060192 ([#11730](#11730)) ([c7019c2](c7019c2))
* **deps:** Update github.com/cloudquery/arrow/go/v13 digest to f0dffc6 ([#11689](#11689)) ([18ac0e9](18ac0e9))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.1.0 ([#11665](#11665)) ([d8947c9](d8947c9))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.2.0 ([#11720](#11720)) ([7ef521d](7ef521d))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.2.1 ([#11722](#11722)) ([309be72](309be72))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.3.3 ([#11726](#11726)) ([f0ca611](f0ca611))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.3.4 ([#11753](#11753)) ([cd4fe1c](cd4fe1c))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.5.0 ([#11850](#11850)) ([3255857](3255857))
* **deps:** Update module github.com/cloudquery/plugin-pb-go to v1.6.0 ([#11916](#11916)) ([421e752](421e752))
* **deps:** Update module github.com/cloudquery/plugin-sdk/v3 to v3.10.6 ([#11473](#11473)) ([7272133](7272133))
* Github source pagination ([#11727](#11727)) ([f830ede](f830ede))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
akash1810 pushed a commit to guardian/service-catalogue that referenced this pull request on Jul 6, 2023:
cloudquery/cloudquery#11582 (comment) This is to fix some missing data. See: cloudquery/cloudquery#11727
Summary
It appears that GitHub pagination is not working correctly. Specifically, we are missing the final page of results.
For the recommended way to perform pagination, see: https://github.com/google/go-github/#pagination.
It is fair to say that the behaviour of Google's go-github library here is a bit confusing and easy to get wrong!
If this is correct, I am surprised the issue was not detected sooner, so please treat my findings with caution and confirm them for yourselves.
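As a rough illustration of the loop shape the go-github README recommends, here is a minimal, self-contained sketch. It does not call the real library: `page`, `listPage` and `fetchAll` are hypothetical stand-ins, and `nextPage` only mimics the documented behaviour of go-github's `Response.NextPage` (zero when there are no further pages). The point is the ordering: keep the current page's results first, then decide whether to stop.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one API response. nextPage mimics
// go-github's Response.NextPage: it is 0 when there are no further pages.
type page struct {
	items    []int
	nextPage int
}

// listPage is a hypothetical in-memory pager over items with IDs 1..total,
// serving at most perPage items per call. Page numbers are 1-based.
func listPage(total, perPage, pageNum int) page {
	if pageNum < 1 {
		pageNum = 1
	}
	start := (pageNum - 1) * perPage
	if start >= total {
		return page{}
	}
	end := start + perPage
	if end > total {
		end = total
	}
	items := make([]int, 0, end-start)
	for id := start + 1; id <= end; id++ {
		items = append(items, id)
	}
	next := 0
	if end < total {
		next = pageNum + 1
	}
	return page{items: items, nextPage: next}
}

// fetchAll appends the current page BEFORE checking whether to stop,
// so the final page is never dropped.
func fetchAll(total, perPage int) []int {
	var all []int
	pageNum := 1
	for {
		p := listPage(total, perPage, pageNum)
		all = append(all, p.items...)
		if p.nextPage == 0 {
			break
		}
		pageNum = p.nextPage
	}
	return all
}

func main() {
	fmt.Println(len(fetchAll(250, 100)))
	fmt.Println(len(fetchAll(42, 100)))
}
```

With the real library, the same shape would set `ListOptions.Page` from `Response.NextPage` on each iteration; the exact call sites in the plugin are not shown here.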
How diagnosed:
Note: I have only confirmed that this impacts the `github_team_repositories` table, but I assume from the code that this issue is actually much more widespread.

The issue initially manifested as suspicious data. Running the following query on a Postgres instance we use as a destination:

    select team_id, count(team_id) from github_team_repositories group by team_id;

yielded some surprisingly round numbers (all divisible by 100, other than results below 100):
Following these changes, the numbers look more 'natural'.
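To see why dropping the final page would produce such round counts, here is a small arithmetic sketch. It models only the symptom, not the plugin's actual code: `rowsIfLastPageDropped` is a hypothetical helper, the page size of 100 is an assumption inferred from the "divisible by 100" observation, and single-page results are assumed to be synced in full because counts below 100 were seen.

```go
package main

import "fmt"

// rowsIfLastPageDropped is a hypothetical model of the observed symptom:
// single-page results are kept in full, while multi-page results lose
// their final page. It is not taken from the plugin's code.
func rowsIfLastPageDropped(total, perPage int) int {
	if total <= perPage {
		return total
	}
	pages := (total + perPage - 1) / perPage // ceiling division
	return (pages - 1) * perPage
}

func main() {
	// perPage = 100 is an assumption inferred from the round counts.
	for _, total := range []int{42, 100, 101, 250, 300} {
		fmt.Printf("actual=%d synced=%d\n", total, rowsIfLastPageDropped(total, 100))
	}
}
```

Under this model every multi-page result is a whole number of full pages, so every count above the page size comes out as a multiple of it, which matches the pattern described above.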