feat: Add sharding support #19169

Merged
kodiakhq[bot] merged 6 commits into cloudquery:main from erezrokah:feat/sharding on Sep 19, 2024

@erezrokah
Member

Summary

We'll need to release a few plugins with cloudquery/plugin-sdk#1891 first, hence the future date in the command description.

@erezrokah erezrokah requested review from a team and marianogappa and removed request for a team September 17, 2024 17:58
syncTime := time.Now().UTC().Truncate(time.Microsecond)
sourceName := sourceSpec.Name
if shard != nil {
sourceName = fmt.Sprintf("%s-%d/%d", sourceName, shard.num, shard.total)
Member Author

@erezrokah erezrokah Sep 17, 2024

This is needed to support overwrite-delete-stale; otherwise, shards delete each other's data at the end of the sync.

We can use a better name for it, maybe %s_shard_%d_%d

cli/cmd/sync.go Outdated
cmd.Flags().String("license", "", "set offline license file")
cmd.Flags().String("summary-location", "", "Sync summary file location. This feature is in Preview. Please provide feedback to help us improve it.")
cmd.Flags().String("tables-metrics-location", "", "Tables metrics file location. This feature is in Preview. Please provide feedback to help us improve it. Works with plugins released on 2024-07-10 or later.")
cmd.Flags().String("shard", "", "Allows splitting the sync process into multiple shards. This feature is in Preview. Please provide feedback to help us improve it. Works with plugins released on 2024-09-24 or later.")
Contributor

This does not mention that it only works for plugins using the default scheduler (e.g. not S3), but I realise it's a little tricky to put this in words.

Member Author

Yeah, it doesn't work with Docker plugins either 🙃

I think that will be in the update of https://docs.cloudquery.io/docs/advanced-topics/running-cloudquery-in-parallel and https://docs.cloudquery.io/docs/deployment/github-actions

Member Author

@erezrokah erezrokah Sep 19, 2024

OK, so I did fff4d1d. I'll update the docs with a table of supported plugins and versions.
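
For context on the `shard` flag discussed in this thread: the diff registers it as a plain string, and the sync code reads `shard.num` and `shard.total`. The sketch below shows one way such a value could be parsed and validated. It assumes a `num/total` format such as `1/4`, which this conversation does not confirm; `parseShard` and the `shard` type are hypothetical names, not the CLI's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// shard is a hypothetical type holding the two fields the diff refers to.
type shard struct {
	num, total int
}

// parseShard parses an assumed "num/total" value. An empty string means
// sharding is disabled and yields a nil shard with no error.
func parseShard(value string) (*shard, error) {
	if value == "" {
		return nil, nil
	}
	numStr, totalStr, ok := strings.Cut(value, "/")
	if !ok {
		return nil, fmt.Errorf("invalid shard %q: expected num/total", value)
	}
	num, err := strconv.Atoi(numStr)
	if err != nil {
		return nil, fmt.Errorf("invalid shard number %q: %w", numStr, err)
	}
	total, err := strconv.Atoi(totalStr)
	if err != nil {
		return nil, fmt.Errorf("invalid shard total %q: %w", totalStr, err)
	}
	// Shards are assumed to be 1-based, so num must lie in [1, total].
	if total < 1 || num < 1 || num > total {
		return nil, fmt.Errorf("invalid shard %q: need 1 <= num <= total", value)
	}
	return &shard{num: num, total: total}, nil
}

func main() {
	for _, v := range []string{"", "1/4", "5/4", "abc"} {
		s, err := parseShard(v)
		fmt.Printf("%q -> shard=%v err=%v\n", v, s, err)
	}
}
```

Returning a nil shard for an empty value matches the `if shard != nil` check in the diff, so an unsharded sync keeps its original source name.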

@erezrokah
Member Author

I'll merge this once we have at least AWS, GCP and Azure published with this support

@erezrokah erezrokah added the automerge Automatically merge once required checks pass label Sep 19, 2024
@kodiakhq kodiakhq bot merged commit e9dfd0b into cloudquery:main Sep 19, 2024
kodiakhq bot pushed a commit that referenced this pull request Sep 19, 2024
🤖 I have created a release *beep* *boop*
---


## [6.8.0](cli-v6.7.1...cli-v6.8.0) (2024-09-19)


### Features

* Add sharding support ([#19169](#19169)) ([e9dfd0b](e9dfd0b))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Labels

area/cli area/website automerge Automatically merge once required checks pass

3 participants