Conversation
3ee57ec to
21ce5d1
Compare
21ce5d1 to
1014bbc
Compare
Co-authored-by: Herman Schaaf <hermanschaaf@gmail.com>
erezrokah
left a comment
There was a problem hiding this comment.
This looks much better 🚀 Added a few comments
plugins/source.go
Outdated
| @@ -127,24 +132,22 @@ func (p *SourcePlugin) Sync(ctx context.Context, spec specs.Source, res chan<- * | |||
|
|
|||
| // limiter used to limit the amount of resources fetched concurrently | |||
| maxGoroutines := spec.MaxGoRoutines | |||
There was a problem hiding this comment.
How about changing MaxGoRoutines to Concurrency? This will give us some flexibility in how we implement it. For example we might want to add additional parallelism for specific resources/relations
plugins/source.go
Outdated
| const minGoRoutines = 5 | ||
| const ( | ||
| minGoRoutines = 5 | ||
| defaultMaxGoRoutines = 500000 |
There was a problem hiding this comment.
| defaultMaxGoRoutines = 500000 | |
| defaultMaxGoRoutines = 500000 |
Does this mean that by default we spin up 500000 go routines? Also what's the reason for having a hard min? Shouldn't we have only GoRoutinesCount and ensure it's not < 0 (0 meaning default value)
There was a problem hiding this comment.
Yeah I guess you are right - it was from older piece of code where we had an issue with 1-4 go routines.
There was a problem hiding this comment.
Approved with a comment on the default var name.
Also this is also a breaking change so better to merge as fix!:.
Finally I think the current default means "unlimited" at the moment. Not sure that's the best default as we might hit rate limits quite fast
Co-authored-by: Erez Rokah <erezrokah@users.noreply.github.com>
Co-authored-by: Erez Rokah <erezrokah@users.noreply.github.com>
Thanks! The default means 500K which is I think reasonable for 1 CPU 4GB RAM which should be pretty much anyone |
I should have explained it better. I'm not worried about the machine specs, but being throttled/rate limited by the service API |
I think concurrency is the right parameter are mainly to deal with machine spec and too many open files and so on. and max_back_off, max_retries are usually the right params to deal with throttle on the server side |
We had concurrency limits disabled. Fixed it and also fixed the issue that we had with timings (now the time really shows how long it took for a goroutine to fetch without wait time).
This should work instead of #103
Use the following steps to ensure your PR is ready to be reviewed
go fmtto format your code 🖊golangci-lint run🚨 (install golangci-lint here)