very poor performance on aws_emr_clusters #2392

@jimr6007

Describe the Bug

With my source spec as follows:

kind: source
spec:
  name: aws
  version: v1.0.1
  destinations:
    - postgresql
  tables:
    - aws_emr_clusters
  spec:
    regions:
      - us-east-1
      - us-east-2
      - us-west-1
      - us-west-2
    accounts:
      - id: foursquare
        local_profile: foursquare-administrator

The cq-cli sync run starts out at fewer than 10 resources per second right from the beginning.

I have been troubleshooting poor performance for a few days now, and I'm pretty sure I have narrowed it down to this table being the culprit.

Expected Behavior

A sync of aws_emr_clusters takes a "reasonable" amount of time.
I know it's hard to say what is reasonable, but over the past months we've seen it take anywhere from 5-6 hours to nearly 24 hours.
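
For a rough sense of scale, here is a minimal sketch that turns the durations above into an upper bound on the number of resources fetched. It assumes the rate holds at the 10 resources per second ceiling reported earlier for the whole run; that constant-rate model is an assumption, and the helper name `max_resources` is hypothetical.

```python
# Upper bound on resources synced at a constant rate.
# The 10/s figure is the ceiling reported in this issue; treating it as
# constant for the whole run is an assumption made for illustration.
RATE_PER_SECOND = 10


def max_resources(hours: float, rate_per_second: float = RATE_PER_SECOND) -> int:
    """Return the most resources a sync could fetch in `hours` at the given rate."""
    return int(hours * 3600 * rate_per_second)


for hours in (5, 6, 24):
    print(f"{hours:>2} h -> at most {max_resources(hours):,} resources")
```

Since the observed rate is below 10 per second, the real totals would be lower than these bounds.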

CloudQuery Version

Started out with a 0.x release, but currently on cloudquery version 1.0.2,
with 1.0.1 of the aws plugin
and 1.0.0 of the postgres plugin.

Debug Output

As of today I'm seeing no logs.
This could be a regression of the issue I mentioned on Discord yesterday:
https://discord.com/channels/872925471417962546/873606591335759872/1026612519550320651
https://discord.com/channels/872925471417962546/873606591335759872/1026616765423308852

Steps to Reproduce

Run cloudquery sync with a source plugin configured as described at the beginning of this issue report.

Additional Context

References

It's sort of related to #2299,
but only in that this all started with me asking "why is this taking so long?"
