fix: Improve Postgres performance by erezrokah · Pull Request #13318 · cloudquery/cloudquery

erezrokah · 2023-08-24T19:55:44Z

Summary

I still need to figure out how to replace the logic I deleted, but listing the information schema on each batch insert creates a bottleneck (I saw it in pprof). This is probably due to read/write locks, or because our queries to list tables and columns are slow.
The changes in this PR result in roughly a 10x improvement.
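
One way to avoid the per-batch lookup is to list the schema once and reuse the result. The sketch below is not the code from this PR: `schemaCache`, `listFn`, `countListCalls`, and the sample table and column names are all hypothetical, and `listFn` stands in for whatever query reads the information schema.

```go
package main

import (
	"fmt"
	"sync"
)

// tableColumns maps a table name to its column names.
type tableColumns map[string][]string

// schemaCache is a hypothetical helper that loads the schema once and reuses
// it, instead of querying the information schema on every batch insert.
type schemaCache struct {
	mu     sync.Mutex
	loaded bool
	tables tableColumns
	listFn func() (tableColumns, error) // stand-in for the real listing query
}

// get returns the cached schema and calls listFn only on first use
// or after invalidate.
func (c *schemaCache) get() (tableColumns, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.loaded {
		return c.tables, nil
	}
	t, err := c.listFn()
	if err != nil {
		return nil, err
	}
	c.tables, c.loaded = t, true
	return t, nil
}

// invalidate forces the next get to list again, for example after a migration.
func (c *schemaCache) invalidate() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.loaded = false
}

// countListCalls simulates a number of batch inserts against one cache and
// reports how many times the expensive listing actually ran.
func countListCalls(batches int) int {
	calls := 0
	cache := &schemaCache{listFn: func() (tableColumns, error) {
		calls++
		// Illustrative data only.
		return tableColumns{"aws_ec2_instances": {"_cq_id", "arn"}}, nil
	}}
	for i := 0; i < batches; i++ {
		if _, err := cache.get(); err != nil {
			panic(err)
		}
	}
	return calls
}

func main() {
	fmt.Println("list calls for 100 batches:", countListCalls(100))
}
```

The trade-off is staleness: anything that changes tables or columns has to call `invalidate`, otherwise inserts run against an outdated view of the schema.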

Spec used (I used an old source since I tested on an old Postgres version to get the expected numbers):

kind: source
spec:
  name: aws
  registry: "github"
  path: "cloudquery/aws"
  version: "v19.0.0"
  tables:
    - "*"
  skip_tables:
    - "aws_cloudtrail_*"
    - "aws_iam_*"
    - "aws_servicequotas_*"
  destinations:
    - postgresql
---
kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  version: "v5.0.5"
  migrate_mode: forced
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"

Before:

After (with Postgres running from localhost with this fix):

candiduslynx · 2023-08-24T20:04:59Z

plugins/destination/postgresql/client/spec.go

 	defaultBatchSize      = 10000
-	defaultBatchSizeBytes = 1000000
-	defaultBatchTimeout   = 10 * time.Second
+	defaultBatchSizeBytes = 100000000


also should be in the doc + should be a major bump

erezrokah · 2023-08-25T05:47:14Z

Going to split this PR so we can release the removal of the table listing as a non-breaking change

erezrokah · 2023-08-25T09:45:26Z

Closing in favor of #13324 and #13323

#### Summary

Extracted from #13318. Witnessed the bottleneck using `pprof`.

With default batch settings I'm getting `2m29s` sync time instead of ~`11m`. With `batch_size_bytes: 100000000` and `batch_timeout: 60s` I'm getting `1m40s`. (Fixed in #13324)

Used spec:

```yaml
kind: source
spec:
  name: aws
  path: "cloudquery/aws"
  version: "v19.0.0"
  tables:
    - "*"
  skip_tables:
    - "aws_cloudtrail_*"
    - "aws_iam_*"
    - "aws_servicequotas_*"
  destinations:
    - postgresql
---
kind: destination
spec:
  name: "postgresql"
  registry: "grpc"
  path: localhost:8888
  # path: cloudquery/postgresql
  # version "v5.0.5"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres?sslmode=disable"
    # batch_size_bytes: 100000000
    # batch_timeout: "60s"
```

#### Summary

Extracted from #13318.

BEGIN_COMMIT_OVERRIDE
feat: Increase default batch size bytes to `100000000` (100 MB) and default batch timeout to `60` seconds.

BREAKING-CHANGE: Increase default batch size bytes to `100000000` (100 MB) and default batch timeout to `60` seconds. We discovered that higher default batch size bytes and timeout settings provide better out-of-the-box performance for the PostgreSQL destination. We're marking it as a breaking change as it might increase memory consumption in some environments.
END_COMMIT_OVERRIDE
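
A rough way to see why a larger byte limit helps is to count how many flushes (insert round trips) a fixed workload needs under each setting. The simulation below is only a sketch: `limits` and `countFlushes` are hypothetical, the flush rule (flush once either the row or byte limit is reached) is an assumption about how batching behaves, the 500 bytes per row is an invented figure, and the timeout is left out entirely.

```go
package main

import "fmt"

// limits holds the two size-based flush thresholds (timeout is not modeled).
type limits struct {
	maxRows  int
	maxBytes int
}

// countFlushes simulates writing totalRows rows of rowBytes each and returns
// how many batches get flushed under the given limits.
func countFlushes(totalRows, rowBytes int, l limits) int {
	flushes, rows, bytes := 0, 0, 0
	for i := 0; i < totalRows; i++ {
		rows++
		bytes += rowBytes
		if rows >= l.maxRows || bytes >= l.maxBytes {
			flushes++
			rows, bytes = 0, 0
		}
	}
	if rows > 0 {
		flushes++ // final partial batch
	}
	return flushes
}

func main() {
	oldLimits := limits{maxRows: 10000, maxBytes: 1000000}   // 1 MB
	newLimits := limits{maxRows: 10000, maxBytes: 100000000} // 100 MB

	// Assumed workload: 50,000 rows of 500 bytes each.
	fmt.Println("flushes with 1 MB limit:  ", countFlushes(50000, 500, oldLimits))
	fmt.Println("flushes with 100 MB limit:", countFlushes(50000, 500, newLimits))
}
```

With these assumed numbers the 1 MB limit is hit every 2,000 rows, so the byte limit rather than the row limit decides the batch size. Raising it lets batches grow to the full 10,000 rows, at the cost of holding more data in memory per batch.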

fix: Improve Postgres performance

e66ed02

cq-bot added the postgresql label Aug 24, 2023

candiduslynx reviewed Aug 24, 2023

This was referenced Aug 25, 2023

fix: Don't list Postgres tables during insert #13323

Merged

feat: Increase default batch size bytes and timeout #13324

Merged

erezrokah closed this Aug 25, 2023

erezrokah deleted the fix/postgres_performance branch August 25, 2023 09:45

erezrokah wants to merge 1 commit into cloudquery:main from erezrokah:fix/postgres_performance
