postgresql: use cyrilgdn/terraform-provider-postgresql#448 #11

Merged
bobheadxi merged 1 commit into main from postgresql-gcp-impersonate
Jun 17, 2024

@bobheadxi bobheadxi commented Jun 17, 2024

@bobheadxi bobheadxi requested review from a team and michaellzc June 17, 2024 20:58
@bobheadxi bobheadxi merged commit f286e77 into main Jun 17, 2024
@bobheadxi bobheadxi deleted the postgresql-gcp-impersonate branch June 17, 2024 21:01
bobheadxi added a commit to sourcegraph/sourcegraph-public-snapshot that referenced this pull request Jul 5, 2024
…ream (#63092)

Adds a new `postgreSQL.logicalReplication` configuration to allow MSP to
generate prerequisite setup for integration with Datastream:
https://cloud.google.com/datastream/docs/sources-postgresql. Integration
with Datastream allows the Data Analytics team to self-serve data
enrichment needs for the Telemetry V2 pipeline.

Enabling this feature entails downtime (Cloud SQL instance restart), so
enabling the logical replication feature at the Cloud SQL level
(`cloudsql.logical_decoding`) is gated behind
`postgreSQL.logicalReplication: {}`.

The required Postgres setup is somewhat involved and needs 3 Postgres
provider instances:

1. The default admin one, authenticated with our admin user
2. New: a workload identity provider, using
cyrilgdn/terraform-provider-postgresql#448 /
sourcegraph/managed-services-platform-cdktf#11.
This is required for creating a publication on selected tables, which
requires being the owner of said tables. Because tables are created by
the application (e.g. via auto-migrate), the workload identity is always
the table owner, so we need to impersonate the IAM user.
3. New: a "replication user", which is created with the replication
permission. Replication does not seem to be a propagated permission, so
we need a role/user that has replication enabled.
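
For orientation, the two new pieces of Postgres state described above correspond roughly to the SQL built below. This is only a sketch under assumptions, not how MSP or the Terraform provider generates anything: the helper names and the role name `replicator` are hypothetical, and the statements rely only on standard PostgreSQL `CREATE PUBLICATION ... FOR TABLE` and `CREATE ROLE ... REPLICATION` syntax.

```python
from __future__ import annotations


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_publication_sql(name: str, tables: list[str]) -> str:
    """Build CREATE PUBLICATION for an explicit table list.

    The statement must be run by a role that owns every listed table,
    which is why the workload identity provider is needed.
    """
    if not tables:
        raise ValueError("a publication needs at least one table")
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(name)} FOR TABLE {table_list};"


def create_replication_role_sql(role: str) -> str:
    """Build CREATE ROLE for a login role with the REPLICATION attribute.

    Credentials are deliberately left out of this sketch.
    """
    return f"CREATE ROLE {quote_ident(role)} WITH LOGIN REPLICATION;"


if __name__ == "__main__":
    # "replicator" is a hypothetical role name used for illustration only.
    print(create_publication_sql("testing", ["users"]))
    print(create_replication_role_sql("replicator"))
```

The publication statement is the one that needs table ownership (provider 2); the role statement is what the admin provider (provider 1) would issue to set up the replication user (provider 3).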

A bit more context is scattered through the docstrings.

Beyond the Postgres configuration we also introduce some additional
resources to enable easy Datastream configuration:

1. Datastream Private Connection, which peers to the service private
network
2. Cloud SQL Proxy VM, which only allows connections to `:5432` from the
range specified in 1, allowing a connection to the Cloud SQL instance
3. Datastream Connection Profile attached to 1
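
The access rule on the Cloud SQL Proxy VM can be pictured as the check below. This is an illustrative sketch only: the function name and the `10.8.0.0/29` range are hypothetical, and the real rule is enforced by the network configuration, not by application code.

```python
import ipaddress

POSTGRES_PORT = 5432


def is_allowed(source_ip: str, dest_port: int, allowed_cidr: str) -> bool:
    """Return True only for traffic to :5432 that originates inside allowed_cidr.

    allowed_cidr stands in for the range of the Datastream Private Connection.
    """
    network = ipaddress.ip_network(allowed_cidr)
    return dest_port == POSTGRES_PORT and ipaddress.ip_address(source_ip) in network


if __name__ == "__main__":
    # Hypothetical range, used for illustration only.
    datastream_range = "10.8.0.0/29"
    print(is_allowed("10.8.0.5", 5432, datastream_range))
    print(is_allowed("10.8.1.5", 5432, datastream_range))
```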

From there, the data team can click-ops or otherwise manage the
Datastream Stream and BigQuery destination on their own.

Closes CORE-165
Closes CORE-212

Sample config:

```yaml
  resources:
    postgreSQL:
      databases:
        - "primary"
      logicalReplication:
        publications:
          - name: testing
            database: primary
            tables:
              - users
```
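
To make the shape of this config concrete, here is a small sketch that takes the `postgreSQL` block as a plain dict and applies two checks. The gating check mirrors the description above (any `logicalReplication` value, even `{}`, turns on `cloudsql.logical_decoding`). The validation rules are assumptions made for illustration, not MSP's actual validation, and both function names are hypothetical.

```python
from __future__ import annotations


def logical_decoding_enabled(postgresql: dict) -> bool:
    """Presence of logicalReplication, even as {}, enables the Cloud SQL flag."""
    return postgresql.get("logicalReplication") is not None


def validate_logical_replication(postgresql: dict) -> list[str]:
    """Return a list of problems found in the publications (assumed rules)."""
    replication = postgresql.get("logicalReplication")
    if replication is None:
        return []  # feature not enabled, nothing to check
    databases = set(postgresql.get("databases", []))
    errors = []
    for pub in replication.get("publications", []):
        name = pub.get("name", "<unnamed>")
        # Assumption: a publication must target a database listed in `databases`.
        if pub.get("database") not in databases:
            errors.append(f"publication {name}: unknown database {pub.get('database')!r}")
        # Assumption: a publication must select at least one table.
        if not pub.get("tables"):
            errors.append(f"publication {name}: no tables selected")
    return errors


if __name__ == "__main__":
    sample = {
        "databases": ["primary"],
        "logicalReplication": {
            "publications": [
                {"name": "testing", "database": "primary", "tables": ["users"]}
            ]
        },
    }
    print(logical_decoding_enabled(sample), validate_logical_replication(sample))
```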

## Test plan

sourcegraph/managed-services#1569

## Changelog

- MSP services can now configure `postgreSQL.logicalReplication` to
enable the Data Analytics team to replicate selected database tables
into BigQuery.