Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

alerts: Do not alert on long transactions in codeintel-db#36619

Merged
chrismwendt merged 3 commits into
mainfrom
do-not-alert-on-codeintel-db-long-transactions
Jul 26, 2022
Merged

alerts: Do not alert on long transactions in codeintel-db#36619
chrismwendt merged 3 commits into
mainfrom
do-not-alert-on-codeintel-db-long-transactions

Conversation

@chrismwendt

@chrismwendt chrismwendt commented Jun 6, 2022

Copy link
Copy Markdown
Contributor

During normal Rockskip indexing, some transactions that update big hops can take tens of minutes to finish. It would be nice if we could selectively ignore Rockskip transactions while still alerting on other codeintel-db transactions. Any idea how to do that easily? This PR omits codeintel-db from the check to at least stop the spurious alerts.

CleanShot 2022-06-05 at 19 46 08

From https://github.com/sourcegraph/customer/issues/976#issuecomment-1145369932

Test plan

Ran in Grafana

@chrismwendt chrismwendt requested a review from efritz June 6, 2022 01:45
@cla-bot cla-bot Bot added the cla-signed label Jun 6, 2022
@sourcegraph-bot

sourcegraph-bot commented Jun 6, 2022

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff 36c320a...02f0d68.

Notify File(s)
@bobheadxi monitoring/definitions/postgres.go
@slimsag monitoring/definitions/postgres.go
@sourcegraph/delivery doc/admin/observability/dashboards.md
@sourcegraph/distribution monitoring/definitions/postgres.go

@efritz

efritz commented Jul 6, 2022

Copy link
Copy Markdown
Contributor

@chrismwendt Is it possible to do batches of smaller transactions in the future? Note that long transactions will affect things like the ability to run schema migrations in a reasonable time (if txns are open for 10s of minutes, it can block lock access if we need to create an index, for example).

@chrismwendt

Copy link
Copy Markdown
Contributor Author

Unfortunately, holding long transactions is a fairly deeply-rooted requirement in the current implementation of Rockskip. Specifically, there's a single query which regularly updates ~50% of the symbols in the repo. After thinking about this for ~10 minutes it's still not obvious to me how best to break up the transactions, and actually making the change would probably take days.

That's not to say it's impossible, but I'm wondering if we can get away with keeping the long transactions. In which cases does blocking index creation cause problems? The first thing that comes to mind is the 1 hour timeout for DB liveness.

AFAICT, a long transaction only blocks index creation on tables that the transaction has written to:

thread 1: BEGIN;
thread 1: INSERT INTO table1 (column1) VALUES ('value1');
thread 2: CREATE INDEX CONCURRENTLY index1 ON table1 (column1); <-- 🚫 Blocked by thread 1
thread 1: COMMIT;
thread 1: BEGIN;
thread 1: INSERT INTO table1 (column1) VALUES ('value1');
thread 2: CREATE INDEX CONCURRENTLY index2 ON table2 (column1); <-- ✅ Not blocked by thread 1
                                              ^^^^^^ different table: "table2"
thread 1: COMMIT;

@efritz efritz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrismwendt Could you add a comment pre-merge about this being disabled because of rockskip processing? Then if we ever are able to reduce txn time we can revisit this alert.

@chrismwendt chrismwendt enabled auto-merge (squash) July 26, 2022 21:39
@chrismwendt chrismwendt merged commit 477f4cf into main Jul 26, 2022
@chrismwendt chrismwendt deleted the do-not-alert-on-codeintel-db-long-transactions branch July 26, 2022 22:20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants