This repository was archived by the owner on Sep 30, 2024. It is now read-only.
Eliminate usage stats deprecated package#40674
Merged
Merged
Conversation
Contributor
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff bdb0a0d...98e341f.
|
efritz
reviewed
Aug 22, 2022
efritz
reviewed
Aug 22, 2022
efritz
approved these changes
Aug 22, 2022
…stats package instead; add option to unique user count queries to exclude system/backend users
Co-authored-by: Erik Seliger <erikseliger@me.com>
…Usage -> SiteUsageCurrentPeriods; reduce number of slow SQL queries required to get usage stats for admin panel; update mocks and tests
495e6e9 to
aa6de8a
Compare
unknwon
approved these changes
Aug 24, 2022
Comment on lines
+658
to
+703
| FROM event_logs | ||
| WHERE (%s) | ||
| GROUP BY day_period, week_period, month_period, aggregated_user_id, registered | ||
| ), | ||
| unique_users_by_day AS ( | ||
| SELECT | ||
| day_period, | ||
| COUNT(DISTINCT aggregated_user_id) as count, | ||
| COUNT(DISTINCT aggregated_user_id) FILTER (WHERE registered) as count_registered | ||
| FROM unique_users_by_dwm | ||
| GROUP BY day_period | ||
| ), | ||
| unique_users_by_week AS ( | ||
| SELECT | ||
| week_period, | ||
| COUNT(DISTINCT aggregated_user_id) as count, | ||
| COUNT(DISTINCT aggregated_user_id) FILTER (WHERE registered) as count_registered | ||
| FROM unique_users_by_dwm | ||
| GROUP BY week_period | ||
| ), | ||
| unique_users_by_month AS ( | ||
| SELECT | ||
| month_period, | ||
| COUNT(DISTINCT aggregated_user_id) as count, | ||
| COUNT(DISTINCT aggregated_user_id) FILTER (WHERE registered) as count_registered | ||
| FROM unique_users_by_dwm | ||
| GROUP BY month_period | ||
| ) | ||
| SELECT | ||
| all_periods.period, | ||
| all_periods.type, | ||
| COALESCE(CASE WHEN all_periods.type = 'day' | ||
| THEN unique_users_by_day.count | ||
| ELSE CASE WHEN all_periods.type = 'week' | ||
| THEN unique_users_by_week.count | ||
| ELSE unique_users_by_month.count | ||
| END | ||
| END, 0) count, | ||
| COALESCE(CASE WHEN all_periods.type = 'day' | ||
| THEN unique_users_by_day.count_registered | ||
| ELSE CASE WHEN all_periods.type = 'week' | ||
| THEN unique_users_by_week.count_registered | ||
| ELSE unique_users_by_month.count_registered | ||
| END | ||
| END, 0) count_registered | ||
| FROM all_periods |
Contributor
mrnugget
added a commit
that referenced
this pull request
Oct 28, 2022
…overview page (#43591) This removes the "search code, find ref, search symbol" onboarding checklist from - global nav bar - users profile page - admin analytics overview page ## Why? Customer reported severe impact on database performance after upgrading from 3.43.2 to 4.0.1. The reason for the performance regression is a change made in #40674: the `user { usageStats }` GraphQL query would now hit PostgreSQL instead of Redis to get the usage stats. Here is the code path in 3.43.2 for `user { usageStats }`: https://github.com/sourcegraph/sourcegraph/blob/fd7599b9cbf3e80334a6167763b3c74ee9cddf9c/cmd/frontend/internal/usagestatsdeprecated/usage_stats.go#L42-L43 Here it is in 4.0+ https://github.com/sourcegraph/sourcegraph/blob/41e19f207038b4822dc9ebb052eb857e3dae9103/internal/usagestats/usage_stats.go#L110-L136 The SQL queries run by the latter have bad performance because they do multiple queries over one of the biggest tables we have: `event_logs`. Some of the performance problems with that have been addressed in https://github.com/sourcegraph/sourcegraph/pull/43564 and https://github.com/sourcegraph/sourcegraph/pull/43582. Big problem: **these queries are run on every page load, due to the `withActivation` wrapper we had around the `Layout` component.** That wrapper would load the `usageStatistics` for the currently-logged-in user on every page load to display that onboarding checklist at the top. On the customer's instance this resulted in slow queries piling up in the database, pinning CPU usage at 100% and exhausting the database connection limits. After discussing the issue in Slack, [we decided to remove this component altogether](https://sourcegraph.slack.com/archives/C03TGHH1S3V/p1666951569092379?thread_ts=1666747586.140109&cid=C03TGHH1S3V), since we deem the risk low: the component has been unchanged in 3 years, it's unloved and unowned, it has performance implications, and it most likely doesn't have any effect on user onboarding.
BolajiOlajide
pushed a commit
that referenced
this pull request
Oct 28, 2022
…overview page (#43591) This removes the "search code, find ref, search symbol" onboarding checklist from - global nav bar - users profile page - admin analytics overview page Customer reported severe impact on database performance after upgrading from 3.43.2 to 4.0.1. The reason for the performance regression is a change made in #40674: the `user { usageStats }` GraphQL query would now hit PostgreSQL instead of Redis to get the usage stats. Here is the code path in 3.43.2 for `user { usageStats }`: https://github.com/sourcegraph/sourcegraph/blob/fd7599b9cbf3e80334a6167763b3c74ee9cddf9c/cmd/frontend/internal/usagestatsdeprecated/usage_stats.go#L42-L43 Here it is in 4.0+ https://github.com/sourcegraph/sourcegraph/blob/41e19f207038b4822dc9ebb052eb857e3dae9103/internal/usagestats/usage_stats.go#L110-L136 The SQL queries run by the latter have bad performance because they do multiple queries over one of the biggest tables we have: `event_logs`. Some of the performance problems with that have been addressed in https://github.com/sourcegraph/sourcegraph/pull/43564 and https://github.com/sourcegraph/sourcegraph/pull/43582. Big problem: **these queries are run on every page load, due to the `withActivation` wrapper we had around the `Layout` component.** That wrapper would load the `usageStatistics` for the currently-logged-in user on every page load to display that onboarding checklist at the top. On the customer's instance this resulted in slow queries piling up in the database, pinning CPU usage at 100% and exhausting the database connection limits. After discussing the issue in Slack, [we decided to remove this component altogether](https://sourcegraph.slack.com/archives/C03TGHH1S3V/p1666951569092379?thread_ts=1666747586.140109&cid=C03TGHH1S3V), since we deem the risk low: the component has been unchanged in 3 years, it's unloved and unowned, it has performance implications, and it most likely doesn't have any effect on user onboarding.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Revisiting https://github.com/sourcegraph/sourcegraph/pull/38826
That PR had to be reverted due to slow performance and timeouts on the admin analytics page on dotcom. After testing locally with >30 million events in my local db, I could reproduce it (generating the admin chart took ~3 minutes).
The implementation relied on a SQL query that took ~20-30 seconds, which was called 9 times. I've slimmed it down to just one query that takes about the same amount of time (~30s).
First, I removed the integration user data—it was never actually used in any of the calls to this method.
Second, I changed the SQL to collect both registered and anon user data in a single pass.
Finally, I changed the SQL to aggregate users into DAU, WAU, and MAU buckets in a single pass as well.
I borrowed from the
SiteUsagequery (used for ping data aggregation), which is largely duplicative with this one, but is missing one key feature: It only allows you to pull the current DAUs, WAUs, and MAUs. If you want, e.g., "the last 3 months" of MAUs, you have to use this query/API call instead. I updated the naming of both functions to make it clear that they are related.For reviewers, fell free to just look at the last four commits. The first six are identical to the first PR (just rebased).
Test plan
Tested locally