Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

[fix] use same where condition in user analytics as in pings for mau#52306

Merged
kopancek merged 7 commits into
mainfrom
milan/site_usage_stats
May 26, 2023
Merged

[fix] use same where condition in user analytics as in pings for mau#52306
kopancek merged 7 commits into
mainfrom
milan/site_usage_stats

Conversation

@kopancek

@kopancek kopancek commented May 23, 2023

Copy link
Copy Markdown
Contributor

Previously we used a different condition to calculate the MAU and other stats in user analytics page. Now we use the same condition as we use for pings. We also calculate everything using UTC now, so both should give the same value.

Test plan

Tested locally and also with data from Tinder. The new query for MAU seems to be returning the correct data for us, I manually cross-referenced the old query vs the new. Apart from the timezone difference, the new query is more precise as in it filters the sourcegraph-operator users and non-user events in the same way as we do for the pings.

@kopancek kopancek self-assigned this May 23, 2023
@cla-bot cla-bot Bot added the cla-signed label May 23, 2023
Previously we used a different condition to calculate the MAU and other
stats in user analytics page. Now we use the same condition as we use for
pings. We also calculate everything using UTC now, so both should give
the same value.
@kopancek kopancek force-pushed the milan/site_usage_stats branch from 8f868a3 to 676946d Compare May 23, 2023 14:44
@kopancek kopancek marked this pull request as ready for review May 25, 2023 12:50
@sourcegraph-bot

Copy link
Copy Markdown
Contributor

📖 Storybook live preview

@kopancek kopancek requested a review from a team May 25, 2023 14:54

@erzhtor erzhtor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes, looks cleaner! Besides minor questions/comments looks good to me!

SELECT
CASE WHEN user_id = 0 THEN anonymous_user_id ELSE CAST(user_id AS TEXT) END AS user_id,
COUNT(DISTINCT DATE(timestamp)) AS days_used
COUNT(DISTINCT DATE(TIMEZONE('UTC', timestamp))) AS days_used

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why do we need this? I thought timestamps are stored in UTC anyway, aren't they?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the default timezone in the database is whatever the system it is running on default is. So in my case it's using EU timezone.

Comment thread internal/adminanalytics/utils.go
Comment thread internal/adminanalytics/utils.go
@@ -163,12 +161,7 @@ func (f *Users) MonthlyActiveUsers(ctx context.Context) ([]*MonthlyActiveUsersRo
prevMonth := now.AddDate(0, -2, 0) // going back 2 months

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous month timeframe calculations can be troubling too.

@0xnmn 0xnmn left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes give the exact numbers as the pings?

@kopancek kopancek merged commit 8d04aa1 into main May 26, 2023
@kopancek kopancek deleted the milan/site_usage_stats branch May 26, 2023 12:17
kopancek added a commit that referenced this pull request May 30, 2023
This is for usage analytics in the product. Previously we were filtering
out the backend events, but as part of #52306 this was removed, because
I believed it shold not be there, as it was not part of the BuildCommonUsageConds
in the event_logs. However event_logs also add this condition directly
in the query string, which I did not notice.
kopancek added a commit that referenced this pull request May 30, 2023
Fixes MAU calculation in usage analytics in the product. 

Previously we were filtering out the backend events, but as part of
#52306 this was removed, because I believed it shold not be there, as it
was not part of the BuildCommonUsageConds in the event_logs. However
event_logs also add this condition directly in the query string, which I
did not notice.

Also, our timestamp calculation suffered from an edge case, where the
from time was not calculated right, e.g. on dates like
`2023-04-30T23:59:00Z`.

## Test plan

Tested locally and with customer data + added unit tests.
kopancek added a commit that referenced this pull request May 31, 2023
…52306)

Previously we used a different condition to calculate the MAU and other
stats in user analytics page. Now we use the same condition as we use
for pings. We also calculate everything using UTC now, so both should
give the same value.

## Test plan

Tested locally and also with data from Tinder. The new query for MAU
seems to be returning the correct data for us, I manually
cross-referenced the old query vs the new. Apart from the timezone
difference, the new query is more precise as in it filters the
sourcegraph-operator users and non-user events in the same way as we do
for the pings.

(cherry picked from commit 8d04aa1)
coury-clark pushed a commit that referenced this pull request May 31, 2023
… pings for mau (#52608)

Previously we used a different condition to calculate the MAU and other
stats in user analytics page. Now we use the same condition as we use
for pings. We also calculate everything using UTC now, so both should
give the same value.

## Test plan

Tested locally and also with data from Tinder. The new query for MAU
seems to be returning the correct data for us, I manually
cross-referenced the old query vs the new. Apart from the timezone
difference, the new query is more precise as in it filters the
sourcegraph-operator users and non-user events in the same way as we do
for the pings.
 <br> Backport 8d04aa1 from #52306

Co-authored-by: Milan Freml <kopancek@users.noreply.github.com>
ErikaRS pushed a commit that referenced this pull request Jun 22, 2023
…52306)

Previously we used a different condition to calculate the MAU and other
stats in user analytics page. Now we use the same condition as we use
for pings. We also calculate everything using UTC now, so both should
give the same value.

## Test plan

Tested locally and also with data from Tinder. The new query for MAU
seems to be returning the correct data for us, I manually
cross-referenced the old query vs the new. Apart from the timezone
difference, the new query is more precise as in it filters the
sourcegraph-operator users and non-user events in the same way as we do
for the pings.
ErikaRS pushed a commit that referenced this pull request Jun 22, 2023
Fixes MAU calculation in usage analytics in the product. 

Previously we were filtering out the backend events, but as part of
#52306 this was removed, because I believed it shold not be there, as it
was not part of the BuildCommonUsageConds in the event_logs. However
event_logs also add this condition directly in the query string, which I
did not notice.

Also, our timestamp calculation suffered from an edge case, where the
from time was not calculated right, e.g. on dates like
`2023-04-30T23:59:00Z`.

## Test plan

Tested locally and with customer data + added unit tests.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants