Skip to content

fix: correct session duration when filtering by channel#963

Merged
goldflag merged 2 commits intorybbit-io:masterfrom
seojcarlos:fix/session-duration-channel-filter
Apr 4, 2026
Merged

fix: correct session duration when filtering by channel#963
goldflag merged 2 commits intorybbit-io:masterfrom
seojcarlos:fix/session-duration-channel-filter

Conversation

@seojcarlos
Copy link
Copy Markdown
Contributor

@seojcarlos seojcarlos commented Apr 3, 2026

Summary

Fix incorrect session duration calculation when filtering analytics by channel (e.g., Organic Search, Direct).

Problem

When a user filters the overview dashboard by channel, the session duration shows values ~14x lower than expected (e.g., 21 seconds instead of ~5 minutes).

Root cause: The channel field is typically only populated on the first event of a session (the entry pageview that carries the external referrer). When the channel filter is applied at the event level (WHERE channel = 'Organic Search'), only the first event of each session matches. This causes MAX(timestamp) - MIN(timestamp) to be near zero, since only one event per session is included.

Fix

Add "channel" to DEFAULT_SESSION_LEVEL_PARAMS in getFilterStatement.ts. This changes the filter from a row-level WHERE clause to a session-level subquery:

-- Before (broken): only first event matches
AND channel = 'Organic Search'

-- After (fixed): finds sessions, then includes ALL their events
AND session_id IN (
  SELECT DISTINCT session_id FROM events
  WHERE channel = 'Organic Search'
)

This is the same pattern already used for event_name filtering.

Impact

  • Fixes session duration, bounce rate, and pages per session metrics when any channel filter is applied
  • Affects getOverview, getOverviewBucketed, and any other query using getFilterStatement with channel filters
  • No impact on unfiltered queries

Fixes #959

Summary by CodeRabbit

  • Bug Fixes
    • Corrected processing of channel filters so they apply at the session level, ensuring channel constraints match entire sessions rather than individual events.
    • Improves accuracy of analytics queries and session-level aggregations, yielding more consistent filtering and reporting in session-based views.

The channel field is typically only set on the first event of a session
(the entry pageview with the external referrer). When filtering events
by channel at the row level, only the first event matches, causing
MAX(timestamp) - MIN(timestamp) to be near zero.

Add "channel" to DEFAULT_SESSION_LEVEL_PARAMS so that the filter uses a
session-level subquery (session_id IN (SELECT DISTINCT session_id ...))
instead of a row-level WHERE clause. This ensures all events from
matching sessions are included in duration calculations.

Fixes rybbit-io#959
@vercel
Copy link
Copy Markdown

vercel bot commented Apr 3, 2026

@seojcarlos is attempting to deploy a commit to the goldflag's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 3, 2026

📝 Walkthrough

Walkthrough

Added channel to session-level filter parameters so channel-based filters are now treated and compiled as session-level subqueries in ClickHouse queries.

Changes

Cohort / File(s) Summary
Session-level filter config
server/src/api/analytics/utils/getFilterStatement.ts
Added "channel" to DEFAULT_SESSION_LEVEL_PARAMS, making channel evaluated in the sessionLevelParams.includes(...) branch.
Sessions API usage
server/src/api/analytics/sessions/getSessions.ts
Extended the sessionLevelParams passed to getFilterStatement to include "channel", causing session-scoped channel filtering in generated queries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Query optimization #950 — Modifies analytics filtering logic in the same files and changes how filters are classified/applied (session-level vs event-level).
  • better filters #817 — Previously added configurable sessionLevelParams/fieldMappings; directly related to session-level filtering changes.

Poem

🐰 In tunnels of SQL I nibbled a thread,
Found "channel" asleep where events once led.
Now sessions hold it, snug in their nest,
Metrics wake clearer — a hop toward the best! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: correct session duration when filtering by channel' accurately summarizes the main change: fixing session duration calculation when channel filters are applied.
Linked Issues check ✅ Passed The code changes add 'channel' to session-level params in both getFilterStatement.ts and getSessions.ts, directly addressing the root cause identified in #959 (channel filters applying only to first events instead of full sessions).
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing session duration calculation when filtering by channel, with no unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/src/api/analytics/utils/getFilterStatement.ts (1)

102-121: ⚠️ Potential issue | 🔴 Critical

Fix regex filter handling in buildSessionLevelSubquery.

filterTypeToOperator() returns null for regex/not_regex, but buildSessionLevelSubquery uses this value directly without checking (line 111). When regex filters are applied to session-level parameters (event_name or channel), the generated SQL becomes malformed: channel null 'pattern'.

The session-level check (line 132) executes before the regex handling logic (line 240), so regex filters never reach the proper match() function implementation. This is a pre-existing bug for event_name that now extends to channel after it was added to DEFAULT_SESSION_LEVEL_PARAMS.

Proposed fix to handle regex in session-level subqueries
   const buildSessionLevelSubquery = (
     param: FilterParameter,
     filterType: FilterType,
     values: (string | number)[],
     wildcardPrefix: string
   ): string => {
+    // Handle regex filters separately using match() function
+    if (filterType === "regex" || filterType === "not_regex") {
+      const pattern = String(values[0] ?? "");
+      const whereClause = [siteIdFilter, timeFilter].filter(Boolean).join(" AND ");
+      const matchExpr = `match(${param}, ${SqlString.escape(pattern)})`;
+      const condition = filterType === "regex" ? matchExpr : `NOT ${matchExpr}`;
+      const finalWhere = whereClause ? `WHERE ${whereClause} AND ${condition}` : `WHERE ${condition}`;
+      return `session_id IN (
+              SELECT DISTINCT session_id
+              FROM events
+              ${finalWhere}
+            )`;
+    }
+
     const whereClause = [siteIdFilter, timeFilter].filter(Boolean).join(" AND ");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/api/analytics/utils/getFilterStatement.ts` around lines 102 - 121,
buildSessionLevelSubquery currently uses filterTypeToOperator() directly which
returns null for regex/not_regex causing malformed SQL (e.g., "channel null
'pattern'"); change it to detect regex/not_regex and delegate to the existing
match(...) helper instead of using filterTypeToOperator() for those cases.
Specifically, inside buildSessionLevelSubquery, when filterType is "regex" or
"not_regex" build the condition by calling match(param, value, filterType ===
"not_regex") for each value (and OR-join multiple values) rather than using
filterTypeToOperator, so session-level params like event_name and channel (from
DEFAULT_SESSION_LEVEL_PARAMS) produce correct SQL fragments handled the same way
as other regex logic elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/src/api/analytics/utils/getFilterStatement.ts`:
- Line 16: getSessions.ts overrides sessionLevelParams with
["event_name","pathname","page_title"], which excludes "channel" and causes
inconsistent filtering vs DEFAULT_SESSION_LEVEL_PARAMS; update the custom
sessionLevelParams in getSessions.ts (the variable or argument named
sessionLevelParams used in the getSessions handler/function) to include
"channel" (e.g., add "channel" to the array) so session-level channel filtering
matches other endpoints, or alternatively add a clear comment in getSessions.ts
explaining why channel must be excluded if that behavior is intentional.

---

Outside diff comments:
In `@server/src/api/analytics/utils/getFilterStatement.ts`:
- Around line 102-121: buildSessionLevelSubquery currently uses
filterTypeToOperator() directly which returns null for regex/not_regex causing
malformed SQL (e.g., "channel null 'pattern'"); change it to detect
regex/not_regex and delegate to the existing match(...) helper instead of using
filterTypeToOperator() for those cases. Specifically, inside
buildSessionLevelSubquery, when filterType is "regex" or "not_regex" build the
condition by calling match(param, value, filterType === "not_regex") for each
value (and OR-join multiple values) rather than using filterTypeToOperator, so
session-level params like event_name and channel (from
DEFAULT_SESSION_LEVEL_PARAMS) produce correct SQL fragments handled the same way
as other regex logic elsewhere.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e79e5d95-c2be-4c66-a9d3-bd7059d87520

📥 Commits

Reviewing files that changed from the base of the PR and between 9a80805 and 56f0ffc.

📒 Files selected for processing (1)
  • server/src/api/analytics/utils/getFilterStatement.ts

getSessions.ts overrides DEFAULT_SESSION_LEVEL_PARAMS with a custom
list that was missing channel. This caused inconsistent filtering
behavior between the overview and sessions endpoints.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
server/src/api/analytics/sessions/getSessions.ts (1)

98-104: Consider updating the comment to reflect all session-level params.

The comment only mentions "pathname and page_title" but the array now includes event_name, pathname, page_title, and channel. Updating the comment would help future maintainers understand the full scope.

📝 Suggested comment update
   // Use composable filter options:
-  // - sessionLevelParams: pathname and page_title filter at session level (finds sessions that visited a page)
+  // - sessionLevelParams: event_name, pathname, page_title, and channel filter at session level (finds sessions matching these criteria)
   // - fieldMappings: CTE extracts UTM params as separate columns, so we need to map the field names
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/src/api/analytics/sessions/getSessions.ts` around lines 98 - 104, The
inline comment above the getFilterStatement call is outdated—update it to list
all session-level params currently passed (event_name, pathname, page_title,
channel) and mention that SESSION_FIELD_MAPPINGS is used for field mappings;
specifically edit the comment that precedes the getFilterStatement invocation
(and references sessionLevelParams and SESSION_FIELD_MAPPINGS) so it accurately
reflects the full array ["event_name","pathname","page_title","channel"] and the
behavior (filters applied at session level).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@server/src/api/analytics/sessions/getSessions.ts`:
- Around line 98-104: The inline comment above the getFilterStatement call is
outdated—update it to list all session-level params currently passed
(event_name, pathname, page_title, channel) and mention that
SESSION_FIELD_MAPPINGS is used for field mappings; specifically edit the comment
that precedes the getFilterStatement invocation (and references
sessionLevelParams and SESSION_FIELD_MAPPINGS) so it accurately reflects the
full array ["event_name","pathname","page_title","channel"] and the behavior
(filters applied at session level).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5b69ef1a-a142-4d13-a46d-7097a59b1232

📥 Commits

Reviewing files that changed from the base of the PR and between 56f0ffc and d183956.

📒 Files selected for processing (1)
  • server/src/api/analytics/sessions/getSessions.ts

@goldflag goldflag merged commit 7c40bb8 into rybbit-io:master Apr 4, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Session duration in channel filter view is inconsistent with user detail view

2 participants