fix: correct session duration when filtering by channel#963
Conversation
The channel field is typically only set on the first event of a session (the entry pageview with the external referrer). When filtering events by channel at the row level, only the first event matches, causing MAX(timestamp) - MIN(timestamp) to be near zero. Add "channel" to DEFAULT_SESSION_LEVEL_PARAMS so that the filter uses a session-level subquery (session_id IN (SELECT DISTINCT session_id ...)) instead of a row-level WHERE clause. This ensures all events from matching sessions are included in duration calculations. Fixes rybbit-io#959
|
@seojcarlos is attempting to deploy a commit to the goldflag's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughAdded Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/api/analytics/utils/getFilterStatement.ts (1)
102-121:⚠️ Potential issue | 🔴 CriticalFix regex filter handling in
buildSessionLevelSubquery.
filterTypeToOperator()returnsnullforregex/not_regex, butbuildSessionLevelSubqueryuses this value directly without checking (line 111). When regex filters are applied to session-level parameters (event_nameorchannel), the generated SQL becomes malformed:channel null 'pattern'.The session-level check (line 132) executes before the regex handling logic (line 240), so regex filters never reach the proper
match()function implementation. This is a pre-existing bug forevent_namethat now extends tochannelafter it was added toDEFAULT_SESSION_LEVEL_PARAMS.Proposed fix to handle regex in session-level subqueries
const buildSessionLevelSubquery = ( param: FilterParameter, filterType: FilterType, values: (string | number)[], wildcardPrefix: string ): string => { + // Handle regex filters separately using match() function + if (filterType === "regex" || filterType === "not_regex") { + const pattern = String(values[0] ?? ""); + const whereClause = [siteIdFilter, timeFilter].filter(Boolean).join(" AND "); + const matchExpr = `match(${param}, ${SqlString.escape(pattern)})`; + const condition = filterType === "regex" ? matchExpr : `NOT ${matchExpr}`; + const finalWhere = whereClause ? `WHERE ${whereClause} AND ${condition}` : `WHERE ${condition}`; + return `session_id IN ( + SELECT DISTINCT session_id + FROM events + ${finalWhere} + )`; + } + const whereClause = [siteIdFilter, timeFilter].filter(Boolean).join(" AND ");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/src/api/analytics/utils/getFilterStatement.ts` around lines 102 - 121, buildSessionLevelSubquery currently uses filterTypeToOperator() directly which returns null for regex/not_regex causing malformed SQL (e.g., "channel null 'pattern'"); change it to detect regex/not_regex and delegate to the existing match(...) helper instead of using filterTypeToOperator() for those cases. Specifically, inside buildSessionLevelSubquery, when filterType is "regex" or "not_regex" build the condition by calling match(param, value, filterType === "not_regex") for each value (and OR-join multiple values) rather than using filterTypeToOperator, so session-level params like event_name and channel (from DEFAULT_SESSION_LEVEL_PARAMS) produce correct SQL fragments handled the same way as other regex logic elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/src/api/analytics/utils/getFilterStatement.ts`:
- Line 16: getSessions.ts overrides sessionLevelParams with
["event_name","pathname","page_title"], which excludes "channel" and causes
inconsistent filtering vs DEFAULT_SESSION_LEVEL_PARAMS; update the custom
sessionLevelParams in getSessions.ts (the variable or argument named
sessionLevelParams used in the getSessions handler/function) to include
"channel" (e.g., add "channel" to the array) so session-level channel filtering
matches other endpoints, or alternatively add a clear comment in getSessions.ts
explaining why channel must be excluded if that behavior is intentional.
---
Outside diff comments:
In `@server/src/api/analytics/utils/getFilterStatement.ts`:
- Around line 102-121: buildSessionLevelSubquery currently uses
filterTypeToOperator() directly which returns null for regex/not_regex causing
malformed SQL (e.g., "channel null 'pattern'"); change it to detect
regex/not_regex and delegate to the existing match(...) helper instead of using
filterTypeToOperator() for those cases. Specifically, inside
buildSessionLevelSubquery, when filterType is "regex" or "not_regex" build the
condition by calling match(param, value, filterType === "not_regex") for each
value (and OR-join multiple values) rather than using filterTypeToOperator, so
session-level params like event_name and channel (from
DEFAULT_SESSION_LEVEL_PARAMS) produce correct SQL fragments handled the same way
as other regex logic elsewhere.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e79e5d95-c2be-4c66-a9d3-bd7059d87520
📒 Files selected for processing (1)
server/src/api/analytics/utils/getFilterStatement.ts
getSessions.ts overrides DEFAULT_SESSION_LEVEL_PARAMS with a custom list that was missing channel. This caused inconsistent filtering behavior between the overview and sessions endpoints. Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
server/src/api/analytics/sessions/getSessions.ts (1)
98-104: Consider updating the comment to reflect all session-level params.The comment only mentions "pathname and page_title" but the array now includes
event_name,pathname,page_title, andchannel. Updating the comment would help future maintainers understand the full scope.📝 Suggested comment update
// Use composable filter options: - // - sessionLevelParams: pathname and page_title filter at session level (finds sessions that visited a page) + // - sessionLevelParams: event_name, pathname, page_title, and channel filter at session level (finds sessions matching these criteria) // - fieldMappings: CTE extracts UTM params as separate columns, so we need to map the field names🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/src/api/analytics/sessions/getSessions.ts` around lines 98 - 104, The inline comment above the getFilterStatement call is outdated—update it to list all session-level params currently passed (event_name, pathname, page_title, channel) and mention that SESSION_FIELD_MAPPINGS is used for field mappings; specifically edit the comment that precedes the getFilterStatement invocation (and references sessionLevelParams and SESSION_FIELD_MAPPINGS) so it accurately reflects the full array ["event_name","pathname","page_title","channel"] and the behavior (filters applied at session level).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@server/src/api/analytics/sessions/getSessions.ts`:
- Around line 98-104: The inline comment above the getFilterStatement call is
outdated—update it to list all session-level params currently passed
(event_name, pathname, page_title, channel) and mention that
SESSION_FIELD_MAPPINGS is used for field mappings; specifically edit the comment
that precedes the getFilterStatement invocation (and references
sessionLevelParams and SESSION_FIELD_MAPPINGS) so it accurately reflects the
full array ["event_name","pathname","page_title","channel"] and the behavior
(filters applied at session level).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5b69ef1a-a142-4d13-a46d-7097a59b1232
📒 Files selected for processing (1)
server/src/api/analytics/sessions/getSessions.ts
Summary
Fix incorrect session duration calculation when filtering analytics by channel (e.g., Organic Search, Direct).
Problem
When a user filters the overview dashboard by channel, the session duration shows values ~14x lower than expected (e.g., 21 seconds instead of ~5 minutes).
Root cause: The
channelfield is typically only populated on the first event of a session (the entry pageview that carries the external referrer). When the channel filter is applied at the event level (WHERE channel = 'Organic Search'), only the first event of each session matches. This causesMAX(timestamp) - MIN(timestamp)to be near zero, since only one event per session is included.Fix
Add
"channel"toDEFAULT_SESSION_LEVEL_PARAMSingetFilterStatement.ts. This changes the filter from a row-levelWHEREclause to a session-level subquery:This is the same pattern already used for
event_namefiltering.Impact
getOverview,getOverviewBucketed, and any other query usinggetFilterStatementwith channel filtersFixes #959
Summary by CodeRabbit