Skip to content

fix tracking events#589

Merged
goldflag merged 6 commits intomasterfrom
debug
Sep 15, 2025
Merged

fix tracking events#589
goldflag merged 6 commits intomasterfrom
debug

Conversation

@goldflag
Copy link
Copy Markdown
Collaborator

@goldflag goldflag commented Sep 15, 2025

Summary by CodeRabbit

  • New Features
    • Config-driven bot blocking and IP exclusion for more accurate filtering.
    • Clear 404 when a site is not found; site config lookup used for siteId.
  • Bug Fixes
    • Prevented invalid site ID issues; monthly usage limits enforced consistently.
    • API now returns actual error payloads for failures.
  • Refactor
    • Removed origin validation; validation now driven by site configuration.
    • Standardized numeric siteId and updated userId/session handling accordingly.
  • Tests
    • Origin validation test suite removed to match new flow.

- Updated SessionReplayIngestService, SessionsService, and trackEvent to handle site_id as a number instead of a string.
- Enhanced session management and event tracking by ensuring consistent numeric site ID usage.
- Improved site configuration retrieval and validation for better error handling and logging.
- Removed the `validateOrigin` function and its related logic from `recordSessionReplay` and `trackEvent` services, as origin validation is currently disabled.
- Cleaned up the `requestValidation` tests by removing tests related to origin validation, enhancing code clarity and maintainability.
- Updated import statements to reflect the removal of unused functions, ensuring a more streamlined codebase.
- Updated the `recordEvents` method to accept either a numeric or string site ID, improving flexibility in site configuration retrieval.
- Integrated site configuration fetching to ensure valid site IDs are used, enhancing error handling for non-existent sites.
- Streamlined user ID generation by ensuring consistent handling of site IDs across services.
- Updated `SessionReplayIngestService`, `SessionsService`, and `trackEvent` to standardize the use of `siteId` instead of `site_id`, enhancing code clarity and maintainability.
- Improved parameter destructuring in `updateSession` method for better readability.
- Ensured consistent handling of site IDs across services to streamline session management and event tracking.
- Updated `recordSessionReplay` to fetch and validate site configuration using the numeric site ID, improving error handling for non-existent sites.
- Refactored `recordEvents` method in `SessionReplayIngestService` to accept a numeric site ID directly, streamlining site ID management.
- Improved error logging by returning the error object instead of a generic message, enhancing debugging capabilities.
@vercel
Copy link
Copy Markdown

vercel bot commented Sep 15, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rybbit Ready Ready Preview Comment Sep 15, 2025 5:06am

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Sep 15, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Retrieve site configuration to obtain a numeric siteId and drive bot, monthly-limit, and IP-exclusion checks from that config; remove origin validation and related tests; propagate siteId:number through tracker, payload creation, userId generation, sessions, and replay ingestion; adjust error responses minimally.

Changes

Cohort / File(s) Summary
Site config & numeric siteId propagation
server/src/services/tracker/trackEvent.ts, server/src/services/tracker/utils.ts, server/src/services/tracker/types.ts, server/src/services/userId/userIdService.ts, server/src/services/sessions/sessionsService.ts, server/src/services/replay/sessionReplayIngestService.ts
Replace origin checks with siteConfig.getSiteConfig(...) lookups; propagate numeric siteId everywhere (TrackingPayload.site_id: number); createBasePayload gains a numeric siteId param; generateUserId now requires numeric siteId; sessionsService signatures changed to accept siteId: number and updateSession uses { userId, siteId }; replay ingest and tracker code updated to pass numeric siteId.
Endpoint control flow and error handling
server/src/api/sessionReplay/recordSessionReplay.ts
Obtain site config asynchronously and fail early if missing; continue monthly usage, API key, and IP-exclusion checks using numeric siteId; removed origin validation branch; catch now returns { error } payload.
Origin validation removal & tests
server/src/services/shared/requestValidation.ts, server/src/services/shared/requestValidation.test.ts
Removed exported validateOrigin and all origin-related logic; deleted origin-focused tests and normalizeOrigin mocks; retained API key validation and rate-limit tests.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant API as trackEvent (API)
  participant SC as siteConfig
  participant Val as Validation
  participant Sess as sessionsService
  participant Q as Queue/Storage

  C->>API: POST /track (payload)
  API->>SC: getSiteConfig(site)
  SC-->>API: { siteId:number, blockBots, excludedIPs, monthlyLimit... } or null
  alt Site not found
    API-->>C: 404
  else Site found
    API->>Val: bot check / monthly limit / IP exclusion
    alt Blocked or over limit
      API-->>C: 204/429
    else Proceed
      API->>API: createBasePayload(..., numericSiteId)
      API->>Sess: updateSession({ userId, siteId:number })
      Sess-->>API: { sessionId }
      API->>Q: enqueue tracking events
      Q-->>API: ack
      API-->>C: 200
    end
  end
Loading
sequenceDiagram
  autonumber
  participant C as Client
  participant API as recordSessionReplay (API)
  participant SC as siteConfig
  participant Val as Validation
  participant SR as sessionReplayIngestService

  C->>API: POST /session-replay/:site
  API->>SC: getSiteConfig(params.site)
  SC-->>API: { siteId:number } or null
  alt Site not found
    API-->>C: 404 "Site not found"
  else Site found
    API->>Val: monthly usage / API key / IP exclusion
    alt Validation fails
      API-->>C: 403/429
    else
      API->>SR: ingest(..., siteId:number)
      SR-->>API: ok
      API-->>C: 200
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • replace monotomic siteId with random id #588 — Overlapping changes to async siteConfig lookups, numeric siteId handling, and removal of ensureInitialized/siteId string usage.
  • Exclude ips #527 — Overlaps on IP exclusion and siteConfig-driven request short-circuit logic in tracking/replay flows.
  • Track outbound events #536 — Modifies createBasePayload alongside event-type handling; intersects with the signature change to accept numeric siteId.

Poem

A rabbit taps the keys with cheer,
No more Origin checks — the path is clear.
Numeric burrows guide each hop,
Sessions hop, events don’t stop.
Carrots queued, the logs align — thump-thump, fine code here!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "fix tracking events" is directly related to the changeset because the PR primarily updates tracking flows (trackEvent, createBasePayload, TrackingPayload.site_id), session handling, and siteId retrieval to correct tracking behavior; however it is generic and does not describe the primary technical changes such as migrating site_id from string to number and removing origin validation.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc9cbb1 and 784295c.

📒 Files selected for processing (3)
  • server/src/services/replay/sessionReplayIngestService.ts (2 hunks)
  • server/src/services/tracker/utils.ts (1 hunks)
  • server/src/services/userId/userIdService.ts (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

- Updated `generateUserId` method in `UserIdService` to require a numeric site ID, enhancing type safety.
- Simplified calls to `generateUserId` in `SessionReplayIngestService` and `createBasePayload` by removing unnecessary string conversion for site ID.
- Improved consistency in user ID generation across services, aligning with recent site ID handling updates.
@goldflag goldflag merged commit 9a970ff into master Sep 15, 2025
5 of 6 checks passed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/api/sessionReplay/recordSessionReplay.ts (1)

53-63: Require API key for non-public sites; keep public sites open.

Without origin checks, enforce auth for private sites to prevent spoofed ingestion.

     // First check if API key is provided and valid
     const apiKeyValidation = await validateApiKey(siteId, body.apiKey);

     // If API key validation failed with an error, reject the request
     if (apiKeyValidation.error) {
       logger.warn(`[SessionReplay] Request rejected for site ${siteId}: ${apiKeyValidation.error}`);
       return reply.status(403).send({
         success: false,
         error: apiKeyValidation.error,
       });
     }
+
+    // No API key provided or not validated: only allow if site is public
+    if (!apiKeyValidation.success && !body.apiKey) {
+      if (!site.public) {
+        return reply.status(403).send({
+          success: false,
+          error: "API key required for this site",
+        });
+      }
+    }
server/src/services/sessions/sessionsService.ts (1)

1-6: Import crypto explicitly; avoid runtime ReferenceError

crypto.randomUUID() isn’t guaranteed to exist on the global in all Node runtimes. Import from node:crypto and call randomUUID() to be safe.

 import { and, eq, lt } from "drizzle-orm";
 import * as cron from "node-cron";
+import { randomUUID } from "node:crypto";
 import { db } from "../../db/postgres/postgres.js";
 import { activeSessions } from "../../db/postgres/schema.js";
 import { createServiceLogger } from "../../lib/logger/logger.js";
@@
-    const insertData = {
-      sessionId: crypto.randomUUID(),
+    const insertData = {
+      sessionId: randomUUID(),
       siteId,
       userId,

Also applies to: 52-55

🧹 Nitpick comments (9)
server/src/services/tracker/types.ts (1)

2-2: Naming consistency: keep snake_case only if it matches external schema.

Guidelines prefer camelCase. If this mirrors CH column names or external wire format, keep as-is; otherwise consider siteId for internal TS types.

server/src/services/tracker/utils.ts (2)

101-107: Ensure numeric site_id overrides any client-provided value.

Spread order correctly overwrites validatedBody.site_id. Consider explicitly omitting site_id from the spread to prevent accidental future reordering.

-  return {
-    ...validatedBody,
-    site_id: numericSiteId,
+  const { site_id: _ignored, ...rest } = validatedBody as any;
+  return {
+    ...rest,
+    site_id: numericSiteId,

Also applies to: 103-103


39-40: Prefer centralized logger over console.error for parsing failures.

Use the server logger for structured logs and consistent sinks.

-  } catch (e) {
-    console.error("Error parsing query string:", e);
+  } catch (e) {
+    request.log?.warn({ err: e }, "Error parsing query string");
server/src/api/sessionReplay/recordSessionReplay.ts (1)

46-49: Minor: avoid unnecessary Number() conversion.

siteId is already number; use it directly.

-    if (usageService.isSiteOverLimit(Number(siteId))) {
+    if (usageService.isSiteOverLimit(siteId)) {
server/src/services/shared/requestValidation.test.ts (2)

28-30: Remove stale mock for normalizeOrigin.

Origin validation is gone; this mock is unused and can confuse readers.

-vi.mock("../../utils.js", () => ({
-  normalizeOrigin: vi.fn(),
-}));

144-156: Error-path coverage is good; consider adding test for private-site gating (route-level).

Once gating is added in recordSessionReplay, add an integration/unit test asserting 403 on private sites without API key.

I can draft a test in server/src/api/sessionReplay/recordSessionReplay.test.ts that:

  • Mocks siteConfig.getSiteConfig to { public: false, siteId: 1 }
  • Sends request without apiKey
  • Expects 403 with "API key required for this site"
server/src/services/replay/sessionReplayIngestService.ts (1)

9-9: Remove unused siteConfig import.

Dead import; keep file clean.

-import { siteConfig } from "../../lib/siteConfig.js";
server/src/services/tracker/trackEvent.ts (2)

305-310: Optional: dedupe queue items to curb bursty duplicates

If idempotency is desirable, consider a deterministic jobId or a short TTL dedupe key.


280-289: Use the same IP helper (getIpAddress); siteConfig.isIPExcluded accepts string|number

Replace request.ip with getIpAddress(request) to avoid proxy/header mismatches (other callers like sessionReplay already use it). siteConfig.isIPExcluded(ip, siteIdOrId?: string|number) — passing validatedPayload.site_id is fine.

-import { createBasePayload } from "./utils.js";
+import { createBasePayload, getIpAddress } from "./utils.js";
@@
-    const requestIP = validatedPayload.ip_address || request.ip || "";
+    const requestIP = validatedPayload.ip_address || getIpAddress(request);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f99b9e and dc9cbb1.

📒 Files selected for processing (8)
  • server/src/api/sessionReplay/recordSessionReplay.ts (2 hunks)
  • server/src/services/replay/sessionReplayIngestService.ts (2 hunks)
  • server/src/services/sessions/sessionsService.ts (2 hunks)
  • server/src/services/shared/requestValidation.test.ts (6 hunks)
  • server/src/services/shared/requestValidation.ts (0 hunks)
  • server/src/services/tracker/trackEvent.ts (4 hunks)
  • server/src/services/tracker/types.ts (1 hunks)
  • server/src/services/tracker/utils.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • server/src/services/shared/requestValidation.ts
🧰 Additional context used
📓 Path-based instructions (2)
{client,server}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

{client,server}/**/*.{ts,tsx}: Use TypeScript with strict typing throughout both client and server
Use try/catch blocks with specific error types for error handling
Use camelCase for variables and functions, PascalCase for components and types
Group imports by external, then internal, and sort alphabetically within groups

Files:

  • server/src/api/sessionReplay/recordSessionReplay.ts
  • server/src/services/tracker/types.ts
  • server/src/services/tracker/utils.ts
  • server/src/services/sessions/sessionsService.ts
  • server/src/services/tracker/trackEvent.ts
  • server/src/services/replay/sessionReplayIngestService.ts
  • server/src/services/shared/requestValidation.test.ts
server/**/*

📄 CodeRabbit inference engine (CLAUDE.md)

Backend: Use Fastify, Drizzle ORM (Postgres), ClickHouse, and Zod

Files:

  • server/src/api/sessionReplay/recordSessionReplay.ts
  • server/src/services/tracker/types.ts
  • server/src/services/tracker/utils.ts
  • server/src/services/sessions/sessionsService.ts
  • server/src/services/tracker/trackEvent.ts
  • server/src/services/replay/sessionReplayIngestService.ts
  • server/src/services/shared/requestValidation.test.ts
🧬 Code graph analysis (4)
server/src/api/sessionReplay/recordSessionReplay.ts (1)
server/src/lib/siteConfig.ts (1)
  • siteConfig (307-307)
server/src/services/tracker/utils.ts (2)
server/src/utils.ts (1)
  • getIpAddress (205-226)
server/src/services/userId/userIdService.ts (1)
  • userIdService (68-68)
server/src/services/sessions/sessionsService.ts (2)
server/src/db/postgres/postgres.ts (1)
  • db (20-20)
server/src/db/postgres/schema.ts (1)
  • activeSessions (86-92)
server/src/services/tracker/trackEvent.ts (2)
server/src/lib/siteConfig.ts (1)
  • siteConfig (307-307)
server/src/services/tracker/utils.ts (1)
  • createBasePayload (85-118)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build Client Images (ubuntu-24.04-arm, linux/arm64)
  • GitHub Check: Build Client Images (ubuntu-latest, linux/amd64)
🔇 Additional comments (11)
server/src/services/shared/requestValidation.test.ts (1)

62-85: Good coverage for valid key path.

Happy path assertions and console spy cleanup look solid.

server/src/services/replay/sessionReplayIngestService.ts (2)

34-39: UserId generation salt: correct to stringify siteId.

Alignment with other call sites; no further action.


41-44: Session update signature change: LGTM.

Switch to { userId, siteId } with numeric siteId aligns with sessionsService updates.

server/src/services/tracker/types.ts (1)

2-2: Numeric site_id: confirm end-to-end type alignment (DB, ClickHouse, Drizzle, payloads).

Change looks correct; automated grep in the sandbox returned "No files were searched" — verify that ClickHouse column types, Drizzle/ORM models, DB migrations, and any JSON/TypeScript schemas use a numeric site_id (UInt32/UInt64, number, or bigint) and that no interfaces/migrations still use string to avoid implicit casts.

Suggested repo checks (run from project root):

rg -uu -nP 'site_id.*UInt\d+' -g '!**/node_modules/**'
rg -uu -nP '\bsite_id\s*:\s*string\b' --type=ts -g '!**/node_modules/**'
rg -uu -nP '\bsite_id\b' -g '!**/node_modules/**'
server/src/services/tracker/utils.ts (1)

97-100: UserId generation salt update looks correct.

Passing String(numericSiteId) preserves the previous contract (salt as string).

Automated search couldn't verify — ripgrep returned "No files were searched". Re-run and paste results: rg -n "generateUserId" -S -g "!/node_modules/" (or run rg --debug "generateUserId" to see why files are skipped).

server/src/services/sessions/sessionsService.ts (1)

28-33: Signature change to numeric siteId looks correct

Using siteId: number aligns with the Postgres integer("site_id") column.

server/src/services/tracker/trackEvent.ts (5)

7-7: Import usage LGTM

Both validateApiKey and checkApiKeyRateLimit are used below.


258-268: Bot-blocking toggle is correctly honored and bypassed for API-keyed calls

Using isbot with site-level blockBots is appropriate.


270-275: Monthly limit check moved to numeric siteId: good

Early exit before enqueuing reduces load.


295-297: Passing numeric siteId into createBasePayload aligns types end-to-end

Keeps TotalTrackingPayload.site_id numeric downstream.


299-304: Session update call matches new signature

{ userId: payload.userId, siteId: siteConfiguration.siteId } is correct.

Comment on lines +39 to 44
// Get the site configuration to get the numeric siteId
const siteId = (await siteConfig.getSiteConfig(request.params.site))?.siteId;
if (!siteId) {
throw new Error(`Site not found: ${request.params.site}`);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fetch full site config and gate unauthenticated requests; avoid falsey check on 0.

Origin validation removal makes the endpoint accept unauthenticated requests globally. Gate by site.public (or equivalent) when no valid API key. Also avoid !siteId which rejects 0.

-    // Get the site configuration to get the numeric siteId
-    const siteId = (await siteConfig.getSiteConfig(request.params.site))?.siteId;
-    if (!siteId) {
-      throw new Error(`Site not found: ${request.params.site}`);
-    }
+    // Resolve site and numeric siteId
+    const site = await siteConfig.getSiteConfig(request.params.site);
+    if (!site) {
+      throw new Error(`Site not found: ${request.params.site}`);
+    }
+    const siteId = site.siteId;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Get the site configuration to get the numeric siteId
const siteId = (await siteConfig.getSiteConfig(request.params.site))?.siteId;
if (!siteId) {
throw new Error(`Site not found: ${request.params.site}`);
}
// Resolve site and numeric siteId
const site = await siteConfig.getSiteConfig(request.params.site);
if (!site) {
throw new Error(`Site not found: ${request.params.site}`);
}
const siteId = site.siteId;

Comment on lines +111 to 112
return reply.status(500).send({ error });
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Don’t return raw Error to clients.

Leak risk and inconsistent API. Return a generic message and log details.

-    logger.error(error as Error, "Error recording session replay");
-    return reply.status(500).send({ error });
+    logger.error(error as Error, "Error recording session replay");
+    return reply.status(500).send({ error: "Internal server error" });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return reply.status(500).send({ error });
}
logger.error(error as Error, "Error recording session replay");
return reply.status(500).send({ error: "Internal server error" });
}
🤖 Prompt for AI Agents
In server/src/api/sessionReplay/recordSessionReplay.ts around lines 111-112, the
handler currently returns the raw Error object to clients (return
reply.status(500).send({ error })), which leaks internals and creates
inconsistent API responses; change it to log the full error server-side (with
request context or generated error id) and send a fixed, minimal JSON error
response to the client (e.g., { error: "Internal server error" } optionally with
an opaque errorId), keeping HTTP 500, and ensure logs include the original error
and any correlation id for debugging.

Comment on lines +28 to 36
async getExistingSession(userId: string, siteId: number) {
const [existingSession] = await db
.select()
.from(activeSessions)
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteIdNumber)))
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteId)))
.limit(1);

return existingSession || null;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Eliminate race creating duplicate active sessions; make insert/upsert atomic

TOCTOU: two concurrent requests for the same (siteId, userId) can both see “no session” and insert, creating duplicates. Also, getExistingSession(...).limit(1) has no deterministic ordering if duplicates exist.

  • Add a unique constraint on (user_id, site_id) and make siteId/userId non-null.
  • Replace read-then-write with a single INSERT ... ON CONFLICT DO UPDATE returning session_id.

Schema (outside this file):

 // server/src/db/postgres/schema.ts
 export const activeSessions = pgTable("active_sessions", {
   sessionId: text("session_id").primaryKey().notNull(),
-  siteId: integer("site_id"),
-  userId: text("user_id"),
+  siteId: integer("site_id").notNull(),
+  userId: text("user_id").notNull(),
   startTime: timestamp("start_time").defaultNow(),
   lastActivity: timestamp("last_activity").defaultNow(),
-});
+}, (table) => {
+  return {
+    userSiteUnique: uniqueIndex("active_sessions_user_site_unique").on(table.userId, table.siteId),
+  };
+});

Atomic upsert:

-  async updateSession({ userId, siteId }: { userId: string; siteId: number }): Promise<{ sessionId: string }> {
-    const existingSession = await this.getExistingSession(userId, siteId);
-
-    if (existingSession) {
-      await db
-        .update(activeSessions)
-        .set({ lastActivity: new Date() })
-        .where(eq(activeSessions.sessionId, existingSession.sessionId));
-      return { sessionId: existingSession.sessionId };
-    }
-
-    const insertData = {
-      sessionId: randomUUID(),
-      siteId,
-      userId,
-      startTime: new Date(),
-      lastActivity: new Date(),
-    };
-    await db.insert(activeSessions).values(insertData);
-    return { sessionId: insertData.sessionId };
-  }
+  async updateSession({ userId, siteId }: { userId: string; siteId: number }): Promise<{ sessionId: string }> {
+    const now = new Date();
+    const [{ sessionId }] = await db
+      .insert(activeSessions)
+      .values({
+        sessionId: randomUUID(),
+        siteId,
+        userId,
+        startTime: now,
+        lastActivity: now,
+      })
+      .onConflictDoUpdate({
+        target: [activeSessions.userId, activeSessions.siteId],
+        set: { lastActivity: now },
+      })
+      .returning({ sessionId: activeSessions.sessionId });
+    return { sessionId };
+  }

Verification script to confirm no existing uniqueness:

Also applies to: 38-61


🏁 Script executed:

#!/bin/bash
rg -nP -C2 'pgTable\(\s*"active_sessions"|unique(Index)?\(' server/src/db

Length of output: 1299


Eliminate race creating duplicate active sessions; make insert/upsert atomic

active_sessions (server/src/db/postgres/schema.ts) currently has siteId nullable and no unique index on (user_id, site_id) — the read-then-write in server/src/services/sessions/sessionsService.ts (getExistingSession + insert) can create duplicate active sessions under concurrency. Add a unique constraint and NOT NULLs, and replace the read-then-write with a single INSERT ... ON CONFLICT DO UPDATE ... RETURNING session_id. Backfill/fix existing nulls/duplicates before applying schema constraints.

Suggested changes (apply to the files above):

Schema:

 // server/src/db/postgres/schema.ts
 export const activeSessions = pgTable("active_sessions", {
   sessionId: text("session_id").primaryKey().notNull(),
-  siteId: integer("site_id"),
-  userId: text("user_id"),
+  siteId: integer("site_id").notNull(),
+  userId: text("user_id").notNull(),
   startTime: timestamp("start_time").defaultNow(),
   lastActivity: timestamp("last_activity").defaultNow(),
-});
+}, (table) => {
+  return {
+    userSiteUnique: uniqueIndex("active_sessions_user_site_unique").on(table.userId, table.siteId),
+  };
+});

Atomic upsert in sessionsService:

-  async updateSession({ userId, siteId }: { userId: string; siteId: number }): Promise<{ sessionId: string }> {
-    const existingSession = await this.getExistingSession(userId, siteId);
-
-    if (existingSession) {
-      await db
-        .update(activeSessions)
-        .set({ lastActivity: new Date() })
-        .where(eq(activeSessions.sessionId, existingSession.sessionId));
-      return { sessionId: existingSession.sessionId };
-    }
-
-    const insertData = {
-      sessionId: randomUUID(),
-      siteId,
-      userId,
-      startTime: new Date(),
-      lastActivity: new Date(),
-    };
-    await db.insert(activeSessions).values(insertData);
-    return { sessionId: insertData.sessionId };
-  }
+  async updateSession({ userId, siteId }: { userId: string; siteId: number }): Promise<{ sessionId: string }> {
+    const now = new Date();
+    const [{ sessionId }] = await db
+      .insert(activeSessions)
+      .values({
+        sessionId: randomUUID(),
+        siteId,
+        userId,
+        startTime: now,
+        lastActivity: now,
+      })
+      .onConflictDoUpdate({
+        target: [activeSessions.userId, activeSessions.siteId],
+        set: { lastActivity: now },
+      })
+      .returning({ sessionId: activeSessions.sessionId });
+    return { sessionId };
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async getExistingSession(userId: string, siteId: number) {
const [existingSession] = await db
.select()
.from(activeSessions)
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteIdNumber)))
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteId)))
.limit(1);
return existingSession || null;
}
export const activeSessions = pgTable("active_sessions", {
sessionId: text("session_id").primaryKey().notNull(),
siteId: integer("site_id").notNull(),
userId: text("user_id").notNull(),
startTime: timestamp("start_time").defaultNow(),
lastActivity: timestamp("last_activity").defaultNow(),
}, (table) => {
return {
userSiteUnique: uniqueIndex("active_sessions_user_site_unique").on(table.userId, table.siteId),
};
});
Suggested change
async getExistingSession(userId: string, siteId: number) {
const [existingSession] = await db
.select()
.from(activeSessions)
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteIdNumber)))
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteId)))
.limit(1);
return existingSession || null;
}
async getExistingSession(userId: string, siteId: number) {
const [existingSession] = await db
.select()
.from(activeSessions)
.where(and(eq(activeSessions.userId, userId), eq(activeSessions.siteId, siteId)))
.limit(1);
return existingSession || null;
}
async updateSession({ userId, siteId }: { userId: string; siteId: number }): Promise<{ sessionId: string }> {
const now = new Date();
const [{ sessionId }] = await db
.insert(activeSessions)
.values({
sessionId: randomUUID(),
siteId,
userId,
startTime: now,
lastActivity: now,
})
.onConflictDoUpdate({
target: [activeSessions.userId, activeSessions.siteId],
set: { lastActivity: now },
})
.returning({ sessionId: activeSessions.sessionId });
return { sessionId };
}
🤖 Prompt for AI Agents
In server/src/services/sessions/sessionsService.ts around lines 28 to 36, the
current read-then-write flow (getExistingSession + insert) can create duplicate
active sessions under concurrency; replace it with a single atomic upsert:
change the DB schema (server/src/db/postgres/schema.ts) to make siteId NOT NULL
and add a unique constraint/index on (user_id, site_id) after backfilling/fixing
existing nulls and duplicate rows, then modify the service to perform one INSERT
... ON CONFLICT (user_id, site_id) DO UPDATE SET ... RETURNING session_id (or
the appropriate columns) instead of selecting first; ensure the returned row is
used and remove the separate getExistingSession call so session creation/upsert
is atomic.

Comment on lines +246 to +254
// Get the site configuration to get the numeric siteId
const siteConfiguration = await siteConfig.getSiteConfig(validatedPayload.site_id);
if (!siteConfiguration) {
logger.warn({ siteId: validatedPayload.site_id }, "Site not found");
return reply.status(404).send({
success: false,
error: "Site not found",
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Removing origin validation increases spoofing/abuse risk; add compensating controls

Without origin checks, anyone who knows a site_id can POST events unless an API key is required. At minimum, enforce one or more:

  • Per-site toggle to require API key for client events.
  • Strict CORS on the tracking endpoint to only allow configured origins.
  • Signature (HMAC) on payloads for keyless flows.

🏁 Script executed:

#!/bin/bash
# Check for CORS configuration and any origin allowlist
rg -nP -C2 '@fastify/cors|fastifyCors|cors\(' server
rg -nP -C2 'origin:\s*(\[|["'\''])' server
# Confirm if API-key-only mode exists in site config
rg -nP -C2 'require(Api)?Key|requireApiKey' server/src/lib/siteConfig.ts server/src

Length of output: 1716


Reinstate origin validation or add compensating controls (critical)

Removing origin checks lets anyone who knows a site_id POST events. Repo quick-check: @fastify/cors is imported in server/src/index.ts but I found no runtime origin allowlist or per-site requireApiKey flag; tests reference an origin in server/src/analytics-script/tracking.test.ts.

  • Validate request origin server-side in server/src/services/tracker/trackEvent.ts (around the siteConfig.getSiteConfig call — ~lines 246–254): reject when origin not in siteConfiguration.allowed_origins.
  • Add a per-site requireApiKey toggle and enforce x-api-key/Authorization for client events (store/rotate keys in site config).
  • If keyless client flows remain, require and verify an HMAC signature on payloads.
  • Configure @fastify/cors in server/src/index.ts to only allow configured origins (CORS is defense-in-depth, not a substitute for server-side checks).
  • Update tests to cover origin/API-key/HMAC enforcement.
🤖 Prompt for AI Agents
In server/src/services/tracker/trackEvent.ts around lines 246 to 254, reinstate
server-side origin validation and add per-site auth controls: after loading
siteConfiguration, verify the request origin (from headers.origin or referer) is
present in siteConfiguration.allowed_origins and return 403 if not; introduce
and respect a boolean siteConfiguration.requireApiKey — when true, require and
validate an x-api-key or Authorization bearer token against keys stored on the
site configuration (support key rotation); when requireApiKey is false but the
site allows client keyless events, require an HMAC signature header and verify
it against a site secret to authenticate payload integrity; ensure any rejection
uses appropriate 4xx responses and logs. Also configure @fastify/cors in
server/src/index.ts to only allow the configured origins as a defense-in-depth
measure and update/add tests in server/src/analytics-script/tracking.test.ts to
cover origin rejection, API key enforcement, and HMAC verification paths.

Comment on lines 85 to 90
export async function createBasePayload(
request: FastifyRequest,
eventType: "pageview" | "custom_event" | "performance" | "error" | "outbound" = "pageview",
validatedBody: ValidatedTrackingPayload
validatedBody: ValidatedTrackingPayload,
numericSiteId: number
): Promise<TotalTrackingPayload> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Signature change: verify all call sites updated to pass numericSiteId.

API break requires all invocations to add the new parameter.


🏁 Script executed:

#!/bin/bash
# Expect all createBasePayload calls to have 4 args now
rg -nP '\bcreateBasePayload\s*\(' --type=ts -g '!**/node_modules/**' -C2

Length of output: 148


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo ">>> git grep (tracked files) for createBasePayload("
git grep -n --heading --line-number "createBasePayload(" || true

echo
echo ">>> ripgrep (no ignore, limit to .ts/.tsx/.js/.jsx) for createBasePayload("
rg -nP '\bcreateBasePayload\s*\(' -S -uu -g '*.ts' -g '*.tsx' -g '*.js' -g '*.jsx' -C3 || true

echo
echo ">>> locate definition(s) of createBasePayload"
rg -nP 'export\s+async\s+function\s+createBasePayload' -S -uu -g '*.ts' -g '*.tsx' -C5 || true

echo
echo ">>> list files that reference the symbol name (broader search)"
rg -nP '\bcreateBasePayload\b' -S -uu -g '*.ts' -g '*.tsx' -g '*.js' -g '*.jsx' -l || true

Length of output: 43766


Update all createBasePayload call sites to pass numericSiteId

The exported createBasePayload in server/src/services/tracker/utils.ts now requires numericSiteId: number. server/src/services/tracker/trackEvent.ts (around line 292) still calls createBasePayload(request, validatedPayload.type, validatedPayload) — update that call to include a numeric site id (e.g. Number(validatedPayload.site_id) or the resolved site id).

🤖 Prompt for AI Agents
In server/src/services/tracker/trackEvent.ts around line 292, update the call to
createBasePayload to include the new numericSiteId argument: pass a numeric
value (for example Number(validatedPayload.site_id) or the resolved site id
variable) as the fourth parameter. Ensure the value is a number (cast/validate
if necessary) and adjust any surrounding code to handle the returned
Promise<TotalTrackingPayload> as before.

Comment on lines +88 to 90
validatedBody: ValidatedTrackingPayload,
numericSiteId: number
): Promise<TotalTrackingPayload> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid runtime circular dependency on trackEvent schema (type-only import).

ValidatedTrackingPayload depends on trackingPayloadSchema from trackEvent while trackEvent calls createBasePayload → potential ESM cycle. Use a type-only import with inline typeof to erase runtime import.

Apply:

-import { trackingPayloadSchema } from "./trackEvent.js";
+// NOTE: type-only reference to avoid runtime import cycle with trackEvent

And change the alias:

-type ValidatedTrackingPayload = z.infer<typeof trackingPayloadSchema>;
+type ValidatedTrackingPayload = z.infer<typeof import("./trackEvent.js").trackingPayloadSchema>;
🤖 Prompt for AI Agents
In server/src/services/tracker/utils.ts around lines 88-90, avoid importing the
trackEvent runtime (which causes an ESM cycle) by changing the runtime import to
a type-only import and by referring to the trackEvent schema via an inline
typeof import so the schema reference is erased at runtime; specifically replace
any runtime import of ValidatedTrackingPayload/trackEvent with "import type {
ValidatedTrackingPayload } from './trackEvent'" and where you need the schema
use an inline typeof (e.g. type Schema = typeof
import('./trackEvent').trackingPayloadSchema) so no runtime module is loaded,
and update the local alias to the type-only name (ValidatedTrackingPayload)
accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant