
Log patterns: Log patterns related code #300

Merged: tonyalaribe merged 77 commits into master from log-patterns-chp, Feb 26, 2026.

Conversation


@dawkaka (Contributor) commented Jan 22, 2026

Saving and presenting log patterns in the UI, plus log pattern anomaly detection:

  • Unusual increase in log patterns within a time window
  • New detected log patterns

Closes #

How to test




-- | Get a pattern by ID
getLogPatternById :: DB es => LogPatternId -> Eff es (Maybe LogPattern)
Contributor:

Use the _selectWhere pattern instead of enumerating the fields one by one.
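Roughly what that could look like. `_selectWhere`'s exact signature is a project detail this review doesn't show, and `queryOneWith` is a hypothetical helper name, so this is only a sketch:

```haskell
-- Hypothetical sketch: assumes a `_selectWhere`-style helper that derives
-- the column list from the LogPattern type instead of hand-writing it,
-- and a `queryOneWith` runner. Both names are assumptions.
getLogPatternById :: DB es => LogPatternId -> Eff es (Maybe LogPattern)
getLogPatternById lpId = queryOneWith (_selectWhere @LogPattern ["id"]) (Only lpId)
```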

Comment on lines +250 to 264
Issues.LogPattern ->
"Describe this log pattern issue and its implications.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Issues.LogPatternRateChange ->
"Describe this log pattern rate change and its implications.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service

Contributor:

Please use the [text|] quasiquote (or any other), so it's easier to visually see the shape of this message without the Haskell semigroup noise.
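Roughly what the reviewer is asking for, assuming the same [text|] quasiquoter used later in this PR (NeatInterpolation-style, which interpolates bound variables rather than arbitrary expressions):

```haskell
-- Sketch only: bind the pieces first, since [text|] interpolates variables.
Issues.LogPattern ->
  let title = issue.title
      service = fromMaybe "unknown-service" issue.service
   in [text|
        Describe this log pattern issue and its implications.
        Title: $title
        Service: $service
      |]
```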


Comment on lines +288 to +307
SELECT
lp.id,
lp.project_id,
lp.log_pattern,
lp.pattern_hash,
lp.baseline_state,
lp.baseline_volume_hourly_mean,
lp.baseline_volume_hourly_stddev,
COALESCE(counts.current_count, 0)::INT AS current_hour_count
FROM apis.log_patterns lp
LEFT JOIN (
SELECT log_pattern, COUNT(*) AS current_count
FROM otel_logs_and_spans
WHERE project_id = ?
AND timestamp >= date_trunc('hour', NOW())
AND log_pattern IS NOT NULL
GROUP BY log_pattern
) counts ON counts.log_pattern = lp.log_pattern
WHERE lp.project_id = ?
AND lp.state != 'ignored' AND lp.baseline_state = 'established'
Contributor:

How is log_pattern supposed to join on otel_logs_and_spans when they're not in the same database?

Or is log_patterns supposed to be a timeseries table in timefusion as well? If that's the case, then you can't make queries on timeseries tables that don't depend on a timestamp range.

Contributor:

Also, you never join two timeseries tables. Some databases might support the join operation, but the performance is always horrible in that case.

Comment on lines +573 to +584
[text|
SELECT lp.log_pattern, count(*) as p_count
FROM apis.log_patterns lp
INNER JOIN otel_logs_and_spans ols
ON lp.log_pattern = ols.log_pattern AND lp.project_id::text = ols.project_id
WHERE lp.project_id = ?
AND lp.state != 'ignored'
AND ${whereCondition}
GROUP BY lp.log_pattern
ORDER BY p_count DESC
OFFSET ? LIMIT 15
|]
Contributor:

Same as above: how is log_pattern joining on otel_logs_and_spans?

Contributor (author):

otel_logs_and_spans has a log_pattern column

Contributor:

Is log_pattern going to be in timefusion or postgres? If it's in timefusion, you can't do joins, and you can't query it without a time range being part of the query.


-- | Get pattern stats from otel_logs_and_spans
-- Returns median and MAD (Median Absolute Deviation) for robust baseline calculation
getPatternStats :: DB es => Projects.ProjectId -> Text -> Int -> Eff es (Maybe PatternStats)
Contributor:

Why are you not using our widgets, or at least KQL, for stats and numbers? Isn't this for display?

Contributor (author):

This is for detecting spikes


-- | Calculate baselines for log patterns
-- Uses hourly counts from otel_logs_and_spans over the last 7 days
calculateLogPatternBaselines :: Projects.ProjectId -> ATBackgroundCtx ()
Contributor:

I don't want to support magic alerts. If we're detecting spikes, we should implement an alert system that our users can enable on any metric as well, so it's the same code and logic for all cases, not magic logic we run in the background.


[text|
SELECT id, created_at, updated_at, project_id, issue_type::text, endpoint_hash, acknowledged_at, acknowledged_by, archived_at, title, service, critical,
CASE WHEN critical THEN 'critical' ELSE 'info' END, affected_requests, affected_clients, NULL::double precision,
CASE WHEN critical THEN 'critical' ELSE 'info' END, 0::int, 0::int, NULL::double precision,
Contributor:

Why did you set these to 0?

Contributor (author):

Those are not part of the new issues table.

Contributor:

Why are we adding it to the query?

Comment on lines +149 to +162
Issues.LogPattern ->
"Generate a concise title for this log pattern issue.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Issues.LogPatternRateChange ->
"Generate a concise title for this log pattern rate change.\n"
<> "Title: "
<> issue.title
<> "\n"
<> "Service: "
<> fromMaybe "unknown-service" issue.service
Contributor:

Use the text quasiquotes here.

@@ -0,0 +1,68 @@
BEGIN;

CREATE TABLE IF NOT EXISTS apis.log_patterns (
Contributor:

We will support patterns on different fields, not just the default body/message field. This table doesn't seem aware of that expectation.

CREATE INDEX IF NOT EXISTS idx_log_patterns_last_seen ON apis.log_patterns(project_id, last_seen_at DESC);
CREATE INDEX IF NOT EXISTS idx_log_patterns_service ON apis.log_patterns(project_id, service_name);

CREATE OR REPLACE FUNCTION apis.new_log_pattern_proc() RETURNS trigger AS $$
Contributor:

Again, the question remains: is this a timeseries table or a regular table? If it's a timeseries table (to be on timefusion), then it won't be in the same database as the background jobs, and hence would be unable to queue jobs from within the DB.

Contributor (author):

It's a regular table.


@tonyalaribe (Contributor):

Rename patterns function from 15mins to 5mins.


…rom master

Restore the original 0030_replay_event_file_count and 0031_slack_bot_token
migrations from master and move log patterns changes into a new 0032 migration.
@claude bot commented Feb 25, 2026

PR Review: Log Patterns

Good overall design — detectSpikeOrDrop being pure and exported for unit testing is excellent, and the coverage across unit, integration, and bench specs is solid. A few concerns worth addressing:


Potential Bug: parseIssueType drops the "shape" alias

The old code had an explicit alias:

parseIssueType "shape" = Just APIChange -- Handle DB anomaly_type

The new implementation (rightToMaybe . parseUrlPiece) will return Nothing for "shape". If any code path (filters, URL params, existing rows) still passes "shape" here, it silently drops the issue type. Worth verifying no callers rely on this alias before merging.
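If the alias does need to survive, a minimal sketch (reusing only the names quoted above; their exact signatures in the codebase are assumptions) would be:

```haskell
-- Hedged sketch: keep the legacy alias in front of the generic parser.
-- `APIChange`, `rightToMaybe`, and `parseUrlPiece` are the names quoted
-- in this review, not verified project definitions.
parseIssueType :: Text -> Maybe IssueType
parseIssueType "shape" = Just APIChange -- legacy DB anomaly_type rows
parseIssueType t = rightToMaybe (parseUrlPiece t)
```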


Succinctness: unzip instead of two traversals

In extractFieldPatterns:

void $ LogPatterns.upsertLogPatternBatch (map fst allUps)
void $ LogPatterns.upsertHourlyStatBatch (map snd allUps)

allUps is traversed twice. unzip eliminates that:

let (ups, hss) = unzip allUps
void $ LogPatterns.upsertLogPatternBatch ups
void $ LogPatterns.upsertHourlyStatBatch hss

Same applies to persistSummaryPatterns where the two list comprehensions over prepared each re-traverse it — use unzip4 prepared instead.


Style: _ <- ...; pass vs void $ inconsistency

The two new scheduling loops use the old pattern while the third already uses void $. Use void $ consistently across all three.


Style: tryLog control flow is subtle

\(e :: SomeException) ->
  whenJust (fromException @SomeAsyncException e) throwIO
    >> Log.logAttention ...

throwIO aborts so >> is dead in the async branch, but the sequencing makes it look like both effects always run. case is more explicit:

\(e :: SomeException) -> case fromException @SomeAsyncException e of
  Just ae -> throwIO ae
  Nothing -> Log.logAttention ("LogPattern pipeline step failed: " <> label) (show e)

updateBaselineBatch: 6-tuple decomposition is hard to read

updateBaselineBatch :: ... -> V.Vector (Text, Text, BaselineState, Double, Double, Int) -> ...
srcFields = V.map (\(s, _, _, _, _, _) -> s) rows
hashes    = V.map (\(_, h, _, _, _, _) -> h) rows

A small named record would let you write V.map (.sourceField) rows etc., and makes the construction site in calculateLogPatternBaselines self-documenting.


Minor: Default LoggingDestination derivation is order-sensitive

deriving anyclass (Default) picks def as the first constructor. StdOut is first so behaviour matches the removed manual instance — but constructor reordering would silently break it. A short comment helps preserve the intent.


Breaking change in replaceAllFormats — stored pattern drift

"Oct 14, 2023" now produces "Oct {integer}, {integer}" instead of "{Mon DD, YYYY}". The test update documents it, but patterns already stored in the DB under the old format will diverge from newly-generated ones, causing duplicate clusters at the boundary. Worth calling out explicitly if there is no migration to re-normalize stored patterns.


Highlights

  • V.minIndexBy (comparing lastSeen) — excellent use of existing library function, cleanly replaces the 10-line fold.
  • bool Drain.generateDrainTokens Drain.generateSummaryDrainTokens isSummary — clean function selection.
  • WrappedEnumSC "" for IssueType — great use of the project utility; eliminates all the manual FromField/ToField/Display/FromHttpApiData boilerplate.
  • anyclass (Default, NFData) cleanup across Anomalies, Fields, Logging — good housekeeping.
  • detectSpikeOrDrop being pure and exported is the right call; unit tests cover boundary cases including the MAD=0 floor.
  • resolveFieldExpr whitelist for SQL injection protection in RequestDumps is the right approach.

@claude bot commented Feb 25, 2026

PR Review: Log Patterns Feature

Good overall structure. The Drain algorithm integration, hourly stats table, and MAD-based spike detection are well thought-out. Several concrete improvements below, prioritised by impact.


Bugs / Correctness

replaceAllFormats: replaceEmails is O(n^2) in @-count

On a non-matching @ the current code does replaceEmails (T.drop 1 after), which re-scans the prefix. For a string with k @ chars that don't form valid emails this is O(n*k). The fix is to advance from after directly:

-- current (buggy):
else before <> "@" <> replaceEmails (T.drop 1 after)

-- fix: don't re-scan `before`:
else T.reverse prefixRev <> T.singleton '@' <> replaceEmails (T.drop 1 after)

acknowledgeLogPatternsIN (SELECT unnest, unnest) is non-obvious

PostgreSQL does correlate both unnests pair-wise here, but the behaviour is surprising. Prefer an explicit lateral join for future readers:

FROM unnest(?::text[], ?::text[]) AS t(sf, ph)
WHERE source_field = t.sf AND pattern_hash = t.ph

Code Duplication (contradicts "less lines" priority)

upsertLogPattern / upsertHourlyStat duplicate their batch siblings' SQL entirely.

Both have identical ON CONFLICT clauses (~20 lines each). Delegate instead:

upsertLogPattern up = upsertLogPatternBatch [up]
upsertHourlyStat pid sf h t c = upsertHourlyStatBatch [(pid, sf, h, t, c)]

Hardcoded Enum Strings in SQL (type-safety regression)

pruneStalePatterns and autoAcknowledgeStaleNewPatterns compare against string literals:

[sql| WHERE state = 'acknowledged' ... |]   -- should be a typed parameter
[sql| WHERE state = 'new' ...          |]   -- should be a typed parameter

upsertLogPattern correctly uses the enum type in its ON CONFLICT clause. A renamed constructor silently breaks the prune queries. Fix:

pruneStalePatterns pid now staleDays =
  PG.execute [sql| DELETE FROM apis.log_patterns
                   WHERE project_id = ? AND last_seen_at < ?::timestamptz - INTERVAL '1 day' * ?
                   AND state = ? |]
             (pid, now, staleDays, LPSAcknowledged)

Same treatment for autoAcknowledgeStaleNewPatterns.


Opaque 6-tuple in updateBaselineBatch

updateBaselineBatch :: DB es => Projects.ProjectId
                   -> V.Vector (Text, Text, BaselineState, Double, Double, Int) -> Eff es Int64
srcFields = V.map (\(s, _, _, _, _, _) -> s) rows   -- 6 lines of this

A small record eliminates the positional destructuring and makes call sites self-documenting:

data BaselineUpdate = BaselineUpdate
  { buSourceField :: Text, buHash :: Text
  , buState :: BaselineState, buMean :: Double
  , buMad :: Double, buSamples :: Int }

Minor Issues

maybe 0 (\(Only n) -> n) → maybe 0 fromOnly in fetchLogPatterns (already used correctly in getTotalEventCount).

replaceAllFormats: isOctet does a toString/readMaybe round-trip. A direct fold avoids the String allocation:

isOctet t = T.length t <= 3 && T.all isDigit t &&
  T.foldl' (\acc c -> acc * 10 + (fromEnum c - 48)) 0 t <= (255 :: Int)

fetchLogPatterns: drop 1 $ T.splitOn "." f when f = "attributes." produces [""], passing an empty path to the #>> operator. Add a not (null parts) guard.
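A hedged sketch of that guard; `f` and `parts` mirror the snippet quoted above, and the surrounding plumbing is assumed:

```haskell
-- Sketch only: reject field paths like "attributes." whose split would
-- yield an empty segment, instead of passing [""] to the #>> operator.
let parts = drop 1 (T.splitOn "." f)
 in if null parts || any T.null parts
      then Nothing -- skip malformed field paths
      else Just parts
```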

processBatch nested lambda can use flip + LambdaCase:

V.foldl' (flip \case
  SeedPattern c  -> processNewLog isSummary "" Nothing c now
  NewEvent lid c -> processNewLog isSummary lid (Just c) c now)
  inTree batch

Positives

  • ExceptT / runExceptT in Enhancement.hs – clean removal of nested case ladders.
  • MkIssueOpts record replacing 11 positional args – major readability win.
  • WrappedEnumSC "" for IssueType – removes ~20 lines of hand-written ToField/FromField.
  • deriving anyclass (Default) on enum types – correct use of Generic machinery.
  • DrainInput sum type replacing Bool isSampleLog threading through 5 call layers.
  • tryLog async-exception re-throw – correct pattern (fromException @SomeAsyncException).
  • Spike detection unit tests covering edge cases including MAD=0 and min-delta guard.
  • Integration tests covering end-to-end extraction → baseline → spike → issue creation → acknowledgment.

@claude bot commented Feb 25, 2026

Code Review: PR #300 -- Log Patterns Pipeline

Good overall structure. Here are the key findings:


Bugs

1. ON CONFLICT deduplication collapses unrelated issues when target_hash is empty
src/Models/Apis/Issues.hs -- insertIssue

The new unique index has no guard for empty hashes. For RuntimeException issues where atError.hash = Nothing, targetHash = fromMaybe "" ... = "". Two unrelated runtime exceptions from different services both get target_hash = "" and the same issue_type, so ON CONFLICT deduplicates them into one issue. The old code had AND endpoint_hash != '' in the partial index -- this guard needs to be restored:

CREATE UNIQUE INDEX ... WHERE acknowledged_at IS NULL AND archived_at IS NULL AND target_hash != '';

2. "shape" alias in parseIssueType is silently broken
src/Models/Apis/Issues.hs

The old parseIssueType "shape" = Just APIChange was removed. The new implementation delegates to WrappedEnumSC via parseUrlPiece, which has no alias support. If any DB record or code path still produces "shape", it silently returns Nothing rather than Just ApiChange. Check whether Anomalies.hs or any SQL still references anomaly_type = 'shape'.


Correctness / Logic

3. Spike projection is 4x too aggressive for the first 15 minutes of every hour
src/BackgroundJobs.hs -- detectLogPatternSpikes

let minutesIntoHour = max 15 $ todMin (timeToTimeOfDay $ utctDayTime now)

todMin returns only the minute component (0-59), not total minutes since the hour start. At HH:00:30 this gives max 15 0 = 15 -> scaleFactor = 4.0, causing systematic 4x over-projection and false positives at the top of every hour. Use total elapsed seconds instead:

let totalSecs = round (realToFrac (utctDayTime now) :: Double) :: Int
    minutesIntoHour = max 15 $ (totalSecs `mod` 3600) `div` 60

Code Conciseness / Best Practices

4. tryLog should use catchSync from safe-exceptions
src/BackgroundJobs.hs

safe-exceptions is already in deps and provides exactly this pattern without the manual fromException @SomeAsyncException check:

import Control.Exception.Safe (catchSync)

tryLog :: Text -> ATBackgroundCtx () -> ATBackgroundCtx ()
tryLog label = (`catchSync` \(e :: SomeException) ->
  Log.logAttention ("LogPattern pipeline step failed: " <> label) (show e))

5. updateBaselineBatch does 6 separate passes over the same vector
src/Models/Apis/LogPatterns.hs

Six V.map (\(a,_,_,_,_,_) -> a) calls each traverse the full vector. Use a named record for the tuple with RecordWildCards, or restructure as a single traversal.

6. extractFieldPatterns: use unzip instead of two list traversals
src/BackgroundJobs.hs

-- Current:
let allUps = urlUpserts ++ excUpserts
void $ LogPatterns.upsertLogPatternBatch (map fst allUps)
void $ LogPatterns.upsertHourlyStatBatch (map snd allUps)

-- Better:
let (ups, hss) = unzip (urlUpserts ++ excUpserts)
void $ LogPatterns.upsertLogPatternBatch ups
void $ LogPatterns.upsertHourlyStatBatch hss

7. processNewLog uses bool with non-obvious argument order
src/BackgroundJobs.hs

bool f g b is if b then g else f -- false-case first, opposite to natural reading. Prefer explicit if:

let tokens = (if isSummary then Drain.generateSummaryDrainTokens else Drain.generateDrainTokens) content

8. Deriving Default via anyclass on enum types -- good cleanup
src/Models/Apis/Anomalies.hs, src/Models/Apis/Fields.hs

Good consolidation. Confirm the first constructor in each type is the intended def (ATUnknown, AAUnknown, FTUnknown, FCQueryParam) -- these match the removed instances, so this is correct.


Test Coverage Gaps

9. Spike detection test has non-deterministic branching
test/integration/Pages/LogPatternsSpec.hs -- test 4

The test branches based on whether an established pattern already exists in the DB, seeding one manually only if not found. This covers different code paths depending on DB state. Make seed setup unconditional for determinism.

10. hashes column update not verified in extraction test
test/integration/Pages/LogPatternsSpec.hs -- test 1

The test checks that log_pattern_hourly_stats rows exist but does not verify that otel_logs_and_spans.hashes was tagged with "pat:..." values -- the primary side-effect of persistSummaryPatterns.

11. Missing boundary test for pruneOldHourlyStats
test/integration/Pages/LogPatternsSpec.hs -- test 7

Only verifies a stat at -100h is pruned. No test that a stat at -(window - 1)h is NOT pruned; an off-by-one in the SQL interval expression would be silent.
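A sketch of the missing boundary case, in the hspec style of the existing spec. `seedHourlyStat`, `pruneOldHourlyStats`, `fetchHourlyStats`, and `windowHours` are stand-ins for whatever helpers LogPatternsSpec actually uses:

```haskell
-- Hypothetical boundary test: every helper name here is an assumption
-- about the existing spec's setup, not a real project function.
it "does not prune a stat just inside the retention window" $ do
  now <- getCurrentTime
  let justInside = addUTCTime (negate (fromIntegral (windowHours - 1) * 3600)) now
  seedHourlyStat projectId justInside
  void $ pruneOldHourlyStats projectId now
  stats <- fetchHourlyStats projectId
  stats `shouldSatisfy` (not . null) -- an off-by-one in the interval would empty this
```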


Info

  • Schema: log_patterns.state and baseline_state are TEXT rather than Postgres enums. Not a bug, but invalid values can be stored without DB-level rejection.
  • Job retry double-counting: ON CONFLICT DO UPDATE SET event_count = event_count + EXCLUDED.event_count is documented as acceptable. The z-score threshold of 3.0 and MAD floor of 1.0 should absorb occasional retries.

@claude bot commented Feb 25, 2026

PR Review: Log Pattern Processing

Good overall architecture — the extraction→baseline→spike→new-pattern pipeline is clearly structured, and the refactoring of mkIssue, enhanceIssueWithLLM (via ExceptT), and Default derivations is clean. A few conciseness and correctness notes below.


Potential Bug: occurrence_count::INT cast can overflow

In fetchLogPatterns, the precomputed-path query casts occurrence_count::INT but the column is BIGINT. For high-volume patterns this silently truncates.

Fix: select the column without the cast and convert with fromIntegral when constructing PatternRow.count.


Conciseness: duplicate SQL strings

upsertLogPattern and upsertLogPatternBatch share identical SQL (same for upsertHourlyStat / upsertHourlyStatBatch). Extracting each as a top-level Query constant lets both functions share it:

upsertLogPatternSql :: Query
upsertLogPatternSql = [sql| INSERT INTO apis.log_patterns (...) ON CONFLICT ... DO UPDATE SET ... |]

upsertLogPattern up = PG.execute upsertLogPatternSql up
upsertLogPatternBatch [] = pure 0
upsertLogPatternBatch ups = PG.executeMany upsertLogPatternSql ups

Conciseness: redundant Data.List (lookup) import

Relude already re-exports lookup, so import Data.List (lookup) in LogPatterns.hs is unnecessary.


Conciseness: updateBaselineBatch – 6 separate V.map passes

Six positional mappers on a 6-tuple vector are fragile. Data.List.Extra (already in the project) exposes unzip6. Using V.toList → unzip6 → V.fromList, or a small local record, removes the positional indexing.


Conciseness: map fst knownPatternFields repeated at call sites

Called at three sites (LogQueryBox, RequestDumps, LogPatterns). Exposing a knownSourceFields :: [Text] helper from LogPatterns avoids repeating the transformation:

knownSourceFields :: [Text]
knownSourceFields = map fst knownPatternFields

Conciseness: prepared 4-tuple in persistSummaryPatterns

(_, _, up, _) / (_, _, _, hs) positional access is fragile. The patternHash at position 2 is redundant — it already lives in UpsertPattern.hash. Dropping it yields a 3-tuple:

-- build as (filteredIds, up, hs)
forM_ prepared \(filteredIds, up, _) -> ...
void $ LogPatterns.upsertLogPatternBatch [up | (_, up, _) <- prepared]
void $ LogPatterns.upsertHourlyStatBatch [hs | (_, _, hs) <- prepared]

Conciseness: classifyHexContinuation uses case () of _ | ... guards

MultiWayIf (enabled in this project) is cleaner here:

-- current
case () of
  _ | len == 64 && atBoundary -> "{sha256}" ...
    | ...

-- cleaner
if | len == 64, atBoundary -> "{sha256}" ...
   | ...

(recognizeHex already uses case len of with guards which is fine.)


Performance: replaceTextDates does up to 12 traversals

The pre-pass traverses the string once per month name (up to 12 times). A single pass that scans for 3-letter word-boundary tokens and validates against a HashSet would be more efficient for this hot path.


Minor: LogPatternState missing Default instance

All other domain enums use deriving anyclass (Default). LogPatternState is the exception — LPSNew is the natural default.
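A sketch of that derivation. The constructor list is inferred from the states ('new', 'acknowledged', 'ignored') mentioned elsewhere in this PR and may not match the real type exactly:

```haskell
-- deriving anyclass (Default) takes the FIRST constructor as `def`,
-- so LPSNew must stay first for it to be the natural default.
-- Constructor names beyond LPSNew/LPSAcknowledged are assumptions.
data LogPatternState = LPSNew | LPSAcknowledged | LPSIgnored
  deriving stock (Eq, Show, Generic)
  deriving anyclass (Default, NFData)
```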


Tests

Good end-to-end coverage in LogPatternsSpec.hs. The conditional branching in test 4 (spike detection) that forks based on whether an established pattern with positive MAD already exists could be simplified by always seeding a synthetic pattern at the start of the test. SpikeDetectionSpec is a good pure unit test addition.


Overall solid work. The SQL migration is careful (backfilling target_hash, placing ALTER TYPE outside the BEGIN/COMMIT block as required by PG), tryLog correctly re-throws async exceptions, and replacing regex-based replaceAllFormats with the single-pass scanner is a meaningful improvement.

tonyalaribe and others added 5 commits February 26, 2026 00:48
- Use elapsed seconds for spike projection instead of todMin indirection
- Remove redundant async exception re-throw in tryLog (UnliftIO.catch handles it)
- Use lens tuple accessors in updateBaselineBatch for readability
- Use unzip in extractFieldPatterns instead of map fst/snd
- Replace bool with explicit if in processNewLog for clarity
@claude bot commented Feb 26, 2026

Code Review

Good overall implementation of the log pattern pipeline. A few issues worth addressing, ordered by severity.


Bugs / Correctness

1. parseIssueType loses the "shape" alias (regression risk)

Before, there was an explicit alias parseIssueType "shape" = Just APIChange for legacy DB rows. The new one-liner parseIssueType = rightToMaybe . parseUrlPiece drops it. If any rows have issue_type = 'shape', they will silently fail to parse. Either keep the alias via a custom FromHttpApiData instance, or confirm no such rows exist and add a migration to rewrite them.


2. detectLogPatternSpikes ignores _scheduledTime

The handler receives LogPatternPeriodicProcessing _scheduledTime pid but calls now <- Time.currentTime (wall clock) for the minutesIntoHour calculation. Other jobs use the scheduled time for window calculations. If the queue is backed up, minutesIntoHour will reflect actual wall time rather than the intended window start. Either use _scheduledTime or document why wall clock is correct here.


3. LogPatternRateChange badge always shows "trending-up" icon

Issues.LogPatternRateChange -> ("bg-fillWarning-strong", "trending-up", "RATE CHANGE")

The issue can represent a spike or a drop, but the icon is hardcoded to "trending-up". The changeDirection field exists in issueData — either decode it at the badge site or use a neutral icon like "activity".
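One possible shape for the fix, sketched against the tuple quoted above; how the direction is decoded out of issueData is an assumption:

```haskell
-- Sketch: pick the icon from the decoded change direction, falling back
-- to a neutral "activity" icon. `direction` extraction is hypothetical.
Issues.LogPatternRateChange ->
  let icon = case direction of
        Just "spike" -> "trending-up"
        Just "drop"  -> "trending-down"
        _            -> "activity"
   in ("bg-fillWarning-strong", icon, "RATE CHANGE")
```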


Code Conciseness / DRY

4. upsertLogPattern duplicates upsertLogPatternBatch SQL verbatim

The entire SQL body is copy-pasted between the two. Since executeMany works with a singleton list, this is sufficient:

upsertLogPattern :: DB es => UpsertPattern -> Eff es Int64
upsertLogPattern = upsertLogPatternBatch . pure

5. updateBaselineBatch uses view _1..view _6 on a 6-tuple

let srcFields = V.map (view _1) rows
    hashes    = V.map (view _2) rows
    ...
    samples   = V.map (view _6) rows

Pulling in Control.Lens just to project positional tuple fields is heavy. At minimum, use pattern-matching lambdas:

let srcFields = V.map (\(f,_,_,_,_,_) -> f) rows

Better still: define a BaselineUpdate record and project via field names. The same 6-tuple also makes calculateLogPatternBaselines fragile (V.filter (\(_, _, s, _, _, _) -> s == BSEstablished)). A named record cleans up both sites.


6. Three identical case AE.fromJSON (getAeson issue.issueData) blocks in anomalyDetailPage

Enhancement.hs already introduced withIssueData for this exact pattern. A rendering-context version collapses the three separate case/_ -> pass blocks:

withIssueDataH :: (AE.FromJSON a, Applicative m) => Issue -> (a -> m ()) -> m ()
withIssueDataH issue f = case AE.fromJSON (getAeson issue.issueData) of
  AE.Success d -> f d
  _            -> pure ()

Usage:

Issues.LogPattern           -> withIssueDataH @Issues.LogPatternData issue \d -> ...
Issues.LogPatternRateChange -> withIssueDataH @Issues.LogPatternRateChangeData issue \d -> ...

SQL Type Safety

7. Literal 'new'/'acknowledged' strings in prune functions

-- pruneStalePatterns
AND state = 'acknowledged'
-- autoAcknowledgeStaleNewPatterns
WHERE state = 'new'
SET state = 'acknowledged'

The rest of the module uses LPSAcknowledged/LPSNew via the WrappedEnumSC ToField instance. Typos in these literals won't be caught at compile time. Use parameterized values consistently:

AND state = ?   -- bind LPSAcknowledged / LPSNew

8. Parallel unnest in acknowledgeLogPatterns — prefer ROWS FROM

WHERE (source_field, pattern_hash) IN (SELECT unnest(?::text[]), unnest(?::text[]))

This works on Postgres >=9.4 but the safer canonical idiom is:

WHERE (source_field, pattern_hash) IN (
  SELECT * FROM unnest(?::text[], ?::text[]) AS t(source_field, pattern_hash)
)

Minor

  • UtilsBenchSpec uses wall-clock getCurrentTime diffs for timing assertions, which can flake on loaded CI. Consider tasty-bench or criterion if timing correctness matters long-term.
  • getLogPatternTexts has no LIMIT and logWarn fires after the full load. Consider adding LIMIT 6000 and warning before loading.

Overall the mkIssue smart constructor, runExceptT chains in Enhancement.hs, deriving anyclass Default, and the replaceAllFormats single-pass rewrite are all solid improvements. Main asks: fix the parseIssueType regression risk and replace the 6-tuple with a named record.

@claude bot commented Feb 26, 2026

Code Review: Log Patterns Pipeline

Great feature overall — the architecture is solid and the new MkIssueOpts/mkIssue builder is a nice improvement. Here are issues worth addressing, roughly prioritised.


Bugs / Correctness

1. occurrence_count truncation in fetchLogPatterns

occurrence_count is BIGINT in the schema; occurrence_count::INT in the SQL silently truncates values > 2^31. The Haskell tuple type Int should be Int64, and the SQL cast should be ::BIGINT.

2. scaleFactor precision in detectLogPatternSpikes

utctDayTime :: DiffTime routed through Double loses sub-second precision unnecessarily. Prefer truncate (utctDayTime now) :: Int.

3. paginateTree missing type signature

The paginateTree where-binding inside logsPatternExtraction has no type annotation. Add one.


Code Duplication

4. Copy-pasted SQL in upsert helpers

upsertLogPattern/upsertLogPatternBatch and upsertHourlyStat/upsertHourlyStatBatch contain identical SQL strings. Either extract as a named Query constant or implement the single-item variant via the batch one (upsertLogPattern up = upsertLogPatternBatch [up]).


Packages / Extensions Already Available

5. Lens-based 6-tuple access in updateBaselineBatch

Using view _1 ... view _6 makes 6 passes over the vector and pulls in Control.Lens just for this one function. Plain pattern matching in a single V.map removes the dependency entirely. Better still: introduce a small record type for the batch row to avoid positional 6-tuples altogether.

6. Redundant Data.List (lookup) import in LogPatterns.hs

Relude re-exports Prelude.lookup; the explicit import is a no-op.


Design / Clarity

7. Deprecated endpointHash on Issue with no tracking ticket

mkIssue always sets endpointHash = opts.targetHash. Every insert and select carries a redundant column. The inline comment notes it should be dropped once queries are migrated — please open a follow-up issue.

8. knownPatternFields includes "body" but nothing writes to it

extractSummaryPatterns uses sourceField = "summary" and extractFieldPatterns produces "url_path" / "exception". The "body" entry will appear in the UI dropdown but always return empty results.

9. Retry-induced double-counting in hourly stats

The upsertHourlyStat comment acknowledges: "SUM semantics mean job retries double-count." A crash-and-retry within the same 5-minute window doubles the bucket count, which can trigger false-positive spike alerts. If the extraction run re-aggregates from source data, GREATEST(existing, excluded) is idempotent and avoids this. Worth a deliberate decision rather than an accepted limitation.
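A sketch of the idempotent variant, under assumed table and column names (illustrative only; the real migration may name these differently):

```sql
-- Illustrative names. Safe under retries when each run re-aggregates
-- the full bucket from source data: a replayed identical value is a no-op.
INSERT INTO log_pattern_hourly_stats (project_id, pattern_hash, hour_bucket, event_count)
VALUES ($1, $2, $3, $4)
ON CONFLICT (project_id, pattern_hash, hour_bucket)
DO UPDATE SET event_count = GREATEST(log_pattern_hourly_stats.event_count, EXCLUDED.event_count);
```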

10. ON CONFLICT DO NOTHING without a conflict target in insertTestLog

Omitting the conflict target silently suppresses any constraint violation during tests. Use ON CONFLICT (id) DO NOTHING.


Minor

11. Operator precedence in shouldSkipLoad

&& binding tighter than || makes the expression correct but easy to misread. Wrap the && group in explicit parentheses.
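As a hypothetical sketch of the fix (shouldSkipLoad's real operands aren't shown here):

```haskell
-- (&&) is infixr 3 and (||) is infixr 2, so `a && b || c` already parses
-- as `(a && b) || c`; the parentheses just make that grouping explicit.
shouldSkip :: Bool -> Bool -> Bool -> Bool
shouldSkip a b c = (a && b) || c
```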

12. UtilsBenchSpec in unit-tests stanza

If this runs criterion/tasty-bench it will inflate unit test runtimes. Benchmarks should be a separate cabal stanza.

tonyalaribe and others added 3 commits February 26, 2026 01:08
…ate suite

- upsertLogPattern now delegates to upsertLogPatternBatch via pure
- Replace repeated case AE.fromJSON/getAeson/pass blocks with withIssueDataH
- Move UtilsBenchSpec out of unit tests into dedicated bench suite
- Pass scheduledTime to detectLogPatternSpikes for accurate partial-hour projection
@tonyalaribe tonyalaribe merged commit 9e1f936 into master Feb 26, 2026
7 checks passed
@tonyalaribe tonyalaribe deleted the log-patterns-chp branch February 26, 2026 00:11
@github-actions github-actions bot restored the log-patterns-chp branch February 26, 2026 00:11
@tonyalaribe tonyalaribe deleted the log-patterns-chp branch February 26, 2026 00:12
@claude

claude bot commented Feb 26, 2026

PR Review: Log Patterns Feature

Overall a solid, well-structured PR with good use of GHC extensions and batch DB operations. A few suggestions for improvement.


Positives

  • mkIssue + MkIssueOpts builder — eliminates ~90 lines of repeated Issue construction across all issue creation functions. Excellent refactor.
  • runExceptT/hoistEither in Enhancement.hs — the nested case chains were rightfully replaced; much cleaner.
  • deriving anyclass (Default) instead of manual instances throughout — good cleanup of boilerplate.
  • withIssueDataH helper — avoids the repeated AE.fromJSON (getAeson ...) >>= case pattern across the UI layer.
  • Batch DB operations everywhere (upsert, baseline update, stats) — avoids N+1.

Suggestions

1. updateBaselineBatch — 6-tuple indexed by lens is opaque

let srcFields = V.map (view _1) rows
    hashes    = V.map (view _2) rows
    states    = V.map (view _3) rows
    means     = V.map (view _4) rows
    mads      = V.map (view _5) rows
    samples   = V.map (view _6) rows

This 6-tuple is hard to read and maintain. A small named record:

data BaselineUpdate = BaselineUpdate
  { sourceField :: Text, patternHash :: Text, state :: BaselineState
  , mean :: Double, mad :: Double, samples :: Int }

…would let you use V.map (.sourceField) rows etc., removing the lens dependency from this function and making future changes obvious.

2. paginateTree event-meta 3-tuple should be a record

HM.HashMap Text (Maybe Text, Maybe Text, Maybe Text)
-- traceId?  serviceName? level?

Anonymous 3-tuple of Maybe Text is indistinguishable at a glance. A tiny record (or even a type alias with a comment) would make the destructuring in persistSummaryPatterns self-documenting.

3. fetchLogPatterns on-the-fly branch is too large

The function is ~100 lines mixing: SQL building, Drain tree construction, aggregation, and sparkline computation. The on-the-fly branch should be extracted as fetchLogPatternsOnTheFly (or similar) so the top-level function remains a readable dispatcher between precomputed and live paths.

4. Hourly stats are not idempotent on job retry

-- Note: SUM semantics mean job retries double-count.

This is acknowledged but worth addressing. A double-counted hourly bucket skews the baseline statistics, which can shift the noise floor and either suppress real spike alerts or trigger false ones. Since MAD is robust to occasional outliers, this is probably fine for now — but it's worth a follow-up issue, especially once this runs at scale.

5. endpointHash deprecation needs a tracking issue

, endpointHash :: Text -- Deprecated: always == targetHash (set by mkIssue). Drop column once ...

No follow-up issue is referenced. This will persist indefinitely without one. A -- TODO(#NNN) link would help.

6. replaceAllFormats: fromEnum c - 48 in isOctet

T.foldl' (\acc c -> acc * 10 + (fromEnum c - 48)) 0 t <= (255 :: Int)

This works but ord c - ord '0' (or just digitToInt c from Data.Char) is more readable. The old regex-based implementation was replaced for good reason (performance), but this helper can be a bit clearer.
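The two spellings are equivalent; a String-based sketch for comparison (the PR folds over Text, this version is just for illustration):

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')

-- Both accumulate a decimal value digit by digit; digitToInt states the
-- intent directly instead of relying on the ASCII offset 48.
viaSubtraction, viaDigitToInt :: String -> Int
viaSubtraction = foldl' (\acc c -> acc * 10 + (fromEnum c - 48)) 0
viaDigitToInt  = foldl' (\acc c -> acc * 10 + digitToInt c) 0
```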

7. Benchmark suite uses hspec wall-clock for timing assertions

elapsed `shouldSatisfy` (< 1.0) -- 2k rows under 1 second

This is brittle under CI load variance. Fine as a smoke test but consider tasty-bench or criterion if you want meaningful regression tracking. Alternatively, make the threshold generous enough that it only fails on catastrophic slowdowns.

8. selectIssues raw query still uses hardcoded column positions

The selectIssues query and IssueL's FromRow are tied by positional binding. With target_hash added and service changed to Maybe Text, this coupling is a latent bug source. Consider migrating to postgresql-entity's _select like findOpenIssueForEndpoint was (nicely done there), or at least document the column order explicitly near the IssueL definition.


Minor

  • In processNewLogPatterns, the acknowledgement of ALL new patterns happens before checking volume threshold, which means low-volume projects will silently drop new-pattern alerts. This is documented in the comment ("even when volume too low") but could surprise operators — a Log.logInfo confirming how many were acknowledged-without-issue would be useful observability.
  • bool BSLearning BSEstablished est is idiomatic but some readers may not immediately recognise Data.Bool.bool; a brief comment or if est then BSEstablished else BSLearning is equally short and universally clear.

@claude

claude bot commented Feb 26, 2026

Code Review — PR #300: Log Patterns

Solid feature work overall — the pipeline design (extract → baseline → spike/drop → new pattern) is well-structured, the SQL is correct, and the doctest coverage on replaceAllFormats is excellent. The runExceptT/hoistEither refactoring in Enhancement.hs is a clear improvement. A few issues worth addressing before merging.


Bug / Security

getLogPatternById / getLogPatternsByIds missing project-level filter

-- LogPatterns.hs
getLogPatternById :: DB es => LogPatternId -> Eff es (Maybe LogPattern)
getLogPatternById lpid = listToMaybe <$> PG.query (_selectWhere @LogPattern [[DAT.field| id |]]) (Only lpid)

getLogPatternsByIds :: DB es => V.Vector LogPatternId -> Eff es (V.Vector LogPattern)
getLogPatternsByIds ids
  | V.null ids = pure V.empty
  | otherwise = V.fromList <$> PG.query (_select @LogPattern <> " WHERE id = ANY(?)") (Only ids)

Every other function in this module takes Projects.ProjectId and filters on it. These two don't. If a caller ever derives IDs from user input or another project's context, they can leak patterns across project boundaries. Add projectId to both signatures and filter on it (consistent with every other function in this module).


Correctness / Subtle Bug

persistSummaryPatterns runs N individual UPDATEs against otel_logs_and_spans

forM_ prepared \(filteredIds, patternHash, _, _) -> do
  let hashTag = "pat:" <> patternHash
  void $ withTimefusion tfEnabled $ PG.execute [sql|UPDATE otel_logs_and_spans SET hashes = array_append(hashes, ?) WHERE project_id = ? AND timestamp >= ? AND id = ANY(?::uuid[]) AND NOT (hashes @> ARRAY[?])|] (hashTag, pid, since, filteredIds, hashTag)

For a project with many patterns this is O(patterns) queries against otel_logs_and_spans, which can be large. The existing pattern in this codebase for bulk updates is unnest() arrays — the same approach used in updateBaselineBatch. Consider a single query: UPDATE … SET hashes = array_append(hashes, tag_for_id) FROM (SELECT unnest(?::uuid[]) AS id, unnest(?::text[]) AS tag) v WHERE …. At minimum, a log warning when length prepared > N would catch unexpectedly large batches.


Code Conciseness

6-tuple V.Vector with view _1 … view _6 — use a record or V.unzip

-- updateBaselineBatch — requires Control.Lens import just for this
let srcFields = V.map (view _1) rows
    hashes    = V.map (view _2) rows
    states    = V.map (view _3) rows
    means     = V.map (view _4) rows
    mads      = V.map (view _5) rows
    samples   = V.map (view _6) rows

With RecordWildCards available, a small named record (or even just pattern-matching with a helper lambda) avoids the Lens import and the indexed-field mystery. Alternatively, accept V.Vector PatternBaselineUpdate where data PatternBaselineUpdate = PatternBaselineUpdate { sourceField, patternHash, state, mean, mad, samples } and unzip inside. The 6-tuple signature leaks into every caller.

monthSet defined inside replaceAllFormats

replaceAllFormats !input = go Nothing (replacePrePass input)
  where
    monthSet :: HS.HashSet Text
    monthSet = HS.fromList ["Jan", "Feb", …]

This is a constant with no free variables. GHC may float it as a CAF, but it's not guaranteed — move it to a top-level binding to make the intent explicit and avoid any risk of repeated allocation on every call to replaceAllFormats.

sourceFieldLabel is fine as-is

sourceFieldLabel f = fromMaybe f $ lookup f knownPatternFields

Already concise. knownPatternFields is a plain association list, and with only four entries the linear lookup is not a concern even on a hot path.

processNewLogPatterns: unnecessary V.fromList $ filter then V.forM

let issueWorthy = V.fromList $ filter (\lp -> lp.sourceField /= "url_path") newPatterns
issueIds <- V.forM issueWorthy \lp -> …

newPatterns is a [LogPattern]. Filter, then forM (list), then collect — no need for the intermediate Vector:

let issueWorthy = filter (\lp -> lp.sourceField /= "url_path") newPatterns
issueIds <- forM issueWorthy \lp -> …
unless (null issueIds) $ …

Design / Maintainability

knownPatternFields lists "body" but the extraction pipeline never processes it

logsPatternExtraction only handles summary, url_path, and exception. The "body" -> "Log body" entry in knownPatternFields (and the corresponding selector in LogQueryBox) suggests an unimplemented path. Either implement it or remove the entry to avoid user confusion (the UI will offer it but it will always fall back to the on-the-fly query in fetchLogPatterns).

LPSIgnored is a dead state

data LogPatternState = LPSNew | LPSAcknowledged | LPSIgnored

Nothing in this PR transitions a pattern to LPSIgnored, and pruneStalePatterns only prunes LPSAcknowledged. If this is future work, a TODO comment would help; if not planned, remove it to keep the state machine honest.

BatchPatternStats.totalEvents is computed but unused

calculateLogPatternBaselines uses hourlyMedian, hourlyMADScaled, and totalHours from BatchPatternStats, but not totalEvents. The CTE and COALESCE(SUM(event_count),0) column in getBatchPatternStats are dead weight. Remove the field (or use it — it would let you warn on patterns with very few total events, which might produce unreliable baselines).

endpointHash as deprecated alias

, endpointHash :: Text -- Deprecated: always == targetHash. Drop column once selectIssues/selectIssueByHash are migrated.

The migration backfills and the comment acknowledges the debt. It's fine for now, but worth tracking as a follow-up since it keeps two fields in sync by convention rather than by code.


Minor

  • upsertHourlyStat delegates to upsertHourlyStatBatch . pure — clean and concise, no issues.
  • detectSpikeOrDrop exported from BackgroundJobs — pure function, easy to test. The unit test spec looks solid. One edge case not tested: mad = 0, mean = 0, currentCount = 0 (floor at 1.0 makes z = 0, so no alert — correct).
  • replaceAllFormats rewrite — the single-pass scanner is a meaningful improvement over the multi-regex approach. The doctest suite is thorough. The pre-pass for emails/JWTs/text dates before the main go scanner is a reasonable decomposition.
  • bool BSLearning BSEstablished est — nice use of Data.Bool.bool.
  • tryLog wrapper — good defensive pattern for the pipeline steps.
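The floor-at-1.0 behaviour noted for detectSpikeOrDrop above can be sketched with a simplified stand-in (the real function also takes thresholds and returns a SpikeResult):

```haskell
-- Simplified stand-in for the z-score core of detectSpikeOrDrop:
-- flooring the MAD at 1.0 keeps the denominator sane when a pattern's
-- history is flat (mad = 0), so mean = mad = current = 0 yields z = 0.
robustZ :: Double -> Double -> Double -> Double
robustZ mean mad current = (current - mean) / max 1.0 mad
```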

Overall the logic is sound. The three most important things to fix before merging: the missing projectId filter on getLogPatternById/getLogPatternsByIds, the dead LPSIgnored state (or document it), and the "body" field in knownPatternFields that has no backing extraction.

@claude

claude bot commented Feb 26, 2026

Code Review — Log Patterns (#300)

Overall this is a well-structured addition. The baseline/MAD spike detection, the Drain integration, and the migration are solid. A few issues worth addressing before merge:


1. Replace the 6-tuple in updateBaselineBatch with a named record

-- Current: fragile — mean and mad are both Double, easy to swap silently
updateBaselineBatch :: DB es => Projects.ProjectId -> V.Vector (Text, Text, BaselineState, Double, Double, Int) -> Eff es Int64

let srcFields = V.map (view _1) rows
    hashes    = V.map (view _2) rows
    states    = V.map (view _3) rows
    means     = V.map (view _4) rows
    mads      = V.map (view _5) rows
    samples   = V.map (view _6) rows

You already have the UpsertPattern and BatchPatternStats pattern — define a BaselineUpdate record (or reuse an inline record with {sourceField, patternHash, baselineState, mean, mad, samples}) and destructure it in the SQL unnest block. Eliminates the Control.Lens._1/_6 import and makes the type self-documenting.


2. N+1 SQL round-trips in persistSummaryPatterns

forM_ prepared \(filteredIds, patternHash, _, _) -> do
  let hashTag = "pat:" <> patternHash
  void $ withTimefusion tfEnabled $ PG.execute [sql|UPDATE otel_logs_and_spans SET hashes = array_append(hashes, ?) ...|] (hashTag, pid, since, filteredIds, hashTag)

This fires one round-trip per Drain pattern group (typically 20–100 per 5-minute job). You're already using unnest everywhere else — a single batched update via unnest(?::text[], ?::uuid[]) or a subquery join would replace the loop entirely:

UPDATE otel_logs_and_spans SET hashes = array_append(hashes, v.tag)
FROM (SELECT unnest(?::uuid[]) AS id, unnest(?::text[]) AS tag) v
WHERE otel_logs_and_spans.id = v.id
  AND project_id = ? AND timestamp >= ?
  AND NOT (hashes @> ARRAY[v.tag])

Where you pass the flattened (allIds, corresponding tags) from the prepared list.
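A hypothetical sketch of that flattening step, with String standing in for the real UUID/Text types:

```haskell
-- Flatten the per-pattern groups into the two parallel arrays a single
-- unnest-based UPDATE expects; ids and tags stay aligned 1:1.
flattenPrepared :: [([String], String)] -> ([String], [String])
flattenPrepared groups =
  unzip [ (rowId, "pat:" <> patternHash) | (ids, patternHash) <- groups, rowId <- ids ]
```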


3. knownPatternFields — export knownSourceFields to avoid map fst at call sites

-- Log.hs
if target `elem` map fst LogPatterns.knownPatternFields

-- Settings.hs
let isCustom = maybe False (\s -> s `notElem` map fst knownPatternFields) config.patternSelected

Add the following to the LogPatterns exports:

knownSourceFields :: [Text]
knownSourceFields = map fst knownPatternFields

Two call sites then become (`elem` LogPatterns.knownSourceFields) with no extra allocation.


4. detectSpikeOrDrop — 4 consecutive Double parameters

detectSpikeOrDrop :: Double -> Double -> Double -> Double -> Int64 -> Maybe Issues.SpikeResult
detectSpikeOrDrop zThreshold minDelta mean mad currentCount

mean and mad can be silently transposed at the call site. The tests do catch this for the fixed-threshold calls, but callers passing dynamic values (e.g. from BatchPatternStats.hourlyMedian/hourlyMADScaled) have no protection. A simple data BaselineStats = BaselineStats { mean :: Double, mad :: Double } record (or even collapsing the first two params into a SpikeConfig) would eliminate the risk.


5. Minor / style

  • V.fromList $ map (\lp -> (lp.sourceField, lp.patternHash)) newPatterns — with Control.Arrow.(&&&) (from base, no extra dependency): V.fromList $ map ((.sourceField) &&& (.patternHash)) newPatterns.

  • import Data.List (lookup) in LogPatterns.hs — Relude already re-exports lookup; this import looks redundant and can likely be dropped.

  • endpointHash = opts.targetHash in mkIssue — the deprecation comment is clear; a linked issue/TODO would make the migration intent easier to track.

  • Benchmark target in monoscope.cabal duplicates most of the extension/flag list from the unit-tests stanza verbatim — consider factoring these into a common stanza if cabal version allows.


Strengths worth noting

  • The replaceAllFormats single-pass Data.Text.Lazy.Builder rewrite is a clear improvement over the multi-pass regex approach — fewer allocations, more patterns handled, better doctest coverage.
  • MAD-based baseline with a floor of 1.0 is a robust choice. The getBatchPatternStats single-query PERCENTILE_CONT approach is efficient.
  • The mkIssue factory eliminating repetition across all issue creation functions is a good refactor.
  • Good test coverage: unit tests for detectSpikeOrDrop, integration pipeline tests, and the new benchmark.
