Locale & timezone faithfulness: serve non-English users and correct local time
Context
The inbox-intelligence extractors lean on English by default — keyword classifiers,
commitment regexes ("I'll", "I will"), and an English-centric date library. A user who
writes and receives mail in another language falls through to near-empty results: no
commitments found, no deadlines, no security flags, a generic briefing. Separately,
relative deadlines ("by Friday", "end of day") can't resolve correctly because there
is no per-user timezone. This spec makes the pipeline faithful to who the user is:
their language and their clock. It unblocks the quality ceiling of specs 02/03/06 for
the non-English and non-UTC population.
Current State
Verified 2026-06-06.
packages/policy-prompts/prompts/briefing-prose/v1.md:30 — the prompt accepts
{{language}}, but apps/worker/src/jobs/briefing-generator.ts:346 passes it
hardcoded 'en', never from a user profile.
packages/db/src/schemas/schema.sql (users table, ~lines 8-16) has no
language, locale, or timezone column (verified: grep for those names returns
nothing on the users table).
packages/decision-engine/src/situation-interpreter.ts:107-192 — classifySituation
matches English-only keywords (meeting, invoice, payment, deadline, etc.).
No locale awareness anywhere in signal classification.
- Planned specs lean English: spec 02's rule fallback regex, spec 06's security
markers, and spec 03's suggested chrono-node are all English-first. The LLM
strategies in those specs can handle other languages, but the deterministic
fallbacks cannot — and nothing routes non-English users to the LLM path on purpose.
Proposed Change
- Add
language and timezone to the user profile, populated from the connector
identity where available (Google profiles expose locale and the calendar carries
a timezone), with a safe default and a settings override.
- Wire the briefing language from the profile instead of the hardcoded
'en'.
- Make extraction locale-aware by routing, not by translating every rule. The
honest, low-cost design: the LLM strategy is the multilingual path; the English
regex/keyword fallbacks are explicitly one locale's fallback. When the content
language is non-English AND only the deterministic fallback is available, the
extractor MUST log that it is running degraded (English rules on non-English text)
rather than silently returning empty — so coverage gaps are visible (feeds spec 13).
- Timezone feeds spec 03 so relative deadlines resolve against the user's clock,
not UTC.
Synthetic illustration (no real data): a commitment "te lo envío mañana" or
"明日送ります" should be caught by the LLM commitment strategy; the English regex
fallback would miss it, and in that case the extractor records a degraded:locale
note instead of a false "no commitments."
Implementation Details
- Migration — add
language STRING and timezone STRING to the users table
(nullable; backfill null). A new packages/db/src/migrations/0NN-user-locale-tz.sql.
- Population — on connector auth / profile sync, capture Google
locale →
language and the primary calendar's timezone → timezone. Fallbacks: language
defaults to 'en', timezone defaults to UTC with a logged warning (never
guess silently). A settings UI override is a thin add (out of scope to design here;
the column + API field is enough).
- Briefing wiring —
briefing-generator.ts:346 reads profile.language ?? 'en'.
- Locale-aware extraction contract — extend the extractor interface (specs
02/03/06, all consuming spec 07's SignalText) with locale derived from
SignalText (detected content language or the user's language). Strategy:
- LLM strategy available → it handles the locale natively (it already does);
pass locale so the prompt can be locale-appropriate.
- Only deterministic fallback available AND locale ≠ supported-rule-locale →
run the fallback but emit a degraded: 'locale' marker on the result and log
it. Do NOT silently return [] as if the content had no commitments/deadlines.
- Detection — use a lightweight language hint (the user's
language, or a small
script/charset heuristic on the body) to decide the route. Do not add a heavy
language-detection dependency for v1; the user-profile language is the primary
signal.
- Timezone to spec 03 —
deadline-extractor reads timezone from the profile
(replacing its UTC fallback) for end-of-day / weekday resolution.
Acceptance Criteria
- The users table has
language and timezone columns; a new user populated from a
non-English Google profile gets the right language and a non-UTC timezone.
- The briefing's
{{language}} reflects the user's profile, not hardcoded 'en'
(verified: a non-English profile yields a non-en value passed to the prompt).
- A non-English commitment/deadline/security fixture is correctly extracted via the
LLM strategy (locale passed through).
- When only the deterministic fallback runs on non-English content, the result
carries a degraded: 'locale' marker and a log line — it does NOT silently return
empty.
- Relative deadlines resolve in the user's timezone (a "by Friday" fixture resolves
differently for a UTC+9 vs UTC-8 profile).
- Missing timezone falls back to UTC WITH a logged warning (never a silent guess).
- English users are unaffected (no regression).
- Tests written and passing. No degradation of existing functionality.
Testing Plan
| Layer |
What |
Count |
| Unit |
Profile population from connector locale/tz; defaults + warnings |
+3 |
| Unit |
Briefing language sourced from profile, not hardcoded |
+2 |
| Unit |
Extractor routing: LLM for non-English; fallback emits degraded |
+3 |
| Unit |
Timezone resolution for relative deadlines across two zones |
+2 |
| Integration |
Non-English fixture → commitments/deadlines extracted end-to-end |
+2 |
| Migration |
Columns added; existing rows backfill null without error |
+1 |
Rollback Plan
Columns are additive/nullable. Briefing language read falls back to 'en' if the
column is null, so reverting is a no-op for existing users. The extraction routing is
flag-guarded (LOCALE_AWARE_EXTRACTION); off → current English-first behavior. No
destructive migration.
Effort Estimate
- Migration + profile population from connectors: ~4h
- Briefing language wiring: ~1h
- Locale-aware routing +
degraded marker + logging: ~4h
- Timezone wiring into spec 03: ~2h
- Tests: ~4h
Total: ~2 days.
Files Reference
| File |
Change |
packages/db/src/migrations/0NN-user-locale-tz.sql |
New: language, timezone columns |
apps/worker/src/jobs/briefing-generator.ts:346 |
Read language from profile |
packages/decision-engine/src/situation-interpreter.ts:107-192 |
Reference (English-keyword classifier) |
| extractors (specs 02/03/06) |
Accept locale; LLM-route non-English; mark degraded on fallback |
| connector auth / profile sync |
Capture locale + timezone |
Out of Scope
- Translating every keyword/regex set into N languages (LLM-first is the strategy;
rule sets stay English as one locale's fallback).
- A full settings UI for language/timezone (column + API field only here).
- Heavy ML language detection (profile language is the primary route signal).
- RTL / locale-specific UI layout (a separate design concern).
Related
- Unblocks non-English quality for specs 02 (commitments), 03 (deadlines, also needs
the timezone), 06 (security markers).
degraded: 'locale' markers feed spec 13's coverage transparency.
- Consumes spec 07's
SignalText (the locale travels on it).
Locale & timezone faithfulness: serve non-English users and correct local time
Context
The inbox-intelligence extractors lean on English by default — keyword classifiers,
commitment regexes ("I'll", "I will"), and an English-centric date library. A user who
writes and receives mail in another language falls through to near-empty results: no
commitments found, no deadlines, no security flags, a generic briefing. Separately,
relative deadlines ("by Friday", "end of day") can't resolve correctly because there
is no per-user timezone. This spec makes the pipeline faithful to who the user is:
their language and their clock. It unblocks the quality ceiling of specs 02/03/06 for
the non-English and non-UTC population.
Current State
Verified 2026-06-06.
packages/policy-prompts/prompts/briefing-prose/v1.md:30— the prompt accepts{{language}}, butapps/worker/src/jobs/briefing-generator.ts:346passes ithardcoded
'en', never from a user profile.packages/db/src/schemas/schema.sql(users table, ~lines 8-16) has nolanguage,locale, ortimezonecolumn (verified: grep for those names returnsnothing on the users table).
packages/decision-engine/src/situation-interpreter.ts:107-192—classifySituationmatches English-only keywords (
meeting,invoice,payment,deadline, etc.).No locale awareness anywhere in signal classification.
markers, and spec 03's suggested
chrono-nodeare all English-first. The LLMstrategies in those specs can handle other languages, but the deterministic
fallbacks cannot — and nothing routes non-English users to the LLM path on purpose.
Proposed Change
languageandtimezoneto the user profile, populated from the connectoridentity where available (Google profiles expose
localeand the calendar carriesa timezone), with a safe default and a settings override.
'en'.honest, low-cost design: the LLM strategy is the multilingual path; the English
regex/keyword fallbacks are explicitly one locale's fallback. When the content
language is non-English AND only the deterministic fallback is available, the
extractor MUST log that it is running degraded (English rules on non-English text)
rather than silently returning empty — so coverage gaps are visible (feeds spec 13).
not UTC.
Synthetic illustration (no real data): a commitment "te lo envío mañana" or
"明日送ります" should be caught by the LLM commitment strategy; the English regex
fallback would miss it, and in that case the extractor records a
degraded:localenote instead of a false "no commitments."
Implementation Details
language STRINGandtimezone STRINGto the users table(nullable; backfill null). A new
packages/db/src/migrations/0NN-user-locale-tz.sql.locale→languageand the primary calendar's timezone →timezone. Fallbacks:languagedefaults to
'en',timezonedefaults to UTC with a logged warning (neverguess silently). A settings UI override is a thin add (out of scope to design here;
the column + API field is enough).
briefing-generator.ts:346readsprofile.language ?? 'en'.02/03/06, all consuming spec 07's
SignalText) withlocalederived fromSignalText(detected content language or the user'slanguage). Strategy:pass
localeso the prompt can be locale-appropriate.run the fallback but emit a
degraded: 'locale'marker on the result andlogit. Do NOT silently return
[]as if the content had no commitments/deadlines.language, or a smallscript/charset heuristic on the body) to decide the route. Do not add a heavy
language-detection dependency for v1; the user-profile language is the primary
signal.
deadline-extractorreadstimezonefrom the profile(replacing its UTC fallback) for end-of-day / weekday resolution.
Acceptance Criteria
languageandtimezonecolumns; a new user populated from anon-English Google profile gets the right
languageand a non-UTCtimezone.{{language}}reflects the user's profile, not hardcoded'en'(verified: a non-English profile yields a non-
envalue passed to the prompt).LLM strategy (locale passed through).
carries a
degraded: 'locale'marker and a log line — it does NOT silently returnempty.
differently for a UTC+9 vs UTC-8 profile).
Testing Plan
degradedRollback Plan
Columns are additive/nullable. Briefing language read falls back to
'en'if thecolumn is null, so reverting is a no-op for existing users. The extraction routing is
flag-guarded (
LOCALE_AWARE_EXTRACTION); off → current English-first behavior. Nodestructive migration.
Effort Estimate
degradedmarker + logging: ~4hTotal: ~2 days.
Files Reference
packages/db/src/migrations/0NN-user-locale-tz.sqllanguage,timezonecolumnsapps/worker/src/jobs/briefing-generator.ts:346packages/decision-engine/src/situation-interpreter.ts:107-192locale; LLM-route non-English; markdegradedon fallbackOut of Scope
rule sets stay English as one locale's fallback).
Related
the timezone), 06 (security markers).
degraded: 'locale'markers feed spec 13's coverage transparency.SignalText(the locale travels on it).