Skip to content

Stale-while-revalidate for the workflow detail fetch#3540

Merged
vegaro merged 7 commits into
mainfrom
cesar/workflow-detail-swr
Jun 8, 2026
Merged

Stale-while-revalidate for the workflow detail fetch#3540
vegaro merged 7 commits into
mainfrom
cesar/workflow-detail-swr

Conversation

@vegaro

@vegaro vegaro commented Jun 5, 2026

Copy link
Copy Markdown
Member

Linear https://linear.app/revenuecat/issue/WFL-281/stale-while-revalidate-for-the-workflow-detail-fetch

Stacked on #3537.

I noticed that in OfferingsManager we do stale-while-revalidate serving (vendCachedOfferingsAndMaybeRefresh). Meaning that we serve stale data while fetching in the background more fresh offerings. We were not doing that for workflows, so there was a change in behavior. Before this PR a "stale" cache was a "miss" that blocked the paywall on a full backend round-trip.

To understand it better, I've generated these diagrams:

How offerings already behaves

When a cached value exists it is always served right away; staleness only decides whether a background refresh is also kicked off. The caller never waits on the network when the cache has something to give.

Captura de pantalla 2026-06-05 a las 8 14 17

Workflows before this PR

getWorkflow only short-circuited on a fresh hit. A cached-but-stale workflow fell through to the same path as a cache miss and blocked the render on a full refetch, even though a perfectly usable workflow was sitting in the cache.

Captura de pantalla 2026-06-05 a las 8 15 25

Workflows after this PR

Now getWorkflow serves the cached value immediately, then fire a background refresh that delivers no callbacks (a success just updates the cache, a failure is logged and swallowed).

Captura de pantalla 2026-06-05 a las 8 16 53
📋 Design context (for reviewers and AI agents)

Why

This is follow-up #3 from the workflow-detail-envelope-persistence decision. It is a different layer from the backend-down recovery work in #3537 underneath it: this is about normal in-memory serving, not disk fallback. The goal is parity with OfferingsManager, which has always done stale-while-revalidate. The parity question applies to any SDK that caches workflow details (iOS once it has the cache layer — it is a stage behind; no iOS action here).

The change

getWorkflow is rewritten from a two-way (fresh → serve, everything-else → block-and-fetch) into a three-way branch mirroring vendCachedOfferingsAndMaybeRefresh:

Cache state Behavior Changed?
Fresh hit serve cached, no network unchanged
Stale hit serve cached immediately via onSuccess, then fire a background refresh that updates the cache only — no caller onSuccess/onError; a failed refresh is logged and swallowed this PR
Miss fetch from backend, deliver via onSuccess/onError unchanged

Mechanically: the fetch + resolve + cache + callback block was extracted into a private fetchAndCacheWorkflow(...) (pure extraction, no behavior change), and getWorkflow now dispatches into it from the three branches.

Prefetch carve-out — the one deliberate exception

The split is expressed as a single parameter: getWorkflow(..., staleWhileRevalidate: Boolean = true). Fresh and miss are identical across callers, so one boolean captures the difference cleanly (it only affects the stale-present case).

  • Render / on-demand path (PurchasesOrchestrator.getWorkflow): staleWhileRevalidate = true (default).
  • Prefetch path (prefetchWorkflow): staleWhileRevalidate = false.

Prefetch must NOT adopt SWR. Its job is to force a real refresh and persist the detail envelope (the #3537 backend-down-recovery feature). If prefetch served stale + background-refreshed, its stale case would stop blocking-and-persisting and weaken that guarantee. So prefetch keeps today's blocking-fetch-and-persist behavior on a stale workflow.

The background refresh inherits the caller's persistEnvelopeOnResolve. On the render path that is false, so the SWR background refresh does not persist — keeping this work cleanly out of the scope of the on-demand-persistence follow-up (#1).

No new dedup

Two concurrent stale render calls can each fire a background refresh. This matches offerings — fetchAndCacheOfferings on the stale path isn't deduped either — so it is parity-acceptable. Documented in a code comment rather than guarded.

Test coverage

  • stale hit serves the cached value before the background refresh resolves
  • stale hit fires exactly one background refresh that updates the cache
  • stale hit does not surface a failing background refresh to the caller
  • prefetch path blocks and persists on a stale workflow (no SWR)
  • the existing "re-fetches when cache is stale" test stays green (backend still called), with intent/assertions updated to reflect that the stale value is now served first

Note

Medium Risk
Changes paywall load timing when workflow cache is stale (immediate render vs blocking fetch); prefetch path is explicitly preserved but behavior diverges by caller.

Overview
getWorkflow now matches offerings-style stale-while-revalidate: a fresh hit is unchanged; a stale in-memory hit returns the cached workflow immediately and kicks off a background refetch that only updates the cache (failures are logged, not delivered to the caller); a miss still blocks on the network.

The fetch/resolve/cache path is factored into fetchAndCacheWorkflow. A new staleWhileRevalidate flag (default true for on-demand paywall loads) controls the stale branch; prefetchWorkflow passes false so prefetch still blocks, refetches, and persists detail envelopes when the cache is stale.

Tests cover immediate stale serving, background cache updates, swallowed background errors, blocking behavior when SWR is off, and stale prefetch persistence. WorkflowsCache prune docs are clarified only.

Reviewed by Cursor Bugbot for commit 8509e19. Bugbot is set up for automated code reviews on this repo. Configure here.

@vegaro vegaro added the pr:other label Jun 5, 2026
@codecov

codecov Bot commented Jun 5, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80.18%. Comparing base (a9acd9f) to head (8509e19).

Files with missing lines Patch % Lines
...ecat/purchases/common/workflows/WorkflowManager.kt 96.15% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3540      +/-   ##
==========================================
+ Coverage   80.15%   80.18%   +0.02%     
==========================================
  Files         371      371              
  Lines       15210    15233      +23     
  Branches     2110     2111       +1     
==========================================
+ Hits        12192    12214      +22     
- Misses       2168     2169       +1     
  Partials      850      850              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@vegaro vegaro force-pushed the cesar/atlanta-v1 branch from b7b33bf to bc39f94 Compare June 5, 2026 06:08
@vegaro vegaro force-pushed the cesar/workflow-detail-swr branch from d595495 to 8e422d6 Compare June 5, 2026 06:08

vegaro commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

* linger on disk.
*/
@Synchronized
private fun pruneWorkflowDetailEnvelopesToList(workflowIds: Set<String>) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be worth a note here saying the store is per-API-key and not per-user.

I guess its not a big deal, but for any PR about this topic makes me wonder if we should be more defensive.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the store is per-API-key, not per-user, but that’s safe here. IdentityManager calls workflowsCache.clearCache() on every logIn/logOut transition, which wipes the entire disk envelope store before the new user’s data lands. So the store always reflects the current user.

Looks like there's one existing race (a prefetch completing after the clear can repopulate the store with old data). But it looks like it's the same we have in offerings, so maybe something we can tackle in a future PR?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

onSuccess: (WorkflowDataResult) -> Unit,
onError: (PurchasesError) -> Unit,
callbackDispatcher: Dispatcher? = null,
persistEnvelopeOnResolve: Boolean = false,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why default false?

@vegaro vegaro Jun 8, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kdoc explains the reasoning: on-demand fetches (not automatically prefetched) skip persistence to avoid unbounded disk growth. A user could open many distinct paywalls that are not prefetched, and there's no LRU cap today (will come in a future PR). The idea is to only store workflows that are prefetched.

Comment on lines +170 to +171
* bounded by wholesale-replacing its single response blob: the persisted set always equals what
* the latest backend response says exists, so workflows the backend stopped sending don't

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only only prefetch = true workflows that successfully resolved are ever persisted, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. Only cacheWorkflowDetailEnvelope writes to this store, and it is called only from the prefetch path after a successful resolve. Updated the pruneWorkflowDetailEnvelopesToList kdoc to make that invariant explicit.

Base automatically changed from cesar/atlanta-v1 to main June 8, 2026 10:47
vegaro and others added 7 commits June 8, 2026 12:52
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tale prefetch still persists

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vegaro vegaro force-pushed the cesar/workflow-detail-swr branch from 8e422d6 to 8509e19 Compare June 8, 2026 10:54
@vegaro vegaro added this pull request to the merge queue Jun 8, 2026
Merged via the queue into main with commit 689b9e7 Jun 8, 2026
42 checks passed
@vegaro vegaro deleted the cesar/workflow-detail-swr branch June 8, 2026 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants