feat: lazy per-route cache seeding for Workers #653

NathanDrake2406 wants to merge 3 commits into cloudflare:main
Conversation
Force-pushed from 3e588f6 to 1b23050
- Only seed routes with `router: "app"` — Pages Router uses different keys (`pages:...`) and value shape (`kind: "PAGES"`), so seeding them as APP_PAGE entries would be unreachable at runtime.
- Routes without a `router` field are skipped defensively (pre-cloudflare#653 manifests lack this field, but also lack `buildId` so populateKV would already skip).
- Remove `--app-prefix` CLI flag — it writes prefixed keys without wiring the same prefix into the generated runtime, creating silently unreadable cache entries. Keep `appPrefix` on the programmatic `PopulateKVOptions` for advanced users who configure both sides manually.
On Cloudflare Workers with `MemoryCacheHandler` (the default when KV is not configured), pre-rendered routes now seed the memory cache on first request via the assets binding instead of triggering a full re-render.

- Add seed-cache-workers.ts with lazy per-route seeding, manifest caching per isolate, and concurrent request dedup
- Copy prerender files to dist/client/__prerender/ during deploy so they're available via `env.ASSETS.fetch()`
- Add seeding call to generated App Router worker entry
- 9 unit tests covering seeding, dedup, manifest caching, and graceful degradation
The Node.js and Workers seed-cache modules duplicated manifest types, revalidateCtx, and cache value construction. Extract to seed-cache-shared.ts to prevent drift between the two strategies. Also fixes stale JSDoc referencing "manifestPromise" (renamed to "loadedPromise").
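As an illustration of that extraction, the shared surface might look like the sketch below. This is an assumption-laden outline: only `makeRscCacheValue` and the manifest fields quoted elsewhere in this PR (`buildId`, `router`, `status`, `kind`) come from the source; every other field and shape is hypothetical.

```typescript
// Hypothetical sketch of seed-cache-shared.ts: types and helpers that
// both the Node.js and Workers seeding strategies would import.
// Field sets are illustrative; only kind/buildId/router/status appear in the PR.
export interface PrerenderManifestRoute {
  route: string;
  path?: string;
  status: string;
  router?: "app" | "pages";
}

export interface PrerenderManifest {
  buildId: string;
  routes: PrerenderManifestRoute[];
}

export interface CachedAppPageValue {
  kind: "APP_PAGE";
  rscData: ArrayBuffer;
}

/** Build an APP_PAGE cache value for an RSC entry (shape assumed). */
export function makeRscCacheValue(rscData: ArrayBuffer): CachedAppPageValue {
  return { kind: "APP_PAGE", rscData };
}
```

Keeping the manifest types and cache-value constructors in one module means a future manifest field lands in both strategies at once.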
Force-pushed from 1b23050 to 8011d47
/bigbonk review
Good PR — the lazy per-route seeding approach is well-suited for Workers' ephemeral isolate model, and the shared extraction from seed-cache.ts is a clean refactor. The test coverage is solid. A few issues to address, one of which could cause request failures in production.
Summary of findings:
- P0 (error resilience): `doSeedRoute` has no try/catch, and the generated worker entry `await`s without a safety net — a transient `env.ASSETS.fetch()` failure or `handler.set()` error would crash the request instead of gracefully degrading.
- P1 (permanent null caching): `loadManifest` permanently caches `null` on transient fetch failure — the isolate can never seed any route after a single hiccup.
- Minor: `copyPrerenderToAssets` flattens the manifest into the same directory as the route files, which works but could cause filename collisions on a route literally named `/vinext-prerender` (extremely unlikely, but worth noting).
```ts
  baseKey,
  revalidateSeconds,
  fetchAsset,
).finally(() => seedInFlight.delete(htmlKey));
```
Error resilience: doSeedRoute can throw on fetchAsset() network errors, handler.set() failures, etc. Since the generated worker entry (line 503 of deploy.ts) does await seedRouteFromAssets(...) without try/catch, any throw here becomes an unhandled error that crashes the request with a 500.
Seeding is a best-effort optimization — a failure should log and degrade gracefully, not break the request. Wrap this in try/catch:
```suggestion
const promise = doSeedRoute(
  loaded.manifest,
  pathname,
  baseKey,
  revalidateSeconds,
  fetchAsset,
).catch(() => {/* seeding is best-effort */}).finally(() => seedInFlight.delete(htmlKey));
```
Alternatively (and perhaps better), wrap the entire seedRouteFromAssets body in try/catch so that any unexpected error — including from getCacheHandler().get() — is swallowed. The loadManifest function already follows this pattern.
```ts
  return { manifest, lookup };
} catch {
  return null;
```
Permanent null caching on transient failure: If fetchAsset("/__prerender/vinext-prerender.json") fails transiently (e.g., a brief ASSETS binding hiccup, timeout), loadedPromise is permanently set to a promise that resolves to null. Every subsequent call to seedRouteFromAssets in this isolate will get null and no-op — the isolate can never recover.
Since the manifest is immutable per deployment, a permanent cache of the successful result is correct. But caching the failure permanently is not — a retry on the next request would likely succeed.
Consider only caching successful results:
```suggestion
loadedPromise = (async () => {
  try {
    const res = await fetchAsset("/__prerender/vinext-prerender.json");
    if (!res.ok) return null;
    const manifest: PrerenderManifest = await res.json();
    if (!manifest.buildId || !Array.isArray(manifest.routes)) return null;
    const lookup = new Map<string, PrerenderManifestRoute>();
    for (const route of manifest.routes) {
      if (route.status !== "rendered") continue;
      if (route.router !== "app") continue;
      lookup.set(route.path ?? route.route, route);
    }
    return { manifest, lookup };
  } catch {
    loadedPromise = null; // allow retry on next request
    return null;
  }
})();
```
Note: the !res.ok and validation-failure cases are arguably permanent (malformed deployment), so you could choose to only reset loadedPromise in the catch branch. Either way, the catch case (network/timeout) should not be permanently cached.
```ts
const sourceDir = path.join(serverDir, "prerendered-routes");
if (!fs.existsSync(sourceDir)) return;
// …
fs.cpSync(sourceDir, targetDir, { recursive: true });
```
Nit: fs.cpSync(sourceDir, targetDir, { recursive: true }) copies the contents of sourceDir into targetDir where targetDir already exists. This means vinext-prerender.json (copied on line 1219) and route files (e.g., about.html) all live at the same level under __prerender/. This works, but means a pre-rendered route literally named /vinext-prerender (producing vinext-prerender.html) would coexist fine — just calling it out.
A more robust layout would be __prerender/manifest.json + __prerender/routes/..., but that's a future cleanup, not a blocker.
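That future layout could be sketched roughly as follows. Everything here beyond `fs.cpSync` and the `prerendered-routes` source directory is an assumption: the `manifest.json` / `routes/` names and the manifest's source location are the suggested layout, not the PR's actual paths.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical collision-proof variant: route files go under
// __prerender/routes/ while the manifest sits alone as
// __prerender/manifest.json, so a route named /vinext-prerender
// can never share a directory with the manifest.
export function copyPrerenderToAssetsV2(serverDir: string, clientDir: string): void {
  const sourceDir = path.join(serverDir, "prerendered-routes");
  if (!fs.existsSync(sourceDir)) return;

  const targetDir = path.join(clientDir, "__prerender");
  fs.cpSync(sourceDir, path.join(targetDir, "routes"), { recursive: true });

  const manifestSrc = path.join(serverDir, "vinext-prerender.json"); // assumed location
  if (fs.existsSync(manifestSrc)) {
    fs.copyFileSync(manifestSrc, path.join(targetDir, "manifest.json"));
  }
}
```

The runtime would then fetch `/__prerender/manifest.json` and `/__prerender/routes/...`, so no route name can ever shadow the manifest.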
| } | ||
|
|
||
| /** Build an APP_PAGE cache value for an RSC entry. */ | ||
| export function makeRscCacheValue(rscData: ArrayBuffer): CachedAppPageValue { |
There was a problem hiding this comment.
The rscData parameter is typed as ArrayBuffer, which matches the CachedAppPageValue field type. But in seed-cache.ts:142-145, the Node.js codepath does rscBuffer.buffer.slice(byteOffset, byteOffset + byteLength) to extract a clean ArrayBuffer from the Node Buffer. Worth verifying that the ArrayBuffer from Response.arrayBuffer() in the Workers path (line 179) doesn't need similar treatment — in practice Workers' Response.arrayBuffer() returns a standalone ArrayBuffer with byteOffset === 0, so this should be fine. Just flagging the asymmetry.
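For reference, the Node-side normalization pattern being described looks like this generic helper — an illustration of the technique, not the PR's exact code:

```typescript
// A Node Buffer is a view over a possibly pooled, larger ArrayBuffer,
// so buf.buffer alone can contain unrelated bytes from other Buffers.
// Slicing [byteOffset, byteOffset + byteLength) copies out exactly the
// bytes this Buffer owns, yielding a clean standalone ArrayBuffer.
export function toStandaloneArrayBuffer(buf: Buffer): ArrayBuffer {
  return buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength) as ArrayBuffer;
}
```

The Workers path skips this because `Response.arrayBuffer()` already returns an unpooled buffer, which is exactly the asymmetry flagged above.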
```ts
// On first request for a pre-rendered page, fetches HTML/RSC from the
// assets binding and inserts into the cache. Subsequent requests in the
// same isolate get cache HITs. No-op for non-prerendered routes.
await seedRouteFromAssets(url.pathname, (p) => env.ASSETS.fetch(new Request(new URL(p, request.url))));
```
This await is on the critical path of every request — it blocks the response until seeding completes for pre-rendered routes. For the first cold request this adds the latency of two env.ASSETS.fetch() calls (HTML + RSC) plus two handler.set() calls.
This is probably acceptable since env.ASSETS.fetch() is local to the colo and fast, but worth a comment noting that this is intentionally blocking (to guarantee the cache is populated before the RSC handler runs). An alternative would be to let the RSC handler serve the request normally while seeding in the background via ctx.waitUntil(), but that would mean the first request always re-renders — so blocking is the right call here.
Also: should this be wrapped in try/catch as defense-in-depth? If seedRouteFromAssets throws (see other comment), the request dies.
```suggestion
// Seed the memory cache from pre-rendered assets (lazy, per-route).
// On first request for a pre-rendered page, fetches HTML/RSC from the
// assets binding and inserts into the cache. Subsequent requests in the
// same isolate get cache HITs. No-op for non-prerendered routes.
// Blocking (not waitUntil) so the RSC handler sees the seeded cache.
try {
  await seedRouteFromAssets(url.pathname, (p) => env.ASSETS.fetch(new Request(new URL(p, request.url))));
} catch { /* seeding is best-effort — fall through to normal render */ }
```
```ts
// ── Concurrent dedup ──────────────────────────────────────────────────────
// …
it("deduplicates concurrent seed requests for the same route", async () => {
```
Good test, but the dedup assertion depends on the mock fetcher resolving synchronously (via microtask). In a real Workers environment, env.ASSETS.fetch() would yield to the event loop, and the timing of when the second caller hits the seedInFlight.get() check vs the getCacheHandler().get() check could differ.
Consider adding a test variant where the mock fetcher introduces a small delay (e.g., a Promise that resolves after a setTimeout(0)) to verify the dedup still works when fetches genuinely yield. This would give more confidence in the concurrency model.
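To make the timing concern concrete, here is a self-contained model of the in-flight dedup with a fetcher that genuinely yields to the event loop. The names (`seedInFlight`, `seedRoute`) mirror the PR's vocabulary, but this is an illustrative model, not the module under test:

```typescript
// Minimal model of concurrent seed dedup: overlapping callers for the
// same key share one promise; the fake fetch resolves via setTimeout(0),
// so it yields to the event loop like a real env.ASSETS.fetch() would.
const seedInFlight = new Map<string, Promise<void>>();
let fetchCount = 0;

function fakeFetchAsset(): Promise<void> {
  fetchCount++;
  return new Promise((resolve) => setTimeout(resolve, 0));
}

function seedRoute(key: string): Promise<void> {
  const existing = seedInFlight.get(key);
  if (existing) return existing; // join the in-flight seed
  const promise = fakeFetchAsset()
    .catch(() => { /* best-effort */ })
    .finally(() => seedInFlight.delete(key));
  seedInFlight.set(key, promise);
  return promise;
}

// Two overlapping callers, one underlying fetch.
await Promise.all([seedRoute("/about"), seedRoute("/about")]);
console.log(fetchCount); // 1
// The entry is cleared after settlement, so a later request re-seeds.
await seedRoute("/about");
console.log(fetchCount); // 2
```

The same shape, with the delay injected into the mock fetcher, is what the suggested test variant would assert.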
Review submitted successfully. Here's a summary of the feedback I posted.
… seeding

- seedRouteFromAssets never throws — errors degrade gracefully instead of crashing the request with a 500
- doSeedRoute failures are caught via .catch() on the shared promise, protecting both the originator and joiners via seedInFlight
- loadManifest resets loadedPromise on transient failure (network error, timeout) so the isolate can retry on the next request; permanent failures (404, invalid JSON) remain cached
- Added tests: error resilience (fetch failure, cache write failure), transient retry, permanent 404 caching, delayed async dedup
Summary

On Cloudflare Workers with `MemoryCacheHandler` (the default when KV is not configured), pre-rendered routes now lazily seed the memory cache on first request via `env.ASSETS.fetch()`.

Pre-requisite

Depends on #645 for the manifest format changes (`buildId`, `router`, `trailingSlash` fields).

How it works

- Deploy (`deploy.ts`): After prerendering, copies `vinext-prerender.json` and HTML/RSC files to `dist/client/__prerender/` so they're deployed as static assets
- Runtime (`seed-cache-workers.ts`): On each request, checks if the route has prerendered data. On first miss, fetches HTML/RSC from the assets binding and populates `MemoryCacheHandler`. Subsequent requests in the same isolate get cache HITs
- The generated worker entry calls `seedRouteFromAssets()` before delegating to the RSC handler

Changes

- `src/server/seed-cache-workers.ts`
- `src/deploy.ts`
- `package.json` (exports `./server/seed-cache-workers`)
- `tests/seed-cache-workers.test.ts`

Scope

This is Workers memory stub support only, not the final durable caching story. For persistent caching, users should use `KVCacheHandler` (opt-in) with TPR for deploy-time KV population. Issue #562 tracks the generalised remote cache seeding.

Ref #561

Test plan