Failing test suites
Commit: 72af8ff
● server-components-hmr-cache › should support reading from an infinite streaming fetch
Stats from current PR
✅ No significant changes detected
📦 Bundle Sizes
⚡ Turbopack
Client Main Bundles: **402 kB** → **402 kB**
| | Canary | PR | Change |
|---|---|---|---|
| middleware-b..fest.js gzip | 766 B | 767 B | ✓ |
| Total | 766 B | 767 B |
Build Details
Build Manifests
| | Canary | PR | Change |
|---|---|---|---|
| _buildManifest.js gzip | 446 B | 450 B | ✓ |
| Total | 446 B | 450 B |
📦 Webpack
Client
Main Bundles
| | Canary | PR | Change |
|---|---|---|---|
| 5528-HASH.js gzip | 5.54 kB | N/A | - |
| 6280-HASH.js gzip | 59.4 kB | N/A | - |
| 6335.HASH.js gzip | 169 B | N/A | - |
| 912-HASH.js gzip | 4.59 kB | N/A | - |
| e8aec2e4-HASH.js gzip | 62.6 kB | N/A | - |
| framework-HASH.js gzip | 59.7 kB | 59.7 kB | ✓ |
| main-app-HASH.js gzip | 255 B | 254 B | ✓ |
| main-HASH.js gzip | 39.1 kB | 39.1 kB | ✓ |
| webpack-HASH.js gzip | 1.68 kB | 1.68 kB | ✓ |
| 262-HASH.js gzip | N/A | 4.59 kB | - |
| 2889.HASH.js gzip | N/A | 169 B | - |
| 5602-HASH.js gzip | N/A | 5.55 kB | - |
| 6948ada0-HASH.js gzip | N/A | 62.6 kB | - |
| 9544-HASH.js gzip | N/A | 60.2 kB | - |
| Total | 233 kB | 234 kB |
Polyfills
| | Canary | PR | Change |
|---|---|---|---|
| polyfills-HASH.js gzip | 39.4 kB | 39.4 kB | ✓ |
| Total | 39.4 kB | 39.4 kB | ✓ |
Pages
| | Canary | PR | Change |
|---|---|---|---|
| _app-HASH.js gzip | 194 B | 194 B | ✓ |
| _error-HASH.js gzip | 183 B | 180 B | 🟢 3 B (-2%) |
| css-HASH.js gzip | 331 B | 330 B | ✓ |
| dynamic-HASH.js gzip | 1.81 kB | 1.81 kB | ✓ |
| edge-ssr-HASH.js gzip | 256 B | 256 B | ✓ |
| head-HASH.js gzip | 351 B | 352 B | ✓ |
| hooks-HASH.js gzip | 384 B | 383 B | ✓ |
| image-HASH.js gzip | 580 B | 581 B | ✓ |
| index-HASH.js gzip | 260 B | 260 B | ✓ |
| link-HASH.js gzip | 2.51 kB | 2.51 kB | ✓ |
| routerDirect..HASH.js gzip | 320 B | 319 B | ✓ |
| script-HASH.js gzip | 386 B | 386 B | ✓ |
| withRouter-HASH.js gzip | 315 B | 315 B | ✓ |
| 1afbb74e6ecf..834.css gzip | 106 B | 106 B | ✓ |
| Total | 7.98 kB | 7.98 kB | ✅ -1 B |
Server
Edge SSR
| | Canary | PR | Change |
|---|---|---|---|
| edge-ssr.js gzip | 125 kB | 125 kB | ✓ |
| page.js gzip | 256 kB | 256 kB | ✓ |
| Total | 380 kB | 381 kB |
Middleware
| | Canary | PR | Change |
|---|---|---|---|
| middleware-b..fest.js gzip | 618 B | 617 B | ✓ |
| middleware-r..fest.js gzip | 156 B | 155 B | ✓ |
| middleware.js gzip | 43.6 kB | 43.9 kB | ✓ |
| edge-runtime..pack.js gzip | 842 B | 842 B | ✓ |
| Total | 45.2 kB | 45.5 kB |
Build Details
Build Manifests
| | Canary | PR | Change |
|---|---|---|---|
| _buildManifest.js gzip | 715 B | 718 B | ✓ |
| Total | 715 B | 718 B |
Build Cache
| | Canary | PR | Change |
|---|---|---|---|
| 0.pack gzip | 4.07 MB | 4.07 MB | ✓ |
| index.pack gzip | 103 kB | 102 kB | ✓ |
| index.pack.old gzip | 103 kB | 103 kB | ✓ |
| Total | 4.27 MB | 4.28 MB |
🔄 Shared (bundler-independent)
Runtimes
| | Canary | PR | Change |
|---|---|---|---|
| app-page-exp...dev.js gzip | 322 kB | 322 kB | ✓ |
| app-page-exp..prod.js gzip | 171 kB | 171 kB | ✓ |
| app-page-tur...dev.js gzip | 322 kB | 322 kB | ✓ |
| app-page-tur..prod.js gzip | 171 kB | 171 kB | ✓ |
| app-page-tur...dev.js gzip | 318 kB | 318 kB | ✓ |
| app-page-tur..prod.js gzip | 169 kB | 169 kB | ✓ |
| app-page.run...dev.js gzip | 319 kB | 319 kB | ✓ |
| app-page.run..prod.js gzip | 169 kB | 169 kB | ✓ |
| app-route-ex...dev.js gzip | 70.9 kB | 70.9 kB | ✓ |
| app-route-ex..prod.js gzip | 49.3 kB | 49.3 kB | ✓ |
| app-route-tu...dev.js gzip | 70.9 kB | 70.9 kB | ✓ |
| app-route-tu..prod.js gzip | 49.3 kB | 49.3 kB | ✓ |
| app-route-tu...dev.js gzip | 70.5 kB | 70.5 kB | ✓ |
| app-route-tu..prod.js gzip | 49 kB | 49 kB | ✓ |
| app-route.ru...dev.js gzip | 70.4 kB | 70.4 kB | ✓ |
| app-route.ru..prod.js gzip | 49 kB | 49 kB | ✓ |
| dist_client_...dev.js gzip | 324 B | 324 B | ✓ |
| dist_client_...dev.js gzip | 326 B | 326 B | ✓ |
| dist_client_...dev.js gzip | 318 B | 318 B | ✓ |
| dist_client_...dev.js gzip | 317 B | 317 B | ✓ |
| pages-api-tu...dev.js gzip | 43.2 kB | 43.2 kB | ✓ |
| pages-api-tu..prod.js gzip | 32.9 kB | 32.9 kB | ✓ |
| pages-api.ru...dev.js gzip | 43.2 kB | 43.2 kB | ✓ |
| pages-api.ru..prod.js gzip | 32.9 kB | 32.9 kB | ✓ |
| pages-turbo....dev.js gzip | 52.6 kB | 52.6 kB | ✓ |
| pages-turbo...prod.js gzip | 38.5 kB | 38.5 kB | ✓ |
| pages.runtim...dev.js gzip | 52.6 kB | 52.6 kB | ✓ |
| pages.runtim..prod.js gzip | 38.5 kB | 38.5 kB | ✓ |
| server.runti..prod.js gzip | 62 kB | 62 kB | ✓ |
| Total | 2.84 MB | 2.84 MB |
📎 Tarball URL
https://vercel-packages.vercel.app/next/commits/7f829743dc4590639419b8c9d403eae09cd737d7/next
e5c9cae to a5a1ed2
we can iterate on this, but this can definitely cause spurious failures when we have rust changes on canary that weren't published yet, so we're gonna have to deal with it at some point
run-evals.js
Outdated
| const flags = argv.filter((a) => a.startsWith('-')) | ||
| const positional = argv.filter((a) => !a.startsWith('-')) |
can we please use some argument parsing package instead of this
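For reference, one dependency-free option is Node's built-in `util.parseArgs`. The sketch below is only an illustration of that approach: the `parseEvalArgs` name and the `--dry` and `--model` flags are hypothetical and are not taken from `run-evals.js`.

```javascript
const { parseArgs } = require('node:util')

// Hypothetical helper: splits argv into named flags and positional eval names.
function parseEvalArgs(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: {
      // Illustrative flags only; the real runner's flags may differ.
      dry: { type: 'boolean', default: false },
      model: { type: 'string' },
    },
    // Eval names are passed positionally, so positionals must be allowed.
    allowPositionals: true,
  })
  return { flags: values, positional: positionals }
}

module.exports = { parseEvalArgs }
```

With the default `strict: true`, an unknown flag throws instead of being silently treated as a flag or a positional. If the runner needs to forward arbitrary flags to a child process, passing `strict: false` would let unknown options through.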
| Then edit three files: | ||
| **`PROMPT.md`** — what you'd type into the agent. Write it like a real user would: describe the symptom or goal, not the API. "Navigating from `/a` to `/b` is slow, fix it" is a good prompt. "Use `unstable_instant`" is not — you're testing whether the agent understands the feature well enough to reach for it, not whether it can pattern-match a name you handed it. | ||
| **`EVAL.ts`** — vitest assertions against files the agent wrote. Regex the source, don't run it. |
this whole convention seems to come from @vercel-labs/agent-eval, which isn't mentioned anywhere in this README. for someone like me who hasn't worked with this stuff at all, it's not even clear that that's what we're doing without reading the runner code. perhaps this README should mention that this is what we're using, and link to the docs for @vercel-labs/agent-eval in addition to inlining the relevant parts here?
Agreed — the README explains the eval convention (PROMPT.md, EVAL.ts, fixture dirs) without mentioning that it's all driven by @vercel/agent-eval. Adding a brief "How it works" section that names the package, links to its docs, and explains the relationship between the generated experiments and the runner would make this much more approachable for someone encountering it for the first time.
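To make the "regex the source, don't run it" convention from the quoted README concrete, here is a minimal sketch of the kind of check an `EVAL.ts` might wrap in a vitest assertion. It is written as a plain function so it stands alone; the `usesInstantApi` name is hypothetical, and checking for the `unstable_instant` identifier is an assumption based on the API name mentioned in the README excerpt, not a copy of any real eval.

```javascript
// Hypothetical check: `source` would be the text of a file the agent wrote.
function usesInstantApi(source) {
  // Crudely strip block and line comments so a name that only appears
  // in a comment does not count as using the API.
  const code = source
    .replace(/\/\*[\s\S]*?\*\//g, '')
    .replace(/\/\/.*$/gm, '')
  // Match the identifier as a whole word, without executing anything.
  return /\bunstable_instant\b/.test(code)
}

module.exports = { usesInstantApi }
```

In a real eval the source string would presumably be read from disk and the boolean passed to a vitest `expect`. The comment stripping is deliberately naive (it would also cut a `//` inside a string literal), which is acceptable for a presence check but not for anything stricter.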
evals/README.md
Outdated
| ## Running without Vercel sandbox access | ||
| If you don't have Vercel credentials, the runner falls back to local Docker. Have Docker running and provide your own model key in `.env.local` at the repo root: |
similar to above: inlining docs is nice, but reference links are nicer https://github.com/vercel-labs/agent-eval#direct-api-keys-no-vercel-account-required
as in, it'd be good to mention which part of this setup is the "runner" here. i went looking for a dockerfile in the nextjs repo because i didn't know who's doing that
.config/eslintignore.mjs
Outdated
| // Eval fixtures are deliberately imperfect code for agents to fix; EVAL.ts | ||
| // uses vitest (not jest) and comes from an external repo. | ||
| 'evals/evals/**/*', |
what does jest have to do with anything? this is an eslint config. and EVAL.ts is, uh, not in an external repo! anyway, we should still be linting EVAL.ts because that's not part of the "imperfect" code, no?
Fixtures now live next to the code they test, like e2e. `pnpm eval <name>` packs the local `next` build, generates baseline + agents-md experiment configs on the fly, and runs both in a sandbox. The agents-md variant drops an `AGENTS.md` that points the agent at the bundled docs in `node_modules/next/dist/docs/` — comparing the two variants tells you whether shipping a doc actually changes agent behavior. `run-evals.js` mirrors `run-tests.js`: pack once, pass the tarball path to the child via `NEXT_EVAL_TARBALL` env, forward flags. We only pack `next`, not the whole workspace — the sandbox is remote Linux, so a local `@next/swc` darwin binary wouldn't run there anyway; the sandbox downloads the right one at runtime. The experiment config uses `sandbox: 'auto'`, which picks Vercel sandboxes when credentials are present and falls back to local Docker otherwise, so external contributors can run the same evals with just Docker + `ANTHROPIC_API_KEY`. `experiments/` is generated fresh each run and gitignored so we don't maintain N committed config files that differ by one line. Fixture code is excluded from eslint since it's deliberately imperfect code for agents to fix, and `EVAL.ts` uses vitest rather than jest. Fixture `package.json` files use `"next": "^16"` rather than a pinned canary so agents reading `package.json` to infer capabilities aren't misled by a stale version string; the tarball install overlays it regardless. next-evals-oss stays as the full benchmark runner for nextjs.org/evals; it'll pull fixtures from here instead of keeping its own copy.
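The "pack once, pass the tarball path to the child via `NEXT_EVAL_TARBALL` env, forward flags" flow described above could look roughly like the following. This is a sketch under stated assumptions, not the actual `run-evals.js`: the `buildChildInvocation` helper and the `evals/run.js` child script path are hypothetical, and only the `NEXT_EVAL_TARBALL` variable name comes from the description.

```javascript
const path = require('node:path')

// Hypothetical helper: describes how the parent would launch the child
// runner after packing `next` once.
function buildChildInvocation(tarballPath, argv, baseEnv = process.env) {
  return {
    // Reuse the Node binary that is running the parent script.
    command: process.execPath,
    // 'evals/run.js' is a placeholder child script; argv is forwarded as-is.
    args: [path.join('evals', 'run.js'), ...argv],
    options: {
      stdio: 'inherit',
      // The child reads the packed tarball location from the environment.
      env: { ...baseEnv, NEXT_EVAL_TARBALL: tarballPath },
    },
  }
}

module.exports = { buildChildInvocation }
```

The result would then be handed to something like `child_process.spawnSync(command, args, options)`. Keeping the invocation as plain data makes the env and flag forwarding easy to check without spawning a process.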