Skip to content

feat: dedupe packages that have peers when dedupe-peer-dependents is true#6153

Merged
zkochan merged 2 commits into
mainfrom
peers-dedupe
Mar 3, 2023
Merged

feat: dedupe packages that have peers when dedupe-peer-dependents is true#6153
zkochan merged 2 commits into
mainfrom
peers-dedupe

Conversation

@zkochan

@zkochan zkochan commented Feb 28, 2023

Copy link
Copy Markdown
Member

A new setting is now supported: dedupe-peer-dependents.

When this setting is set to true, packages with peer dependencies will be deduplicated after peers resolution.

For instance, let's say we have a workspace with two projects and both of them have webpack in their dependencies. webpack has esbuild in its optional peer dependencies, and one of the projects has esbuild in its dependencies. In this case, pnpm will link two instances of webpack to the node_modules/.pnpm directory: one with esbuild and another one without it:

node_modules
  .pnpm
    webpack@1.0.0_esbuild@1.0.0
    webpack@1.0.0
project1
  node_modules
    webpack -> ../../node_modules/.pnpm/webpack@1.0.0/node_modules/webpack
project2
  node_modules
    webpack -> ../../node_modules/.pnpm/webpack@1.0.0_esbuild@1.0.0/node_modules/webpack
    esbuild

This makes sense because webpack is used in two projects, and one of the projects doesn't have esbuild, so the two projects cannot share the same instance of webpack. However, this is not what most developers expect, especially since in a hoisted node_modules, there would only be one instance of webpack. Therefore, you may now use the dedupe-peer-dependents setting to deduplicate webpack when it has no conflicting peer dependencies. In this case, if we set dedupe-peer-dependents to true, both projects will use the same webpack instance, which is the one that has esbuild resolved:

node_modules
  .pnpm
    webpack@1.0.0_esbuild@1.0.0
project1
  node_modules
    webpack -> ../../node_modules/.pnpm/webpack@1.0.0_esbuild@1.0.0/node_modules/webpack
project2
  node_modules
    webpack -> ../../node_modules/.pnpm/webpack@1.0.0_esbuild@1.0.0/node_modules/webpack
    esbuild

@zkochan zkochan changed the title fix: automatically install peer dependencies if they are already pres… fix: automatically install peer dependencies if they are already present inside node_modules Feb 28, 2023
@zkochan zkochan changed the title fix: automatically install peer dependencies if they are already present inside node_modules fix: deduplicated packages that have peer dependencies Mar 2, 2023
@zkochan zkochan changed the title fix: deduplicated packages that have peer dependencies fix: deduplicate packages that have peer dependencies Mar 2, 2023
@zkochan zkochan changed the title fix: deduplicate packages that have peer dependencies feat: deduplicate packages that have peer dependencies Mar 2, 2023
@zkochan zkochan force-pushed the peers-dedupe branch 2 times, most recently from 5a7511d to 4addac1 Compare March 3, 2023 04:28
@zkochan zkochan marked this pull request as ready for review March 3, 2023 05:37
@zkochan zkochan requested a review from gluxon as a code owner March 3, 2023 05:37
@gluxon

gluxon commented Mar 3, 2023

Copy link
Copy Markdown
Member

Strongly agree this is the right direction to go. This is one of the most surprising gotchas today with pnpm.

I was about to propose doing this, but you beat me to it! Thanks. 😊

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

We can make it default in pnpm v8 but I had to force full resolution when this setting is on. So some performance optimisations will have to be done probably.

@zkochan zkochan requested a review from a team March 3, 2023 06:06
@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

It is ready for review.

@zkochan zkochan changed the title feat: deduplicate packages that have peer dependencies feat: deduplicate packages that have peer dependencies when dedupe-peer-dependents is true Mar 3, 2023
for (const { directNodeIdsByAlias, id } of opts.projects) {
dependenciesByProjectId[id] = mapValues((nodeId) => pathsByNodeId[nodeId], directNodeIdsByAlias)
}
if (opts.dedupePeerDependents) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this "patches" the resolution after it's occurred.

That works, but it may be worth discussing an alternative approach that behaves similarly to resolve-peers-from-workspace-root, but instead passes in the entire depGraph rather than only workspace root dependencies. Would that work as well?

I'm wondering if there's a way we an avoid recomputing the dep graph again, and have it deduped already when peersOfChildren is called.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean to move the dedupe logic to this file: https://github.com/pnpm/pnpm/blob/main/pkg-manager/resolve-dependencies/src/resolveDependencies.ts

I was thinking about it. It was my first idea. However, at that point we don't know yet that the packages are duplicated (we know maybe the ones that were duplicated during previous install). The packages with peer dependencies are duplicated only after peers resolution, which is the next stage. So we can only deduplicate them after peers resolution.

My first idea was to try to tweak the code that does auto installation of peer dependencies and prefer versions that are already present in the lockfile or are already used to resolve peers. But that was a lot harder to implement and less reliable I believe.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean to move the dedupe logic to this file: https://github.com/pnpm/pnpm/blob/main/pkg-manager/resolve-dependencies/src/resolveDependencies.ts

Roughly. To elaborate more, I was thinking of adding the full dep graph to the pkgsByName object above, in addition to rootPkgsByName.

I was thinking about it. It was my first idea. However, at that point we don't know yet that the packages are duplicated (we know maybe the ones that were duplicated during previous install). The packages with peer dependencies are duplicated only after peers resolution, which is the next stage. So we can only deduplicate them after peers resolution.

That's true. Adding more available peers to select from during peers resolution would change the result without knowing whether that change was necessary to dedupe. I find something nice about being able to intuit "If it's in the lockfile, it can be a peer.", but I could also see that being undesirable for complicated cases.

Don't have strong opinions here. Your final solution is good. 👍

Thanks for discussing!

@zkochan zkochan changed the title feat: deduplicate packages that have peer dependencies when dedupe-peer-dependents is true feat: dedupe packages that have peers when dedupe-peer-dependents is true Mar 3, 2023
@gluxon

gluxon commented Mar 3, 2023

Copy link
Copy Markdown
Member

We can make it default in pnpm v8

This option plus auto-install-peers will make pnpm v8 a huge user experience improvement.

but I had to force full resolution when this setting is on.

Ah, is this why the approach dedupes after the initial peers resolution rather than during peers resolution?

So some performance optimisations will have to be done probably.

This version should work and I definitely wouldn't let perfect get in the way!

@gluxon gluxon mentioned this pull request Mar 3, 2023
@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

but I had to force full resolution when this setting is on.

Ah, is this why the approach dedupes after the initial peers resolution rather than during peers resolution?

The issue is that currently we don't dedupe, so when we add a new dependency, we should only analyse a subgraph. And we resolve peer dependencies for the subgraph. We only pass a subgraph to the peers resolver. But now that we dedupe, we always need to pass the whole graph to peers resolver and we will also re-resolve all the peers every time (which might be slow). But I don't see other alternatives. I was thinking about different approaches for 2 or 3 days 😅

@zkochan zkochan merged commit 5c31fa8 into main Mar 3, 2023
@zkochan zkochan deleted the peers-dedupe branch March 3, 2023 06:31
@zkochan zkochan added this to the v7.29 milestone Mar 3, 2023
@gluxon

gluxon commented Mar 3, 2023

Copy link
Copy Markdown
Member

I ran into that same problem researching this independently. My draft approach didn't handle --filter'ed installs. 😬

For what it's worth, I think the performance penalty of looking at the entire graph is well worth it.

There is a possible performance improvement given an index mapping dependencies to other dependencies it can be a peer of. That allows pnpm to avoid reanalyzing the entire graph, but that could be complex to implement for small performance gains.

@zkochan

This comment was marked as outdated.

@zkochan

This comment was marked as outdated.

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

🚢 7.29.0-1

@simonihmig

Copy link
Copy Markdown

Is this supposed to fix #6152? (if so we can close the issue I guess)

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

@simonihmig yes

@patroza

patroza commented Mar 3, 2023

Copy link
Copy Markdown

what means conflicting peer dependencies?
Different version ranges, or having multiple peer dependencies of which different ones are installed across different repo packages?

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

Not different version ranges. The version ranges will be the same. But they are resolved by different versions of the peer.

Like

webpack@1.0.0_esbuild@1.0.0

and

webpack@1.0.0_esbuild@1.1.0_react@18.0.0

In this case we cannot deduplicate webpack because it uses different versions of esbuild.

@zanminkian

Copy link
Copy Markdown

Great work! It's hard to explain to others that optional peerDependencies will make framework library save into duplicate copies. After turning on this flag, developer don't need to concern about the peerDependencies. They only need to ensure that the library versions are the same.
Am I correct? @zkochan

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

Sounds correct.

@zkochan

zkochan commented Mar 3, 2023

Copy link
Copy Markdown
Member Author

As @gluxon mentioned, this change basically means that the --filter option will not work with the install command unless the lockfile is up-to-date. I guess it should be fine as most of the people use the --filter command to optimise CI build and in CI the lockfile is up-to-date. Also, people that use the --filter option probably want their projects to be isolated, so it makes sense for them to disable dedupe-peer-dependents.

I need to make another PR that will force resolution on all projects when adding a new dependency to a workspace project. Currently, dedupe will break if you run pnpm add <pkg> in one of the workspace projects.

zkochan added a commit that referenced this pull request Mar 3, 2023
@zanminkian

Copy link
Copy Markdown

It's a little bit complex to understand. Most of developers only want to make sure that, there is only one copy of each dependencies in one application. That means, one application process have only one react instance. Because having more than one react instance in one application will introduce some bug.

I wrote a tool to check.

I hope my tool can help developers like me.

@shellscape

Copy link
Copy Markdown
Contributor

Please make a note of this change in the migration guide for v8

@unional

unional commented Mar 4, 2023

Copy link
Copy Markdown

Have a question about the naming between dependents vs dependencies. Should it be dependencies or there are precedents of using dependents in pnpm?

@zkochan

zkochan commented Mar 4, 2023

Copy link
Copy Markdown
Member Author

it should be dependents because the packages that have peer dependencies are duplicated. So, dependents of peer dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants