Skip to content

Fix dev dependency exclusion bug#3889

Merged
eahefnawy merged 8 commits intomasterfrom
fix-dev-dependency-exclusion-bug
Jul 19, 2017
Merged

Fix dev dependency exclusion bug#3889
eahefnawy merged 8 commits intomasterfrom
fix-dev-dependency-exclusion-bug

Conversation

@pmuens
Copy link
Contributor

@pmuens pmuens commented Jul 3, 2017

What did you implement:

Closes #3827
Closes #3874
Closes #3869

Fixes a bug where the dev dependency exclusion errors out due to a large number of dependencies (and string capacity limitations).

Also fixes the error where exclusion is not working anymore.

Speed is increased due to async operations ran concurrently.

How did you implement it:

  1. Switch from sync to async calls and compute them in a Bluebird Promise.map
  2. Switched from execSync to spawn to get a stream (mitigates the buffer limit execSync introduces where the packages for exclusion are capped)
  3. Switch from npm ls --prod=true to npm ls --dev=true so that we exclude only the dev dependencies rather than excluding all node_modules and then re-including the prod dependencies. This should avoid side-effects when the exclusion of dev dependencies fails.

How can we verify it:

Take the steps described here: #3737 (comment)

Todos:

  • Write tests
  • Fix linting errors
  • Make sure code coverage hasn't dropped
  • Provide verification config / commands / resources
  • Update the messages below

Is this ready for review?: YES
Is it a breaking change?: NO

@pmuens pmuens added this to the 1.18 milestone Jul 3, 2017
@pmuens
Copy link
Contributor Author

pmuens commented Jul 4, 2017

Just a quick heads up. I investigated futher into this issue today and found out that we can use a solution which won't introduce an additional temp file.

It looks like using spawn could solve the buffer overflow problem since it returns a stream we can then react on.

I'll update this PR the upcoming days with a solution for this.

@pmuens
Copy link
Contributor Author

pmuens commented Jul 5, 2017

@ryanmurakami @cjelger @arabold @sbstjn @ajkerr I just pushed a fix for the issues with the current dev dependency exclusion.

The changes which were applied can be found in the PR description.

Would be nice if you could take this for a test drive and see if it resolves all the issues you brought up. This way we can get this PR in a mergeable state and publish it in a patch release the upcoming days.

Thanks in advance 👍

@pmuens pmuens force-pushed the fix-dev-dependency-exclusion-bug branch from 38aec17 to 195f283 Compare July 5, 2017 12:12
@ajkerr
Copy link

ajkerr commented Jul 5, 2017

@pmuens The PR fixes the issues we were seeing with our project. Thanks!

@pmuens
Copy link
Contributor Author

pmuens commented Jul 5, 2017

@pmuens The PR fixes the issues we were seeing with our project. Thanks!

Great! 🎉 Thanks for confirming @ajkerr 👍

// after that re-include the production relevant modules
exclude = _.union(exclude, [`${pathToDep}node_modules/**`]);
include = _.union(include, includePatterns);
childProc.stdout.on('error', () => resolve(result));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be reasonable to abort in case of error and leave the exclude and includes untouched?

Copy link
Contributor Author

@pmuens pmuens Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds like good idea @arabold 👍

let include = [];
const exAndIn = {
include: [],
exlucde: [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: exclude

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops! Nice catch @ryanmurakami 👍

@ryanmurakami
Copy link
Contributor

@pmuens Same result but worse. :-/ Since you changed to looking at dev dependencies, the output stops and one of the lines ends up excluding the entire node_modules folder. Smaller package size for sure. :-)

@ryanmurakami
Copy link
Contributor

I'm going to mess around with using a temp file, see if I can make any progress on that.

@HyperBrain
Copy link
Contributor

@pmuens Didn't you have a temp file version before? Maybe reverting to that would fix @ryanmurakami 's issue.

@ryanmurakami
Copy link
Contributor

@pmuens @HyperBrain So, it looks like using a tempfile fixes the issue.
I tried two methods. The first, hooking up a file write stream to the child process's stdout doesn't work because it still dies at 8kb.
The second is to just add > deps.txt to the end of the command. This works and sends everything to a file. Then I can just read from the file.

I can push that code, but there are a couple of concerns. The package command, with this approach, takes almost 2 minutes to complete, which is way too long. I switched the ls to look at production dependencies instead, and it cut the time in half. Oddly, the filesize was slightly larger.

Anyways, I pushed my changes to my repo in this commit: ryanmurakami@f137fbc

I can clean this up if you guys think it's a good approach.

@pmuens
Copy link
Contributor Author

pmuens commented Jul 5, 2017

🤔 thanks for taking another deep dive into this @ryanmurakami 👍

@HyperBrain yes, we had this file based approach (originally brought up by @arabold 🥇). It would be nice if we could replace the file creation and rely on streams (since IO could be unreliable and slow).

We switched from exec to spawn since spawn returns a stream you can tap into whereas exec has a buffer with a fixed size. Maybe I'm missing something here. (here's the post I'm referring to).

Isn't there any way to stream the data from stdout to avoid this buffer size limitation exec introduces?

I'll take a look into your commit @ryanmurakami tomorrow. Would be really nice if we could fix this issue w/o introducing an external file.

@ryanmurakami
Copy link
Contributor

@pmuens I did a little more work and refactored the code so it's a little easier to riff on different approaches: https://github.com/ryanmurakami/serverless/tree/fix-dev-dependency-exclusion-bug

You can modify the getDependenciesStream and getDependenciesFile, then comment/uncomment lines 206 & 207 to switch each approach.

It appears this is some type of buffer paging that macOS is doing on the stdout stream. I don't see any differences when using spawn or exec in terms of the output changing. Both are cutting off at around 8kb whether you're piping stdout or just reading it after the fact.

I haven't been able to think of another approach except for a temp file. The only other thing is if there was some way we could emulate a file but actually write it directly in memory. I'm not even sure that's possible though.

@ryanmurakami
Copy link
Contributor

Also, thanks so much for working hard on this @pmuens 🙏🙏🙏

@arabold
Copy link

arabold commented Jul 5, 2017

The reason both exec and spawn fail to grab the whole output lies in npm itself. npm calls a process.exit() before it has flushed its stdout. That exit signal is catched by exec and spawn which stops reading the buffer immediately (which happens to be after 8k). I don't have the respective issue in npm at hand right now. It was linked in my original commit IIRC.

Long story short: From what I understand only a temp file will fully solve this problem for all platforms.

@ajkerr
Copy link

ajkerr commented Jul 6, 2017

I believe that npm has purged most of their open GitHub bugs (for better or worse), and are only working on npm5+. Given that Node's current LTS is 6.x with npm 3.x, I don't think waiting for an npm fix is a great option.

@pmuens
Copy link
Contributor Author

pmuens commented Jul 6, 2017

Thanks for the in-depth discussion about this 👍

I found @arabold research comment and will link it here: #3737 (comment) (thanks for looking into this @arabold 👍).

Seems like there's no way around a file-based approach.

Next up I'll take a look @ryanmurakami implementation (thanks for the deep dive!) and update this PR so that we can hopefully fix this soon (and release a patch).

Again. Thanks for all the work and discussions on this here 💯!

Edit: This issue might be especially interesting: nodejs/node#6456

@pmuens pmuens force-pushed the fix-dev-dependency-exclusion-bug branch from 195f283 to 9a3d9eb Compare July 6, 2017 12:05
@pmuens
Copy link
Contributor Author

pmuens commented Jul 6, 2017

I just looked through @ryanmurakami branch (thanks for putting this together!) and pushed an update and switched back to the tmp file approach. I kept all the operations async.

Works on my machine. Would be nice if you could try it and see if it fixes the problem.

Another question: Is the exec command (>> foo.file) OS independent and works on all Windows machines? 🤔

/cc @ryanmurakami @arabold

})
.then(() => fs.readFileAsync(nodeDevDepFile))
.then((fileContent) => {
const dependencies = fileContent.toString('utf8').split('\n');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be deduplicated (as it might be a combination of multiple package.json files and could contain duplicates)?

const dependencies = _.uniq(_.split(fileContent.toString('utf8'), '\n'));

I also suggest to use lodash, as otherwise you have to test here that fileContent is not nil, before using toString().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @HyperBrain 👍 I'll update it so that it uses Lodash.

// after that re-include the production relevant modules
exclude = _.union(exclude, [`${pathToDep}node_modules/**`]);
include = _.union(include, includePatterns);
if (dependencies.length) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!_.isEmpty(dependencies))
...


/* eslint-disable no-use-before-define */
/* eslint-disable no-param-reassign */
/* eslint-disable consistent-return */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which lines are affected by this? Can the implementation be changed to comply to the rule, or is there a specific reason, why this cannot be done?

Copy link
Contributor

@HyperBrain HyperBrain Jul 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addition: Imo it's generally bad to disable linter checks per file. It would be better to put an elslint-disable-line consistent-return at the specific location and comment why this rule override is needed there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @HyperBrain 👍 Yes, that's a good suggestion. I'll add the comments to the lines.

const childProc = childProcess.spawn('npm',
['ls', '--dev=true', '--parseable=true', '--silent'],
return childProcess.execAsync(
`npm ls --dev=true --parseable=true --silent >> ${nodeDevDepFile}`,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you append to ${nodeDevDepFile} then you should make sure to delete it before the BbPromise.mapSeries() above . Otherwise it might have some surprises in store for you. Why not put it in the artifacts folder (i.e. .serverless or whatever is provided in the --package command line) though? I think that folder is cleaned and recreated on every packaging?

Copy link
Contributor Author

@pmuens pmuens Jul 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I thought mapSeries runs sequential so there shouldn't be any race conditions while appending to the file.

Right now the name of the nodeDevDepFile is generated with a hash in the name and stored in the OS tempDir. Putting it in .serverless could also be an option though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is if there's a leftover from a previous run. Thus the file already exists before serverless is started.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The random number you're using does certainly help though. Didn't notice this before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright gotcha. Yes, that's a good catch 👍 .

I re-thought about keeping it in .serverless however IMHO keeping it in a tmpdir which is cleaned up by the OS is slightly better since the .serverless dir could change (although we have utils to get the path) and is used to store deployment artifacts.

@pmuens pmuens force-pushed the fix-dev-dependency-exclusion-bug branch from bb28001 to ed029a3 Compare July 17, 2017 12:33
@pmuens
Copy link
Contributor Author

pmuens commented Jul 17, 2017

Here's a test service which can be used to verify that the package exclusion works correctly.

It consists of 3 different directory levels where each level has a respective package.json file. Just run npm install in the different directories to install the dependencies and dev dependencies.

You can also e.g. only run npm install in the root level to force an exit code 1 (since npm ls in a directory with no installed modules will fail).

/cc @eahefnawy @ryanmurakami @HyperBrain @arabold

dev-dependency-exclusion.zip

@ryanmurakami
Copy link
Contributor

@pmuens Thanks again for your continued work on this!
Testing it on my problem directory and this seems to have solved the issue.

I'm encountering something different, but I think it has to do with how I'm testing and not the actual code. I'm testing the PR by doing an npm link serverless and it appears that the zip process is zipping up the entire serverless directory with the package. Any idea why this is happening or a better way to test the new code? Regardless, I think you've got it with the most recent changes and it should be good to go.

@ryanmurakami
Copy link
Contributor

Okay, I used npx and was able to validate this fix working good.

On a side note, it takes almost 3x longer to package my large project with this fix. I noticed the time is mostly spent after the dependencies are all resolved, most likely when archiver is zipping directories. Not a huge deal considering I think my project is an outlier, but just something I wanted to mention.

@pmuens
Copy link
Contributor Author

pmuens commented Jul 18, 2017

Okay, I used npx and was able to validate this fix working good.

Nice! Thanks for testing it out @ryanmurakami 👍 Great to hear that it finally works on your machine!

On a side note, it takes almost 3x longer to package my large project with this fix. I noticed the time is mostly spent after the dependencies are all resolved, most likely when archiver is zipping directories. Not a huge deal considering I think my project is an outlier, but just something I wanted to mention.

🤔 That's definitely smth. we should look into next up. Thanks for bringing this to our attention! 👍


I pushed some minor changes and fixed / add a couple of tests. This PR is GTM from my side.

/cc @eahefnawy

})
.then(() => fs.readFileAsync(nodeDevDepFile))
.then((fileContent) => {
const dependencies = _.filter(
Copy link
Contributor

@ryanmurakami ryanmurakami Jul 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use _.compact here instead of _.filter?
https://lodash.com/docs/4.17.4#compact

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That's a good idea @ryanmurakami 👍

@eahefnawy
Copy link
Contributor

@pmuens following your instructions, I was able to validate that exclusion works as expected! 👍

G2M! 😊

@eahefnawy eahefnawy merged commit 47df1e4 into master Jul 19, 2017
@pmuens pmuens deleted the fix-dev-dependency-exclusion-bug branch July 19, 2017 12:26
@sasidharkanumuri
Copy link

I deployed https://github.com/react-boilerplate/react-boilerplate.git over lambda in AWS by cloudformation and I wanted to test this app now by lambda function , any ideas on how to test its localhost

@pmuens
Copy link
Contributor Author

pmuens commented Jul 28, 2017

@sasidharkanumuri thanks for commenting 👍

I think the best place to get help for such problems would be the Serverless Gitter of our Forum.

Let us know if you need any help with that 👍

@sasidharkanumuri
Copy link

Thanks @pmuens I just posted my question in the Serverless Gitter forum

@pmuens
Copy link
Contributor Author

pmuens commented Jul 28, 2017

Thanks @pmuens I just posted my question in the Serverless Gitter forum

Great! Thanks for that 👍 Hopefully your problem can be resolved soon!

@andythecoderman
Copy link

andythecoderman commented Nov 2, 2017

I'm still getting the dev dependencies, or at least most of them using serverless v1.23.0 and serverless-plugin-typescript v1.1.3

Do I need a later version to get this branch?

Or it seems packages are being linked / reused or peer dependency more now than nesting under the same package in npm 5 and yarn, could this be part of the issue?

Sorry i'm kind of a noob.

@tommedema
Copy link

tommedema commented Aug 21, 2018

I'm facing a few issues related to this. I've created a sample repo here:

https://github.com/tommedema/serverless-mono-example

Steps to reproduce

  • yarn install
  • cd packages/serverless-random
  • yarn package

Note that the package now has a few dev dependencies which shouldn't be there:

  • ts-node
  • make-error
  • arrify
  • yn
  • .bin/ts-node

I think this may be because serverless is hardcoding calls to npm, even though I don't use npm (I use yarn for its workspaces feature).

@andythecoderman
Copy link

On a new project using the latest NPM vs. yarn and node 8 vs. typescript I didn't see this issue like I was with typescript/yarn/node 6. Not sure if any of those factors made the difference.

What I ended up doing was using gulp to copy my package.json and lock file over to the output directory then running yarn with the --prod flag so it wouldn't install any of the dev dependencies

@tommedema
Copy link

tommedema commented Aug 21, 2018

@andythecoderman I upgraded yarn, npm etc. but still face the same issue.

Re your workaround: when I copy package.json to an output directory, yarn will fail to install packages from the workspace: error An unexpected error occurred: "https://registry.yarnpkg.com/@org%2frandom: Not found"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix dev dependency exclusion bug when too many node modules are used Change in package/exclude usage? Excluding dev dependencies throws Error

9 participants