RFC: 0006 monocdk#122

Merged
eladb merged 25 commits into master from benisrae/monolithic-packaging
Apr 10, 2020

@eladb
Contributor

@eladb eladb commented Feb 13, 2020

Proposal to distribute the AWS CDK as a single module instead of 150+
modules in order to allow third-party CDK modules to declare their dependency on
the AWS CDK as a peer dependency (rendered version).
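
To make the peer-dependency idea concrete, here is a minimal sketch in TypeScript. It models only the relevant `package.json` fields and checks which peer dependencies of a library the consuming app has not declared itself. The library name `my-construct-lib`, the app name, and all version ranges are hypothetical placeholders; `aws-cdk-lib` and `constructs` are the package names used later in this thread.

```typescript
// Minimal shape of the package.json fields that matter for this illustration.
interface Manifest {
  name: string;
  dependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
}

// Hypothetical third-party construct library: it does not bundle the CDK,
// it asks the consuming app to provide a single shared copy.
const library: Manifest = {
  name: "my-construct-lib", // hypothetical name
  peerDependencies: {
    "aws-cdk-lib": "^1.0.0", // placeholder version range
    "constructs": "^1.0.0", // placeholder version range
  },
};

// Hypothetical app that consumes the library.
const app: Manifest = {
  name: "my-cdk-app", // hypothetical name
  dependencies: {
    "aws-cdk-lib": "1.2.3", // placeholder version
    "my-construct-lib": "0.1.0", // placeholder version
  },
};

// Returns the peer dependencies of `lib` that `consumer` does not declare itself.
function missingPeers(lib: Manifest, consumer: Manifest): string[] {
  const provided = { ...consumer.dependencies, ...consumer.peerDependencies };
  return Object.keys(lib.peerDependencies ?? {}).filter(
    (name) => !(name in provided)
  );
}

console.log(missingPeers(library, app)); // the app above has not declared "constructs"
```

With a single CDK package, a library needs only one or two peer entries like these, rather than one entry per CDK module it touches.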

Related to #6


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

Initial draft with many missing sections.
@eladb eladb mentioned this pull request Feb 13, 2020
7 tasks
@aws aws deleted a comment from rix0rrr Feb 13, 2020
@aws aws deleted a comment from rix0rrr Feb 13, 2020
@eladb eladb mentioned this pull request Feb 23, 2020
4 tasks
This was referenced Feb 27, 2020
@CaerusKaru

Another point to consider is that the AWS SDK for Java recommends modular imports. They explicitly recommend against a singular dependency import. Source

@eladb
Contributor Author

eladb commented Feb 27, 2020

> Another point to consider is that the AWS SDK for Java recommends modular imports. They explicitly recommend against a singular dependency import. Source

There is a note about this in the RFC. The SDKs are designed as runtime library dependencies and the AWS CDK is a framework designed to run at build time.

The main reason the SDKs are bundled separately is to reduce the runtime footprint, and that’s a legitimate concern when it comes to a library you link against. CDK users don’t perceive the CDK as a library; they perceive it as a framework. CDK apps and libraries cannot exist without the CDK, which is not true for application code which uses the SDKs to interact with AWS resources, but could, for example, move to a different cloud and still have the same functionality.

@eladb
Contributor Author

eladb commented Feb 27, 2020

Added some additional notes in my previous comments.

Elad Ben-Israel and others added 8 commits February 27, 2020 23:01
Co-Authored-By: CaerusKaru <caerus.karu@gmail.com>
@CaerusKaru

CaerusKaru commented Feb 27, 2020

> CDK apps and libraries cannot exist without the CDK, which is not true for application code which uses the SDKs to interact with AWS resources, but could, for example, move to a different cloud and still have the same functionality.

Strictly speaking I don't agree with this. While it's true that right now the vast majority of CDK projects only use the CDK, it's entirely possible to mix infrastructure in your project with other things, or to have multiple types of infrastructure altogether. For instance, you could have a static website project with CDK and just swap it out for CloudFormation, Serverless, or any other infrastructure provider (as they do today).

As for the libraries, you can't have an SDK library without the SDK library. That's just a truism.

@eladb eladb changed the title RFC: 0006 monolithic packaging RFC: 0006 monocdk Apr 2, 2020
@john-tipper

We use a subset of the CDK to create infra on demand for customers as part of a SaaS. The CDK is executed within a Lambda (coded in Java).

Am I right in thinking that this change would now require us to bundle all of the CDK into our Lambda?

If so, that’s going to come up against deployment size limits and make our Lambda bigger and slower than it needs to be. I’d really rather this proposal didn’t happen, please.

@misterjoshua

A small DX perspective from me: I’d like to advocate for incorporating aws and cdk into the constructs package name somehow.

Having aws and cdk in a package name identifies a package as being related to both AWS and CDK. constructs has neither string in its name, so when looking at the name of the constructs package without foreknowledge of the relationship, it's not obvious that constructs is related at all to aws-cdk-lib, much less that constructs is the core programming model backing aws-cdk-lib.

For comparison, Eslint requires an eslint-plugin prefix for plugin names. I think it’s clear that a plugin name like eslint-plugin-prettier is an eslint plugin. Babel has incorporated a similar naming scheme. At least from my experience, this consistency helps me when I’m pruning my package.json to see what dependencies are related.

My suggestion would be to pick a package name like aws-cdk-core instead of constructs for the core model.

@eladb
Contributor Author

eladb commented Apr 4, 2020

> A small DX perspective from me: I’d like to advocate for incorporating aws and cdk into the constructs package name somehow.
>
> Having aws and cdk in a package name identifies a package as being related to both AWS and CDK. constructs has neither string in its name, so when looking at the name of the constructs package without foreknowledge of the relationship, it's not obvious that constructs is related at all to aws-cdk-lib, much less that constructs is the core programming model backing aws-cdk-lib.
>
> For comparison, Eslint requires an eslint-plugin prefix for plugin names. I think it’s clear that a plugin name like eslint-plugin-prettier is an eslint plugin. Babel has incorporated a similar naming scheme. At least from my experience, this consistency helps me when I’m pruning my package.json to see what dependencies are related.
>
> My suggestion would be to pick a package name like aws-cdk-core instead of constructs for the core model.

We intentionally chose a name that does not include AWS or CDK in order to enable new CDK use cases like cdk8s.

@eladb
Contributor Author

eladb commented Apr 4, 2020

> We use a subset of the CDK to create infra on demand for customers as part of a SaaS. The CDK is executed within a Lambda (coded in Java).
>
> Am I right in thinking that this change would now require us to bundle all of the CDK into our Lambda?
>
> If so, that’s going to come up against deployment size limits and make our Lambda bigger and slower than it needs to be. I’d really rather this proposal didn’t happen, please.

As discussed over Twitter, we expect the monolithic CDK library not to exceed 20MiB, and the Lambda deployment size limit is 250MiB. Do you still think this will be a major barrier?

@mrgrain
Contributor

mrgrain commented Apr 8, 2020

Not sure what the implications would be, but could this change be in addition to the existing setup and modularised packages still be published?

@eladb
Contributor Author

eladb commented Apr 8, 2020

> Not sure what the implications would be, but could this change be in addition to the existing setup and modularised packages still be published?

Can you describe your use case for multiple modules?

As described in the RFC, we don’t see much value in continuing to release the individual modules because this will cause the 3rd party library story to break. 3rd party libraries will have to take a dependency on the monocdk, which means that any user who depends on any third-party library will also have to depend on the monocdk.

There are some really powerful construct libraries in the works (both from within amazon and outside) and we want to make sure that using them is a first-class experience.

@mrgrain
Contributor

mrgrain commented Apr 8, 2020

Nah, you are right, I missed the point about 3rd party libraries. Was really just wondering why we can't have both. I guess it's interesting to see that some tools, e.g. the aws-sdk, are moving towards multiple modules and CDK is moving back to a monolith.

But I thought a bit more about use cases like including it in a Lambda, and that's nothing that can't be resolved with a good bundler (one that strips out unused code).

@eladb
Contributor Author

eladb commented Apr 8, 2020

> e.g. the aws-sdk are moving towards multiple modules and CDK is moving back to a monolith.

This is also somewhat discussed in the RFC. Basically the major difference is that the SDKs are used as libraries and can be hosted in all kinds of apps with a huge range of constraints (e.g. performance, memory, load time, disk, etc). The CDK is used as a "framework" and mostly consumed by "CDK Apps" which are normally executed from build environments, and therefore with far fewer constraints.

Elad Ben-Israel and others added 3 commits April 11, 2020 00:10
@eladb eladb requested a review from MrArnoldPalmer April 10, 2020 21:13
@eladb eladb merged commit 3b70bc0 into master Apr 10, 2020
@eladb eladb deleted the benisrae/monolithic-packaging branch April 10, 2020 21:16
@mikegwhit

mikegwhit commented Apr 14, 2020

My understanding of this RFC (reading the referenced #6) is that two differing package versions will cause conflicts. I elaborated in an issue ticket that another use case is when compiling TypeScript against a globally installed module. I lack TypeScript knack, but somehow I suspect the latter use case is solvable with a type definition file.

Focusing on just the version differences, it seems like the root cause is that NPM will install multiple versions of packages. I'd first want to know why peerDependencies isn't an option. A remark is that this would force developers into a practice that is non-standard. It would not be unimaginable to provide a smoke test and friendly console message if there are mismatched versions in a node_modules directory. Such a message could guide the user to installing AWS CDK as a set of peerDependencies. An even more invasive version of this would automagically write to a user's package.json file but this could understandably breach trust for some or worse get flagged for security concerns.
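
As a rough illustration of the smoke test suggested above, the following TypeScript sketch groups installed `@aws-cdk/*` packages by version and produces a friendly message when more than one version is present. It is only a sketch under stated assumptions: the function names, the `InstalledPackage` shape, and the sample versions and paths are hypothetical, and a real check would first have to collect this list by walking `node_modules`, which is not shown here.

```typescript
// Hypothetical record for one package found under node_modules.
interface InstalledPackage {
  name: string; // e.g. "@aws-cdk/core"
  version: string; // exact version from that package's package.json
  path: string; // where the package was found
}

// Groups installed @aws-cdk/* packages by version; other packages are ignored.
function groupCdkVersions(
  installed: InstalledPackage[]
): Map<string, InstalledPackage[]> {
  const byVersion = new Map<string, InstalledPackage[]>();
  for (const pkg of installed) {
    if (!pkg.name.startsWith("@aws-cdk/")) continue;
    const group = byVersion.get(pkg.version) ?? [];
    group.push(pkg);
    byVersion.set(pkg.version, group);
  }
  return byVersion;
}

// Returns a warning message if more than one CDK version is installed,
// or undefined when all @aws-cdk/* packages agree.
function mismatchMessage(installed: InstalledPackage[]): string | undefined {
  const byVersion = groupCdkVersions(installed);
  if (byVersion.size <= 1) return undefined;
  const lines = Array.from(byVersion.entries()).map(
    ([version, pkgs]) => `  ${version}: ${pkgs.map((p) => p.path).join(", ")}`
  );
  return [
    "Found multiple AWS CDK versions in node_modules:",
    ...lines,
    "Consider declaring the AWS CDK modules as peerDependencies.",
  ].join("\n");
}

// Sample data with placeholder versions and paths.
const sample: InstalledPackage[] = [
  { name: "@aws-cdk/core", version: "1.30.0", path: "node_modules/@aws-cdk/core" },
  {
    name: "@aws-cdk/core",
    version: "1.31.0",
    path: "node_modules/some-lib/node_modules/@aws-cdk/core",
  },
  { name: "left-pad", version: "1.3.0", path: "node_modules/left-pad" },
];

console.log(mismatchMessage(sample));
```

Keeping the check as a pure function over a list makes it easy to test; how and when to run it (for example at CLI start-up) is a separate design decision.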

It seems like a much heavier version of this resolution is to implement package management within AWS CDK. This could look like cdk i [MODULE]. This would circumvent needing to package CDK as a monolithic module. I am experimenting with ideas along these lines in my own software (more for licensing concerns). This could manage and contain all CDK modules within the single version of CDK. This method would successfully abstract CDK submodule versions away from the problems introduced by mismatched versions within NPM. The downside is that you'd need to track any new breaking changes in NPM (the probability seems low that they'll break package management). You have to write the code to install subpackages yourself (really, it's just you installing a package programmatically, then moving it to a different folder). It's a heavier effort, but you end up with one single CDK version and you can more precisely control the interaction of dependencies.

It seems like the third option proposed here is to simply install everything (which amounts to 20 MB). I would raise concerns about staging a CDK installation within a CI/CD environment. I think someone mentioned this in the context of Lambda, but it seems more relevant in the context of CI/CD, since a usual setup will install CDK dependencies fresh on each build (even if from a cache in the ideal circumstance?).

Am guessing these thoughts have all been kicked around but that's my view of this problem space. I've looked at this problem space quite a bit for my own work and landed on option #2 (the heavy handed solution) because I have licensing concerns for my code, i.e. I want to do a key exchange before installation.


10 participants