core: Use file stat data to fingerprint assets #21297
Description
Describe the feature
When fingerprinting large files, use the file stat data (inode, mtime, size) to fingerprint, rather than file content, to reduce the amount of time spent fingerprinting.
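A minimal sketch of the idea, not the actual aws-cdk-lib implementation: `fingerprintFile`, `sha256`, and the `LARGE_FILE_BYTES` cutoff are hypothetical names and values chosen for illustration, since the request does not say what size counts as "large". Small files are still hashed by content; large files are hashed from their stat fields only.

```typescript
import * as crypto from 'crypto';
import * as fs from 'fs';

// Hypothetical cutoff; the request does not define what counts as "large".
const LARGE_FILE_BYTES = 16 * 1024 * 1024;

function sha256(data: string | Buffer): string {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// Hypothetical helper for illustration only.
function fingerprintFile(filePath: string, threshold: number = LARGE_FILE_BYTES): string {
  const stat = fs.statSync(filePath);
  if (stat.size >= threshold) {
    // Large file: hash only the stat fields named above (inode, mtime, size),
    // so the file content is never read.
    return sha256(`stat:${stat.ino}:${stat.mtimeMs}:${stat.size}`);
  }
  // Small file: hash the full content.
  return sha256(fs.readFileSync(filePath));
}
```

With this shape, the cost for a large file is a single `stat` call, independent of file size.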
Use Case
We've been having issues with slow builds of our CDK projects for some time now. When profiling to identify the cause of these performance issues, I noticed that a significant portion of our execution time was going to asset fingerprinting, and specifically to the digest operation that occurs during fingerprinting. In particular, we are fingerprinting the same relatively large (>300MB) source files multiple times in both our tests and production synthesis.
Proposed Solution
I have a PR that I plan to publish for this feature request soon.
Note that this feature may result in additional false negatives for asset caching (unchanged content that receives a different fingerprint) for customers who fingerprint large assets whose identical content lives in different files, where the mtime changes between fingerprints, or where very large assets differ only in line endings (LF/CRLF). It's unlikely to result in false positives unless the mtime is deliberately manipulated.
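To make the false-negative case concrete, here is a small sketch under the assumption that a stat-based fingerprint hashes inode, mtime, and size. All three function names are hypothetical and exist only for this illustration. A byte-for-byte copy of a file, or the same file after its mtime is touched, keeps its content hash but gets a new stat fingerprint.

```typescript
import * as crypto from 'crypto';
import * as fs from 'fs';

// Hypothetical helper: fingerprint from the file's bytes.
function contentFingerprint(filePath: string): string {
  return crypto.createHash('sha256').update(fs.readFileSync(filePath)).digest('hex');
}

// Hypothetical helper: fingerprint from inode, mtime, and size only.
function statFingerprint(filePath: string): string {
  const { ino, mtimeMs, size } = fs.statSync(filePath);
  return crypto.createHash('sha256').update(`${ino}:${mtimeMs}:${size}`).digest('hex');
}

// True when two paths hold identical content but the stat-based
// fingerprints disagree, i.e. a cache miss that content hashing would avoid.
function isFalseNegative(a: string, b: string): boolean {
  return contentFingerprint(a) === contentFingerprint(b)
    && statFingerprint(a) !== statFingerprint(b);
}
```

A copy made with `fs.copyFileSync` normally lands on a different inode, so `isFalseNegative(original, copy)` would be expected to return `true` on a typical Linux filesystem.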
Other Information
No response
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change
CDK version used
git-f66f94e9201b9c9d5e0f1b713a6f30194b323b28
Environment details (OS name and version, etc.)
Linux (AL2)