feat(core): cache fingerprints of large assets #21321
Conversation
Force-pushed from c6142f8 to 3664ad6
Changed to chore as the PR linter doesn't like a feature that doesn't touch the README. Let me know if
Force-pushed from de789cf to d858559
```ts
const hash2 = FileSystem.fingerprint(largefile, {});

expect(hash1).toEqual(hash2);
expect(openSyncSpy).toHaveBeenCalledTimes(1);
```
You should restore the mock, or clear all mocks in a beforeEach(). Because it is not cleared, you end up with 3 calls in your "considers mtime" test. If this test is moved up in the file for some reason it will fail (tests should not rely on execution order).
Also, is there a reason to create a large file for this test? It is the caching mechanism that is being tested, so it should work on files of any size?
Suggested change:

```ts
expect(openSyncSpy).toHaveBeenCalledTimes(1);
openSyncSpy.mockRestore();
```

Same for the second test.
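For readers unfamiliar with why the restore step matters, here is a dependency-free TypeScript sketch of what a spy plus mockRestore() amounts to. jest.spyOn / jest.restoreAllMocks() are the real tools; the `spyOn` helper and `fakeFs` below are purely illustrative:

```ts
// Illustrative stand-in for jest.spyOn: wraps a method, counts calls,
// and can restore the original implementation.
function spyOn(obj: any, key: string) {
  const original = obj[key];
  let calls = 0;
  obj[key] = (...args: any[]) => { calls += 1; return original(...args); };
  return {
    callCount: () => calls,
    // Without this step, the wrapper (and its call count) leaks into
    // every test that runs afterwards.
    restore: () => { obj[key] = original; },
  };
}

// Hypothetical subject standing in for fs.openSync
const fakeFs = { openSync: (path: string) => `fd:${path}` };

const spy = spyOn(fakeFs, 'openSync');
fakeFs.openSync('a.txt');
fakeFs.openSync('b.txt'); // spy.callCount() is now 2
spy.restore();            // later calls hit the original and are not counted
```

In Jest itself the equivalent is calling openSyncSpy.mockRestore() at the end of the test, or jest.restoreAllMocks() in an afterEach(), so no test depends on how many times earlier tests touched the spied function.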
Good point: we don't need to use a large file now that this is not sensitive to timing or file-size thresholds. I'll update that, as well as add a mock restore call.
```ts
export function contentFingerprint(file: string): string {
  const stats = fs.statSync(file);
  const cacheKey = JSON.stringify({ mtime: stats.mtime, inode: stats.ino, size: stats.size });
```
Put a comment here explaining why this is safe on Windows
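To make the mechanism under review concrete, here is a hedged, self-contained sketch of how such a memoized fingerprint can work. The cache shape, the sha256 choice, and the use of mtimeMs are illustrative assumptions, not the exact aws-cdk implementation:

```ts
import * as crypto from 'crypto';
import * as fs from 'fs';

// Per-path cache of the last computed digest, keyed by (mtime, inode, size).
const fingerprintCache = new Map<string, { cacheKey: string; hash: string }>();

export function contentFingerprint(file: string): string {
  const stats = fs.statSync(file);
  // If none of mtime, inode, or size changed, the content is assumed unchanged.
  const cacheKey = JSON.stringify({ mtime: stats.mtimeMs, inode: stats.ino, size: stats.size });

  const cached = fingerprintCache.get(file);
  if (cached && cached.cacheKey === cacheKey) {
    return cached.hash; // cache hit: skip re-reading and re-hashing the file
  }

  const hash = crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');
  fingerprintCache.set(file, { cacheKey, hash });
  return hash;
}
```

On a cache hit this avoids reading the file at all, which is where the savings on multi-hundred-megabyte assets come from.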
Force-pushed from d858559 to 4b8ecd9
When fingerprinting large assets, hashing the asset can take quite a long time: over a second for a 300 MB asset, for example. This can add up, particularly when generating multiple stacks in a single build, or when running test suites that bundle assets multiple times, and is not avoidable by asset caching (since it's computing the cache key).

This change caches the result of digesting individual files based on the inode, mtime, and size of the input file.

This feature improved the runtime of one of our slowest tests by ~10%.

closes: aws#21297
Force-pushed from 4b8ecd9 to fc2be6c
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
```ts
mtime_unix: stats.mtime.getUTCDate(),
mtime_ms: stats.mtime.getUTCMilliseconds(),
```
This is not doing what is intended:

- `getUTCDate()` returns the UTC day-of-the-month of the Date object
- `getUTCMilliseconds()` returns the UTC milliseconds component of the Date object (i.e. fractional seconds)
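The difference is easy to see on a concrete Date. The full epoch timestamp from getTime() (or fs.Stats.mtimeMs) is the unambiguous value to put in a cache key; the date below is an arbitrary example:

```ts
// 2022-07-25T12:00:00.123Z, constructed in UTC (month is 0-based, so 6 = July)
const mtime = new Date(Date.UTC(2022, 6, 25, 12, 0, 0, 123));

const dayOfMonth = mtime.getUTCDate();         // 25: only the day-of-month
const fractional = mtime.getUTCMilliseconds(); // 123: only the sub-second part
const epochMs = mtime.getTime();               // full ms-since-epoch timestamp
```

Two mtimes a month apart can share the same day-of-month and millisecond component, so only the full timestamp distinguishes them reliably.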
Thanks for the catch (and fix!)
Note: No README entries were added, because this sub-subsystem was already not documented in the README.