Skip to content

Conversation

@rutkowskib
Copy link
Contributor

@rutkowskib rutkowskib commented Jul 25, 2025

Cache package.json location to improve performance of migrations with a lot of files

Closes: #4136

Description of change

Currently for every file imported from the same directory code tries to find package.json. This makes migrations with a lot of files very slow because for every a lot of operations need to be performed despite all of the files potentially being in the same directory.
This update makes sure that the time to start migrations stays the same no matter how many migrations we have. I have added a cache that store paths that we know where package.json exists.
I have added a test that verifies that verifies that cache is correctly used to verify working of this change.

Pull-Request Checklist

  • Code is up-to-date with the master branch
  • This pull request links relevant issues as Fixes #4136
  • There are new or updated unit tests validating the change
  • Documentation has been updated to reflect this change - N/A

Summary by CodeRabbit

  • Performance Improvements

    • Added a bounded cache for nearest package.json lookups to reduce redundant filesystem reads and JSON parsing, improving repeated-lookup performance while keeping memory usage controlled.
  • Tests

    • Added unit tests verifying the cache prevents unnecessary filesystem operations across repeated imports and confirms correct cache behavior during multiple lookups.

Cache package.json location to improve performance of migrations with a lot of files

Closes: typeorm#4136
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 25, 2025

Walkthrough

Adds a bounded in-memory cache to getNearestPackageJson to reuse parsed package.json results across directory traversals and a unit test verifying subsequent imports avoid redundant fs.promises.stat/readFile calls.

Changes

Cohort / File(s) Change Summary
Import utils
src/util/ImportUtils.ts
Added packageJsonCache Map and MAX_CACHE_SIZE (1000); added setPackageJsonCache helper; integrated cache checks and caching of parsed package.json or null for traversed directories; simple eviction of oldest entry when capacity exceeded.
Unit tests
test/unit/util/import-utils.ts
Added test "Should use cache to find package.json" which spies on fs.promises.stat and fs.promises.readFile using sinon, creates temporary files to populate cache, asserts subsequent imports do not increase stat/readFile call counts, and cleans up files and spies.

Sequence Diagram(s)

sequenceDiagram
    participant ImportOrRequire as importOrRequireFile
    participant ImportUtils as getNearestPackageJson
    participant Cache as packageJsonCache
    participant FS as fs.promises

    ImportOrRequire->>ImportUtils: request nearest package.json for dir
    ImportUtils->>Cache: check cache for dir
    alt cache hit
        Cache-->>ImportUtils: cached package.json or null
        ImportUtils->>Cache: cache result for traversed dirs
        ImportUtils-->>ImportOrRequire: return result
    else cache miss
        ImportUtils->>FS: stat(package.json) / readFile(...)
        alt package.json found & parsed
            FS-->>ImportUtils: file contents
            ImportUtils->>Cache: cache parsed result for traversed dirs
            ImportUtils-->>ImportOrRequire: return parsed package.json
        else not found or parse error
            ImportUtils->>Cache: cache null for traversed dirs
            ImportUtils-->>ImportOrRequire: return null
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Assessment against linked issues

Objective Addressed Explanation
Reduce migration runtime overhead caused by repeated filesystem/parse work when scanning many files (issue #4136)

"I'm a rabbit in the code, I cache with delight,
Hopping past package.json to save every byte.
Fewer stats, fewer reads, a quicker sprint too,
I tuck results away — so your migrations fly through. 🐇"

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 76b1363 and ba2b03e.

📒 Files selected for processing (1)
  • test/unit/util/import-utils.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/unit/util/import-utils.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: tests-linux (20) / sqlite
  • GitHub Check: tests-linux (20) / mysql_mariadb
  • GitHub Check: tests-linux (18) / sap
  • GitHub Check: tests-linux (20) / sap
  • GitHub Check: tests-linux (20) / mssql
  • GitHub Check: tests-linux (18) / oracle
  • GitHub Check: tests-linux (20) / cockroachdb
  • GitHub Check: tests-linux (18) / mysql_mariadb
  • GitHub Check: tests-linux (18) / mssql
  • GitHub Check: tests-linux (18) / postgres (17-3.5)
  • GitHub Check: tests-linux (18) / postgres (14-3.5)
  • GitHub Check: tests-windows / sqljs
  • GitHub Check: tests-windows / sqlite
  • GitHub Check: tests-windows / better-sqlite3
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jul 25, 2025

typeorm-sql-js-example

npm i https://pkg.pr.new/typeorm/typeorm@11580

commit: 19002e7

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/util/ImportUtils.ts (1)

76-81: Consider adding error handling for JSON parsing edge cases.

While the try-catch handles parsing errors, consider logging or providing more context about parsing failures for debugging purposes.

            try {
                const parsedPackage = JSON.parse(
                    await fs.readFile(potentialPackageJson, "utf8"),
                )
                // Cache the parsed package.json object and return it
                setPackageJsonCache(paths, parsedPackage)
                return parsedPackage
            } catch (error) {
+                // Log parsing errors for debugging (consider using a proper logger)
+                console.warn(`Failed to parse package.json at ${potentialPackageJson}:`, error.message)
                // If parsing fails, we still cache null to avoid repeated attempts
                setPackageJsonCache(paths, null)
                return null
            }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d57fe3b and 6022bca.

📒 Files selected for processing (2)
  • src/util/ImportUtils.ts (3 hunks)
  • test/github-issues/4136/issue-4136.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
test/github-issues/4136/issue-4136.ts (1)
src/util/ImportUtils.ts (1)
  • importOrRequireFile (5-41)
🔇 Additional comments (4)
src/util/ImportUtils.ts (3)

45-49: LGTM! Clean helper function implementation.

The helper function correctly updates the cache for multiple paths efficiently.


53-53: Good approach to track all traversed paths for batch caching.

This ensures that all directories in the traversal path benefit from the cache, not just the final result directory.


83-85: Good strategy to cache null results to prevent repeated failed attempts.

This prevents redundant filesystem operations for directories that don't contain valid package.json files.

test/github-issues/4136/issue-4136.ts (1)

11-18: LGTM! Proper test setup and teardown.

Good use of Sinon stubs with proper cleanup in afterEach.

Comment on lines 20 to 57
it("Should use cache to find package.json", async () => {
assert.equal(
statStub.callCount,
0,
"stat should not be called before importOrRequireFile",
)
assert.equal(
readFileStub.callCount,
0,
"readFile should not be called before importOrRequireFile",
)

const filePath1 = path.join(__dirname, "file1.js")
const filePath2 = path.join(__dirname, "file2.js")
const filePath3 = path.join(__dirname, "file3.js")

// Trigger the first import to create the cache
await importOrRequireFile(filePath1)

// Get the number of calls to stat and readFile after the first import
const numberOfStatCalls = statStub.callCount
const numberOfReadFileCalls = readFileStub.callCount

// Trigger next imports to check if cache is used
await importOrRequireFile(filePath2)
await importOrRequireFile(filePath3)

assert.equal(
statStub.callCount,
numberOfStatCalls,
"stat should be called only during the first import",
)
assert.equal(
readFileStub.callCount,
numberOfReadFileCalls,
"readFile should be called only during the first import",
)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance test robustness by configuring stub behavior and testing edge cases.

The test doesn't configure stub behavior, which might cause the actual importOrRequireFile to fail. Consider adding stub implementations and testing additional scenarios.

it("Should use cache to find package.json", async () => {
+    // Configure stub behavior to simulate successful package.json detection
+    statStub.callsFake(async (filePath) => ({
+        isFile: () => filePath.endsWith('package.json')
+    }))
+    readFileStub.callsFake(async (filePath) => '{"type": "commonjs"}')
+    
    assert.equal(
        statStub.callCount,
        0,
        "stat should not be called before importOrRequireFile",
    )
    // ... rest of test
})

+it("Should handle cache misses correctly", async () => {
+    // Test behavior when package.json doesn't exist
+    statStub.rejects(new Error('ENOENT'))
+    
+    const filePath = path.join(__dirname, "nonexistent", "file.js")
+    await importOrRequireFile(filePath)
+    
+    // Verify null is cached for failed lookups
+    const initialCalls = statStub.callCount
+    await importOrRequireFile(filePath)
+    assert.equal(statStub.callCount, initialCalls)
+})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
it("Should use cache to find package.json", async () => {
assert.equal(
statStub.callCount,
0,
"stat should not be called before importOrRequireFile",
)
assert.equal(
readFileStub.callCount,
0,
"readFile should not be called before importOrRequireFile",
)
const filePath1 = path.join(__dirname, "file1.js")
const filePath2 = path.join(__dirname, "file2.js")
const filePath3 = path.join(__dirname, "file3.js")
// Trigger the first import to create the cache
await importOrRequireFile(filePath1)
// Get the number of calls to stat and readFile after the first import
const numberOfStatCalls = statStub.callCount
const numberOfReadFileCalls = readFileStub.callCount
// Trigger next imports to check if cache is used
await importOrRequireFile(filePath2)
await importOrRequireFile(filePath3)
assert.equal(
statStub.callCount,
numberOfStatCalls,
"stat should be called only during the first import",
)
assert.equal(
readFileStub.callCount,
numberOfReadFileCalls,
"readFile should be called only during the first import",
)
})
it("Should use cache to find package.json", async () => {
// Configure stub behavior to simulate successful package.json detection
statStub.callsFake(async (filePath) => ({
isFile: () => filePath.endsWith('package.json')
}))
readFileStub.callsFake(async (filePath) => '{"type": "commonjs"}')
assert.equal(
statStub.callCount,
0,
"stat should not be called before importOrRequireFile",
)
assert.equal(
readFileStub.callCount,
0,
"readFile should not be called before importOrRequireFile",
)
const filePath1 = path.join(__dirname, "file1.js")
const filePath2 = path.join(__dirname, "file2.js")
const filePath3 = path.join(__dirname, "file3.js")
// Trigger the first import to create the cache
await importOrRequireFile(filePath1)
// Get the number of calls to stat and readFile after the first import
const numberOfStatCalls = statStub.callCount
const numberOfReadFileCalls = readFileStub.callCount
// Trigger next imports to check if cache is used
await importOrRequireFile(filePath2)
await importOrRequireFile(filePath3)
assert.equal(
statStub.callCount,
numberOfStatCalls,
"stat should be called only during the first import",
)
assert.equal(
readFileStub.callCount,
numberOfReadFileCalls,
"readFile should be called only during the first import",
)
})
it("Should handle cache misses correctly", async () => {
// Test behavior when package.json doesn't exist
statStub.rejects(new Error('ENOENT'))
const filePath = path.join(__dirname, "nonexistent", "file.js")
await importOrRequireFile(filePath)
// Verify null is cached for failed lookups
const initialCalls = statStub.callCount
await importOrRequireFile(filePath)
assert.equal(statStub.callCount, initialCalls)
})
🤖 Prompt for AI Agents
In test/github-issues/4136/issue-4136.ts around lines 20 to 57, the test lacks
stub behavior configuration, risking failures when calling importOrRequireFile.
Fix this by setting up appropriate stub implementations for stat and readFile to
simulate expected file system responses. Additionally, extend the test to cover
edge cases such as missing files or read errors to improve robustness.

Comment on lines 32 to 34
const filePath1 = path.join(__dirname, "file1.js")
const filePath2 = path.join(__dirname, "file2.js")
const filePath3 = path.join(__dirname, "file3.js")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider testing files from different directories to verify caching behavior.

All test files are from the same directory (__dirname). Consider adding test cases with files from different directories to ensure the cache works correctly across directory boundaries.

const filePath1 = path.join(__dirname, "file1.js")
const filePath2 = path.join(__dirname, "file2.js")
const filePath3 = path.join(__dirname, "file3.js")
+
+// Test files from different directories
+const filePath4 = path.join(__dirname, "subdir", "file4.js")
+const filePath5 = path.join(path.dirname(__dirname), "file5.js")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In test/github-issues/4136/issue-4136.ts around lines 32 to 34, the test files
are all located in the same directory (__dirname), which limits coverage of
caching behavior across directories. Add additional test files located in
different directories by creating or referencing files in separate folder paths
and include them in the test cases to verify that caching works correctly across
directory boundaries.

@coveralls
Copy link

coveralls commented Jul 25, 2025

Coverage Status

coverage: 76.349% (-0.09%) from 76.44%
when pulling 19002e7 on rutkowskib:perf-4136
into d57fe3b on typeorm:master.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing tests for ImportUtils are placed in test/unit/util/import-utils.ts.

Would be great to have them there, since we are trying to move away from per-issue tests (which are executed at least once for every database driver).

If you consider that multiple files are better, you can create the import-utils folder and move the existing tests there, together with this file (with appropriate names/test descriptions, of course)

Move tests and use Map instead of object as per review comments
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/unit/util/import-utils.ts (2)

4-6: Consider more direct import for fs.promises.

The imports look good for the test requirements. However, the fsAsync import could be more specific since you're only using fsAsync.promises.

Consider this more direct approach:

-import fsAsync from "fs"
+import { promises as fsPromises } from "fs"

Then update the stub calls accordingly:

-const statStub = sinon.stub(fsAsync.promises, "stat")
-const readFileStub = sinon.stub(fsAsync.promises, "readFile")
+const statStub = sinon.stub(fsPromises, "stat")
+const readFileStub = sinon.stub(fsPromises, "readFile")

214-227: Consider verifying functionality alongside cache behavior.

The test effectively verifies that caching prevents redundant filesystem operations, but it doesn't confirm that the cached results are functionally correct.

Consider adding a functional verification after the cache assertions:

 assert.equal(
     readFileStub.callCount,
     numberOfReadFileCalls,
     "readFile should be called only during the first import",
 )
+
+// Verify that caching doesn't break functionality
+const [exports1, moduleType1] = await importOrRequireFile(filePath1)
+const [exports2, moduleType2] = await importOrRequireFile(filePath2)
+const [exports3, moduleType3] = await importOrRequireFile(filePath3)
+
+// All should have the same module type since they're in the same directory
+assert.equal(moduleType1, moduleType2, "Module types should be consistent")
+assert.equal(moduleType2, moduleType3, "Module types should be consistent")

This ensures the cache doesn't just reduce calls but also maintains correct behavior.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6022bca and 6f458f9.

📒 Files selected for processing (2)
  • src/util/ImportUtils.ts (3 hunks)
  • test/unit/util/import-utils.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/util/ImportUtils.ts
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: alumni
PR: typeorm/typeorm#11580
File: src/util/ImportUtils.ts:58-62
Timestamp: 2025-07-26T12:22:52.215Z
Learning: In TypeORM's codebase, `importOrRequireFile` is always called with resolved file paths. For example, `DirectoryExportedClassesLoader.ts` calls `PlatformTools.pathResolve(file)` before passing the path to `importOrRequireFile`, and `PlatformTools.pathResolve()` is a wrapper around Node.js `path.resolve()`. This means the cache keys in `getNearestPackageJson` will always be absolute paths, eliminating concerns about cache consistency between relative and absolute path formats.
test/unit/util/import-utils.ts (1)

Learnt from: alumni
PR: #11580
File: src/util/ImportUtils.ts:58-62
Timestamp: 2025-07-26T12:22:52.215Z
Learning: In TypeORM's codebase, importOrRequireFile is always called with resolved file paths. For example, DirectoryExportedClassesLoader.ts calls PlatformTools.pathResolve(file) before passing the path to importOrRequireFile, and PlatformTools.pathResolve() is a wrapper around Node.js path.resolve(). This means the cache keys in getNearestPackageJson will always be absolute paths, eliminating concerns about cache consistency between relative and absolute path formats.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: tests-linux (20) / mongodb
  • GitHub Check: tests-linux (20) / postgres (14-3.5)
  • GitHub Check: tests-linux (20) / sqlite
  • GitHub Check: tests-linux (20) / oracle
  • GitHub Check: tests-linux (20) / better-sqlite3
  • GitHub Check: tests-linux (20) / postgres (17-3.5)
  • GitHub Check: tests-linux (18) / sap
  • GitHub Check: tests-linux (18) / postgres (17-3.5)
  • GitHub Check: tests-linux (18) / postgres (14-3.5)
  • GitHub Check: tests-linux (18) / oracle
  • GitHub Check: tests-linux (18) / mysql_mariadb
  • GitHub Check: tests-linux (18) / mongodb
  • GitHub Check: tests-linux (18) / mssql
  • GitHub Check: tests-linux (18) / better-sqlite3
  • GitHub Check: tests-linux (18) / sqlite
  • GitHub Check: tests-windows / better-sqlite3
  • GitHub Check: tests-windows / sqljs
  • GitHub Check: tests-windows / sqlite
  • GitHub Check: Analyze (javascript-typescript)

@rutkowskib rutkowskib requested a review from alumni July 28, 2025 11:54
await importOrRequireFile(filePath1)

// Get the number of calls to stat and readFile after the first import
const numberOfStatCalls = statStub.callCount
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add a check of the actual number of calls to see that the call is actually happening before caching. e.g. not store a statStub.callCount into a variable but use a constant number instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case to see if it (the file read) is working at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that work for you if I added a check that numberOfStatCalls isnt a 0 instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be checked against a constant value. Can you check what number is returned by statStub.callCount? Enter that value into the assertion instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are now failing. I think it is because they are built differently on my local machine then in CI. What is your suggestion? Shall I create package.json in the test? Shall I revert?

Copy link
Contributor Author

@rutkowskib rutkowskib Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will break running tests on local though

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, revert the change, but leave a comment so we will look at it later

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I didn't think of such workaround, but I guess it's better than leaving it unknown.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case if you can't fix it, you can revert to the point it was working on both GitHub Actions and locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed them, you can re review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/unit/util/import-utils.ts (1)

184-265: Make the cache test robust: isolate filesystem, drop hard-coded counts, and guarantee cleanup

  • Writing package.json and test files into __dirname risks cross-test interference and leaves residue on failures.
  • Hard-coding first-call counts (stat === 1, readFile === 1 || 2) is brittle across OS/Node versions and CI vs. local.
  • Ensure teardown with try/finally, and restore Sinon even on failure.

This refactor isolates the test into a dedicated temp dir, asserts that first import hits the filesystem (without hard-coding exact counts), verifies no further calls on subsequent imports, and guarantees cleanup.

Apply this diff:

@@
-    it("Should use cache to find package.json", async () => {
-        // Create package.json if not exists
-        const packageJsonPath = path.join(__dirname, "package.json")
-        const packageJsonAlreadyExisted = fsAsync.existsSync(packageJsonPath)
-        if (!packageJsonAlreadyExisted) {
-            await fs.writeFile(
-                packageJsonPath,
-                JSON.stringify({ type: "module" }),
-                "utf8",
-            )
-        }
-
-        const statSpy = sinon.spy(fsAsync.promises, "stat")
-        const readFileSpy = sinon.spy(fsAsync.promises, "readFile")
-
-        assert.equal(
-            statSpy.callCount,
-            0,
-            "stat should not be called before importOrRequireFile",
-        )
-        assert.equal(
-            readFileSpy.callCount,
-            0,
-            "readFile should not be called before importOrRequireFile",
-        )
-
-        const filePath1 = path.join(__dirname, "file1.js")
-        const filePath2 = path.join(__dirname, "file2.js")
-        const filePath3 = path.join(__dirname, "file3.js")
-
-        await fs.writeFile(filePath1, "", "utf8")
-        await fs.writeFile(filePath2, "", "utf8")
-        await fs.writeFile(filePath3, "", "utf8")
-
-        // Trigger the first import to create the cache
-        await importOrRequireFile(filePath1)
-
-        // Get the number of calls to stat and readFile after the first import
-        const numberOfStatCalls = statSpy.callCount
-        const numberOfReadFileCalls: number = readFileSpy.callCount
-
-        assert.equal(
-            numberOfStatCalls,
-            1,
-            "stat should be called for the first import",
-        )
-
-        const isCorrectNumberOfReadFileCalls =
-            numberOfReadFileCalls === 1 || numberOfReadFileCalls === 2 // Depending on platform, it might call readFile once or twice
-        assert.equal(
-            isCorrectNumberOfReadFileCalls,
-            true,
-            "readFile should be called for the first import",
-        )
-
-        // Trigger next imports to check if cache is used
-        await importOrRequireFile(filePath2)
-        await importOrRequireFile(filePath3)
-
-        assert.equal(
-            statSpy.callCount,
-            numberOfStatCalls,
-            "stat should be called only during the first import",
-        )
-        assert.equal(
-            readFileSpy.callCount,
-            numberOfReadFileCalls,
-            "readFile should be called only during the first import",
-        )
-
-        // Clean up test files
-        await fs.unlink(filePath1)
-        await fs.unlink(filePath2)
-        await fs.unlink(filePath3)
-
-        // If package.json was created by this test, remove it
-        if (!packageJsonAlreadyExisted) {
-            await fs.unlink(packageJsonPath)
-        }
-
-        sinon.restore()
-    })
+    it("Should use cache to find package.json", async () => {
+        // Use an isolated directory to avoid interfering with other tests
+        const testDir = path.join(__dirname, "testCacheValidation")
+        await fs.rm(testDir, { recursive: true, force: true })
+        await fs.mkdir(testDir, { recursive: true })
+
+        // Create a dedicated package.json used only by this test
+        const packageJsonPath = path.join(testDir, "package.json")
+        await fs.writeFile(
+            packageJsonPath,
+            JSON.stringify({ type: "module" }),
+            "utf8",
+        )
+
+        // Spy on fs calls that getNearestPackageJson performs
+        const statSpy = sinon.spy(fsAsync.promises, "stat")
+        const readFileSpy = sinon.spy(fsAsync.promises, "readFile")
+
+        try {
+            assert.equal(
+                statSpy.callCount,
+                0,
+                "stat should not be called before importOrRequireFile",
+            )
+            assert.equal(
+                readFileSpy.callCount,
+                0,
+                "readFile should not be called before importOrRequireFile",
+            )
+
+            const filePath1 = path.join(testDir, "file1.js")
+            const filePath2 = path.join(testDir, "file2.js")
+            const filePath3 = path.join(testDir, "file3.js")
+
+            await fs.writeFile(filePath1, "", "utf8")
+            await fs.writeFile(filePath2, "", "utf8")
+            await fs.writeFile(filePath3, "", "utf8")
+
+            // Trigger the first import to create/populate the cache
+            await importOrRequireFile(filePath1)
+
+            // Record counts after the first import
+            const numberOfStatCalls = statSpy.callCount
+            const numberOfReadFileCalls = readFileSpy.callCount
+
+            // Only assert that we touched the filesystem at least once (no hard-coded counts)
+            assert.ok(
+                numberOfStatCalls > 0,
+                "stat should be called during the first import",
+            )
+            assert.ok(
+                numberOfReadFileCalls > 0,
+                "readFile should be called during the first import",
+            )
+
+            // Subsequent imports should hit the cache (no extra fs calls)
+            await importOrRequireFile(filePath2)
+            await importOrRequireFile(filePath3)
+
+            assert.equal(
+                statSpy.callCount,
+                numberOfStatCalls,
+                "stat should not be called after the first import (cache hit)",
+            )
+            assert.equal(
+                readFileSpy.callCount,
+                numberOfReadFileCalls,
+                "readFile should not be called after the first import (cache hit)",
+            )
+        } finally {
+            sinon.restore()
+            await fs.rm(testDir, { recursive: true, force: true })
+        }
+    })
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a8649c5 and 76b1363.

📒 Files selected for processing (1)
  • test/unit/util/import-utils.ts (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: alumni
PR: typeorm/typeorm#11580
File: src/util/ImportUtils.ts:58-62
Timestamp: 2025-07-26T12:22:52.215Z
Learning: In TypeORM's codebase, `importOrRequireFile` is always called with resolved file paths. For example, `DirectoryExportedClassesLoader.ts` calls `PlatformTools.pathResolve(file)` before passing the path to `importOrRequireFile`, and `PlatformTools.pathResolve()` is a wrapper around Node.js `path.resolve()`. This means the cache keys in `getNearestPackageJson` will always be absolute paths, eliminating concerns about cache consistency between relative and absolute path formats.
📚 Learning: 2025-07-26T12:22:52.215Z
Learnt from: alumni
PR: typeorm/typeorm#11580
File: src/util/ImportUtils.ts:58-62
Timestamp: 2025-07-26T12:22:52.215Z
Learning: In TypeORM's codebase, `importOrRequireFile` is always called with resolved file paths. For example, `DirectoryExportedClassesLoader.ts` calls `PlatformTools.pathResolve(file)` before passing the path to `importOrRequireFile`, and `PlatformTools.pathResolve()` is a wrapper around Node.js `path.resolve()`. This means the cache keys in `getNearestPackageJson` will always be absolute paths, eliminating concerns about cache consistency between relative and absolute path formats.

Applied to files:

  • test/unit/util/import-utils.ts
🧬 Code Graph Analysis (1)
test/unit/util/import-utils.ts (1)
src/util/ImportUtils.ts (1)
  • importOrRequireFile (5-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: tests-linux (20) / mongodb
  • GitHub Check: tests-linux (20) / sqlite
  • GitHub Check: tests-linux (18) / mysql_mariadb
  • GitHub Check: tests-linux (18) / postgres (17-3.5)
  • GitHub Check: tests-linux (18) / postgres (14-3.5)
  • GitHub Check: tests-linux (18) / oracle
  • GitHub Check: tests-linux (18) / mssql
  • GitHub Check: tests-linux (18) / sqlite
  • GitHub Check: tests-linux (18) / better-sqlite3
  • GitHub Check: tests-linux (18) / sqljs
  • GitHub Check: tests-windows / sqlite
  • GitHub Check: tests-linux (18) / mongodb
  • GitHub Check: tests-windows / sqljs
  • GitHub Check: tests-windows / better-sqlite3
  • GitHub Check: Analyze (javascript-typescript)

Comment on lines +4 to +6
import { strict as assert } from "assert"
import sinon from "sinon"
import fsAsync from "fs"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Ensure spies intercept the actual fs methods used by getNearestPackageJson

You're spying on fsAsync.promises.stat/readFile (from fs). If getNearestPackageJson uses fs/promises, those calls might not always be captured across Node versions. This can make the test flaky or silently invalid.

  • Keep spying on fsAsync.promises to avoid counting your own fs.writeFile calls.
  • Additionally verify which fs variant is imported in src/util/ImportUtils.ts and confirm your spies actually observe those calls.

Run this to confirm which module is used inside ImportUtils:

If it uses fs/promises, consider also spying on that module to harden the test without counting writes:

-const statSpy = sinon.spy(fsAsync.promises, "stat")
-const readFileSpy = sinon.spy(fsAsync.promises, "readFile")
+const statSpy = sinon.spy(fsAsync.promises, "stat")
+const readFileSpy = sinon.spy(fsAsync.promises, "readFile")
+// Also capture calls made via `fs/promises` if ImportUtils uses it:
+// (won't count your writes since you only spy on readFile/stat)
+const statSpyFsPromises = sinon.spy(fs, "stat")
+const readFileSpyFsPromises = sinon.spy(fs, "readFile")

🏁 Script executed:

#!/bin/bash
# Show how fs is imported in ImportUtils.ts
rg -n -C2 -P 'from\s+"fs(/promises)?"' src/util/ImportUtils.ts

Length of output: 159


Align spies with fs/promises import

In test/unit/util/import-utils.ts, ImportUtils uses import fs from "fs/promises", but the test currently imports the callback API (fs) and spies on fsAsync.promises. Update the test to import and spy on the promises API directly:

• File: test/unit/util/import-utils.ts

  • Change the import:
    - import fsAsync from "fs"
    + import fsAsync from "fs/promises"
  • Update the spies:
    - const statSpy = sinon.spy(fsAsync.promises, "stat")
    - const readFileSpy = sinon.spy(fsAsync.promises, "readFile")
    + const statSpy = sinon.spy(fsAsync, "stat")
    + const readFileSpy = sinon.spy(fsAsync, "readFile")

This guarantees your spies intercept the exact stat/readFile calls from the same module used by getNearestPackageJson without counting any other API usage.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import { strict as assert } from "assert"
import sinon from "sinon"
import fsAsync from "fs"
// At the top of test/unit/util/import-utils.ts
import { strict as assert } from "assert"
import sinon from "sinon"
import fsAsync from "fs/promises"
// … later in the test where you set up your spies:
const statSpy = sinon.spy(fsAsync, "stat")
const readFileSpy = sinon.spy(fsAsync, "readFile")
🤖 Prompt for AI Agents
In test/unit/util/import-utils.ts around lines 4 to 6, the test imports the
callback API (fs) and spies on fsAsync.promises, but the code under test imports
fs/promises, so update the test to import the promises API directly (e.g. import
fsPromises from "fs/promises") and change any spies/stubs to target
fsPromises.stat and fsPromises.readFile (or use sinon.stub(fsPromises, "stat") /
sinon.stub(fsPromises, "readFile")) instead of spying on fsAsync.promises;
ensure you restore those stubs after the test.

@rutkowskib rutkowskib requested a review from OSA413 August 16, 2025 23:21
it("Should use cache to find package.json", async () => {
// Create package.json if not exists
const packageJsonPath = path.join(__dirname, "package.json")
const packageJsonAlreadyExisted = fsAsync.existsSync(packageJsonPath)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, can you describe why sometimes the package.json is created and sometimes it's not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In CI it does exist in the folder that test is run from. On local it doesnt. I dont fully know why it's like that, you would have to ask somebody who knows more about how tests are run in CI.

@alumni alumni merged commit b6ffd46 into typeorm:master Aug 18, 2025
58 checks passed
ThbltLmr pushed a commit to ThbltLmr/typeorm that referenced this pull request Dec 2, 2025
…ations (typeorm#11580)

* perf: Cache package.json location between invocation

Cache package.json location to improve performance of migrations with a lot of files

Closes: typeorm#4136

* refactor: Use map. Move tests to appropriate files

Move tests and use Map instead of object as per review comments

* test: Check number of invocations in test

Change test to assert number of stat and readFile calls

* test: Change assert for CI

Added assertion to make both local and CI work

* Create package.json in test

* Create file only if not existed before

* test: Fix test assertion based on platform

* test: Change package.json type

* ci: Trigger tests

---------

Co-authored-by: Bartlomiej Rutkowski <brutkowski@tilt.app>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migration performance correlated to number of migration files

7 participants