The problematic pattern
Some plugins utilize a WeakMap keyed by SourceCode object as a way to share data between different rules for the same file.
The pattern looks like this:
type FileData = /* whatever */;
const cache = new WeakMap<SourceCode, FileData>();
function getDataForFile(context: Context): FileData {
const cachedData = cache.get(context.sourceCode);
if (cachedData !== undefined) return cachedData;
const data = calculateData(context);
cache.set(context.sourceCode, data);
return data;
}
const rule1 = {
create(context: Context) {
const data = getDataForFile(context);
// ...
},
};
const rule2 = {
create(context: Context) {
const data = getDataForFile(context);
// ...
},
};
Why this is a problem in Oxlint
ESLint creates a fresh SourceCode class instance for each file. That same object is the value of context.sourceCode in every rule running on that file, but each file gets a new SourceCode object.
Oxlint, on the other hand, has one singleton SOURCE_CODE object, which is reused for every file. All properties of SOURCE_CODE are getters, which return the right data for the file currently being linted.
This breaks the above pattern. Once the WeakMap has been populated with an entry, cache.get(context.sourceCode) will return the same data for every file. The cache never gets cleared, and every file gets the data which was calculated for the 1st file.
This problem was seen previously in eslint-plugin-panda (#16867, #19986). Now discovered again in eslint-plugin-mocha (#20611).
GitHub search suggests that eslint-plugin-prettier and SonarJS plugin are also affected (though weirdly SonarJS's tests all pass in our conformance tests). However, that search likely hasn't unearthed all the plugins that use this pattern.
Motivations for Oxlint's current design
The motivations for having a singleton SOURCE_CODE object:
- Oxlint aggressively recycles objects wherever possible, to reduce GC pressure.
- Getters are on the
SOURCE_CODE object itself, not a class prototype, making (I believe) every property access slightly cheaper as JS engine doesn't have to walk up the prototype chain.
- Getters allow lazy computation. For example, the lines array is not built unless a rule accesses
sourceCode.lines.
- Works well with our "alternative API" where in
createOnce each rule also has the same Context object across all files.
Note that point 3 is somewhat orthogonal - we could still use lazy getters if context.sourceCode was an instance of a SourceCode class.
Why the WeakMap pattern is not ideal
WeakMap in JS is pretty slow (as is Map). It's also not the best data structure for this use case because ESLint (and also Oxlint) only lints 1 file at a time, so the WeakMap only has 1 active element at any time.
A simpler and more efficient pattern like this would work equally well:
let cachedData: FileData | null = null;
function getDataForFile(context: Context): FileData {
if (cachedData === null) cachedData = calculateData(context);
return cachedData;
}
// Called when a file is finished linting
function onFileFinish() {
cachedData = null;
}
The problem is that ESLint does not offer any API to be notified when a file finishes linting, so no way to call onFileFinish at the right time.
In the absence of this API, the WeakMap pattern is the simplest option.
Why other patterns don't work
Cache keyed by file path
let cachedData: FileData | null = null;
let currentFilename: string | null = null;
function getDataForFile(context: Context): FileData {
if (context.filename === currentFilename) return cachedData!;
currentFilename = context.filename;
cachedData = calculateData(context);
return cachedData;
}
This would work in ESLint CLI, but wouldn't work in language server, because the same file can be linted over and over in quick succession.
This addition makes it work with language server:
let cachedData: FileData | null = null;
let currentFilename: string | null = null;
let resetTimerSet: boolean = false;
function getDataForFile(context: Context): FileData {
if (context.filename === currentFilename) return cachedData!;
currentFilename = context.filename;
cachedData = calculateData(context);
if (!resetTimerSet) {
queueMicrotask(resetCache);
resetTimerSet = true;
}
return cachedData;
}
function resetCache() {
cachedData = null;
currentFilename = null;
resetTimerSet = false;
}
resetCache gets called in next micro-tick, and clears the cache before next file is linted, so the next lint run on same file gets a clean cache.
This depends on assumption that all language servers have a micro-tick between linting files. From my research, this appears to be a solid assumption, but it's still a little fragile.
But it still has a problem! There is one circumstance in which the same file gets linted multiple times in succession in a synchronous loop: When applying auto-fixes, ESLint will by default lint the same file up to 10 times in succession in a tight loop. There's no async break between passes, so resetCache does not get called in between. 2nd pass will get stale cache from the 1st pass, even though the code has changed.
Manually track when files finish being linted
const rule1 = createRule({
create(context: Context) {
const data = getDataForFile(context);
// ...
},
});
const rule2 = createRule({
create(context: Context) {
const data = getDataForFile(context);
// ...
},
});
function createRule(rule: Rule): Rule {
const originalCreate = rule.create;
rule.create = function (context: Context): Visitor {
lintingStarted();
const program = context.sourceCode.ast;
const visitor = originalCreate.call(this, context);
// Add listener for when AST traversal completes
const originalOnCodePathEnd = visitor.onCodePathEnd;
visitor.onCodePathEnd = originalOnCodePathEnd == null
? function (_codePath, node) {
if (node === program) lintingFinished();
}
: function (codePath, node) {
originalOnCodePathEnd.call(this, codePath, node);
if (node === program) lintingFinished();
};
return visitor;
};
return rule;
}
let rulesRunningCount = 0;
function lintingStarted() {
// Clear cache on next micro-tick
if (rulesRunningCount === 0) queueMicrotask(resetCache);
// Increment counter
rulesRunningCount++;
}
function lintingFinished() {
// Decrement counter
rulesRunningCount--;
// If counter reached 0, all rules have finished running on this file
if (rulesRunningCount === 0) resetCache();
}
function resetCache() {
cachedData = null;
rulesRunningCount = 0;
}
Notes:
onCodePathEnd event on Program is the last call to the visitor for each file. It fires after Program:exit visitor.
- Queuing
resetCache on next micro-tick is required to handle case where a rule throws an error during AST traversal, in which case onCodePathEnd will not fire. The fallback handles clearing the cache in this case in language server.
This works in ESLint CLI, in language server, and when linting a file multiple times in a synchronous loop when applying auto-fixes. It would also work in Oxlint.
However, it has a major downside in Oxlint: In Oxlint we use a fast AST visitor in common case where no rules register listeners for CFG events. If any rule does register a CFG event listener, we have to use a much slower walker which passes over the AST twice. Adding an onCodePathEnd listener to the visitor will cause a major slowdown in Oxlint.
It's also a lot of fiddly code to ask plugin authors to implement, though we could wrap it in an NPM package.
The problematic pattern
Some plugins utilize a
WeakMapkeyed bySourceCodeobject as a way to share data between different rules for the same file.The pattern looks like this:
Why this is a problem in Oxlint
ESLint creates a fresh
SourceCodeclass instance for each file. That same object is the value ofcontext.sourceCodein every rule running on that file, but each file gets a newSourceCodeobject.Oxlint, on the other hand, has one singleton
SOURCE_CODEobject, which is reused for every file. All properties ofSOURCE_CODEare getters, which return the right data for the file currently being linted.This breaks the above pattern. Once the
WeakMaphas been populated with an entry,cache.get(context.sourceCode)will return the same data for every file. The cache never gets cleared, and every file gets the data which was calculated for the 1st file.This problem was seen previously in
eslint-plugin-panda(#16867, #19986). Now discovered again ineslint-plugin-mocha(#20611).GitHub search suggests that
eslint-plugin-prettierand SonarJS plugin are also affected (though weirdly SonarJS's tests all pass in our conformance tests). However, that search likely hasn't unearthed all the plugins that use this pattern.Motivations for Oxlint's current design
The motivations for having a singleton
SOURCE_CODEobject:SOURCE_CODEobject itself, not a class prototype, making (I believe) every property access slightly cheaper as JS engine doesn't have to walk up the prototype chain.sourceCode.lines.createOnceeach rule also has the sameContextobject across all files.Note that point 3 is somewhat orthogonal - we could still use lazy getters if
context.sourceCodewas an instance of aSourceCodeclass.Why the
WeakMappattern is not idealWeakMapin JS is pretty slow (as isMap). It's also not the best data structure for this use case because ESLint (and also Oxlint) only lints 1 file at a time, so theWeakMaponly has 1 active element at any time.A simpler and more efficient pattern like this would work equally well:
The problem is that ESLint does not offer any API to be notified when a file finishes linting, so no way to call
onFileFinishat the right time.In the absence of this API, the
WeakMappattern is the simplest option.Why other patterns don't work
Cache keyed by file path
This would work in ESLint CLI, but wouldn't work in language server, because the same file can be linted over and over in quick succession.
This addition makes it work with language server:
resetCachegets called in next micro-tick, and clears the cache before next file is linted, so the next lint run on same file gets a clean cache.This depends on assumption that all language servers have a micro-tick between linting files. From my research, this appears to be a solid assumption, but it's still a little fragile.
But it still has a problem! There is one circumstance in which the same file gets linted multiple times in succession in a synchronous loop: When applying auto-fixes, ESLint will by default lint the same file up to 10 times in succession in a tight loop. There's no async break between passes, so
resetCachedoes not get called in between. 2nd pass will get stale cache from the 1st pass, even though the code has changed.Manually track when files finish being linted
Notes:
onCodePathEndevent onProgramis the last call to the visitor for each file. It fires afterProgram:exitvisitor.resetCacheon next micro-tick is required to handle case where a rule throws an error during AST traversal, in which caseonCodePathEndwill not fire. The fallback handles clearing the cache in this case in language server.This works in ESLint CLI, in language server, and when linting a file multiple times in a synchronous loop when applying auto-fixes. It would also work in Oxlint.
However, it has a major downside in Oxlint: In Oxlint we use a fast AST visitor in common case where no rules register listeners for CFG events. If any rule does register a CFG event listener, we have to use a much slower walker which passes over the AST twice. Adding an
onCodePathEndlistener to the visitor will cause a major slowdown in Oxlint.It's also a lot of fiddly code to ask plugin authors to implement, though we could wrap it in an NPM package.