Use dedicated cache keys instead of relying on an absolute path#51669
Use dedicated cache keys instead of relying on an absolute path#51669tlrx merged 3 commits intoelastic:feature/searchable-snapshotsfrom
Conversation
|
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
DaveCTurner
left a comment
There was a problem hiding this comment.
Looks good, I left a few minor comments/questions.
| private final String fileName; | ||
| private final long fileLength; | ||
|
|
||
| CacheKey(SnapshotId snapshotId, IndexId indexId, ShardId shardId, Path cacheDir, String fileName, long fileLength) { |
There was a problem hiding this comment.
Could we have an EqualsHashCodeTestUtils#checkEqualsAndHashCode-based test for this?
| private final ShardId shardId; | ||
| private final Path cacheDir; | ||
| private final String fileName; | ||
| private final long fileLength; |
There was a problem hiding this comment.
It feels strange to have the length as part of the key. I kinda see that it's here to make some plumbing a bit simpler, but I think I'd still rather move it back out again.
There was a problem hiding this comment.
Now I look at it again I agree it feels strange. I reverted this.
| private final SnapshotId snapshotId; | ||
| private final IndexId indexId; | ||
| private final ShardId shardId; | ||
| private final Path cacheDir; |
There was a problem hiding this comment.
Similarly cacheDir. In my mind the thing in the cache is the Lucene file, which is uniquely identified by snapshot/index/shard/filename, and cacheDir is a bit of an implementation detail that I feel should be hidden here.
|
Thanks @DaveCTurner, I've applied your feedback. |
DaveCTurner
left a comment
There was a problem hiding this comment.
LGTM; I left one additional optional request.
| // directory has been closed. | ||
| cacheService.removeFromCache(key -> key.startsWith(cacheDir.toString())); | ||
| cacheService.removeFromCache(key -> | ||
| Objects.equals(key.getSnapshotId(), snapshotId) && |
There was a problem hiding this comment.
Maybe move this lambda to a method CacheKey#belongsTo(SnapshotId, IndexId, ShardId) (not sure about the name but that's the best I've got)
There was a problem hiding this comment.
Good idea. I kept the name and added a test in e8afbe7
|
Thanks David! |
Today cache files are identified in cache using a string representing an absolute path to a file on disk. This path is a sub directory of the current shard data path and as such already contains identification bits like the current index id and the shard id. It also contains the snapshot id that is passed at
CacheDirectorycreation time.While this has been done for quick prototyping and already been improved in #51520, it feels wrong to rely on a path converted to a string as cache keys. Instead we should have a distinct
CacheKeyobject to identifyCacheFilein cache.Relates #50693 (comment)