Add cache directory low-level instrumentation #51637
Add cache directory low-level instrumentation #51637tlrx merged 8 commits intoelastic:feature/searchable-snapshotsfrom
Conversation
|
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
DaveCTurner
left a comment
There was a problem hiding this comment.
Looks good. I like the simplicity of tracking these stats directly within the CacheDirectory. I left a few comments and thoughts.
...apshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/IndexInputStats.java
Outdated
Show resolved
Hide resolved
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
...apshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/IndexInputStats.java
Outdated
Show resolved
Hide resolved
|
Thanks @DaveCTurner, I've applied your feedback and pushed some changes. I'm sure you have more comments, can you please have another look? Thanks! |
| final String fileName = cacheFileReference.getFileName(); | ||
| final byte[] copyBuffer = new byte[Math.toIntExact(Math.min(COPY_BUFFER_SIZE, end - start))]; | ||
| try (IndexInput input = in.openInput(cacheFileReference.getFileName(), ioContext)) { | ||
| logger.trace(() -> new ParameterizedMessage("writing range [{}-{}] of file [{}] to cache file", start, end, fileName)); |
There was a problem hiding this comment.
Ah, I think this will not include the shard ID so we might struggle to interpret the logs when there are lots of shards to search. I think logging cacheFileReference itself gives us everything we need.
...napshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/CacheDirectory.java
Show resolved
Hide resolved
DaveCTurner
left a comment
There was a problem hiding this comment.
One request for a bit more logging detail, otherwise LGTM.
| private int readDirectly(long start, long end, byte[] buffer, int offset) throws IOException { | ||
| final String fileName = cacheFileReference.getFileName(); | ||
| final byte[] copyBuffer = new byte[Math.toIntExact(Math.min(COPY_BUFFER_SIZE, end - start))]; | ||
| logger.trace(() -> new ParameterizedMessage("direct reading of range [{}-{}] from file [{}]", start, end, fileName)); |
|
@elasticmachine run elasticsearch-ci/2 A CCR test failed previously with what seems a connection issue |
|
Thanks David! |
This commit builds on elastic#51637, adding tracking of the total time spent fetching data from the blob store and a linear regression model for these fetches.
This commit adds a REST API that exposes the various CacheDirectory stats added in #51637. It adds the necessary action, transport action and request and response objects as well as a new qa:rest project for REST tests. The REST endpoint is _searchable_snapshots/stats and can be filtered by index names.
This pull request is a first step to add instrumentation to the
CacheDirectoryadded in #50693.It adds a new mutable
IndexInputStatsobject that allows to track various information about howCacheBufferedIndexInputinteracts with the underlying cache file to satisfy the read operations. It keep tracks of small/large forward/backward seekings as well as the total number of bytes read from theIndexInputand the total number of bytes read/written from/to theCacheFile.Note that the stats do not reflect the exact usage that Lucene does of the
IndexInputs opened through aCacheDirectory:IndexInputStatsis not aware of the read operations that are directly served at a higher level by the internalBufferedIndexInput's buffer. Instead it tracks what really hit the disk which is, I think, what is the most interesting for us at this stage.This pull request does not expose the information through a REST API, this will be done in a follow up pull request once the low level instrumentation is validated.