[Remote Segment Store] Add RemoteDirectory interface to copy segment files to/from remote store#3102
Conversation
fa3a260 to
9a6812d
Compare
|
❌ Gradle Check failure fa3a260fcd8fe5c27a8227545fe743a3339b4210 |
|
❌ Gradle Check failure 9a6812d9447c580d2b11deda20db9a09614ad5d9 |
|
Checking the failures. |
9a6812d to
91ebb0f
Compare
|
❌ Gradle Check failure 91ebb0f57a39d402ef5957b61496503744e1e36a |
Bukhtawar
left a comment
There was a problem hiding this comment.
Thanks Sachin, this is a good start, few minor sugestions/clarifications
server/src/main/java/org/opensearch/index/store/RemoteDirectory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Do we need to acquire a lock before segment uploads or can we support concurrent uploads as well
There was a problem hiding this comment.
As long as segments in the remote store are uploaded and downloaded for backup and restore part, I don't think locking is required. Once we start writing to remote store via IndexWriter, locking would be required.
Open to suggestions.
There was a problem hiding this comment.
If we extend BaseDirectory, it will provide locking functionality with the provided LockFactory.
There was a problem hiding this comment.
Lets say if we go with concurrent writes to local and remote store would we still need a copyBytes or writeByte
There was a problem hiding this comment.
Do we also use ByteBuffer for leveraging I/O optimizations?
There was a problem hiding this comment.
Lets say if we go with concurrent writes to local and remote store would we still need a
copyBytesorwriteByte
For concurrent writes, we will need writeByte/s
Do we also use
ByteBufferfor leveraging I/O optimizations?
Right. Current implementation may not be optimized. It also depends on how a given repository implementation is handling writeBlob.
We will be having a continuous performance test setup which will provide insight into the performance issues.
There was a problem hiding this comment.
Another case, let's say it's not concurrent, but we want writes to only happen on remote which can be directly searchable we might want to just do writeBytes to remote directly
There was a problem hiding this comment.
Right, the idea is to open up the methods as per the requirement. As you said, writeByte/writeBytes methods will be required when IndexWriter is using RemoteDirectory to create the segments (concurrent writes or writing only to remote store).
There was a problem hiding this comment.
We also have a VerifyingIndexInput we might want to consider that handles the checksums
There was a problem hiding this comment.
Yes, either VerifyingIndexInput or ChecksumIndexInput can be used. I have added it as a ToDo item in the javadoc of the class.
There was a problem hiding this comment.
Issue is created for Checksum. Tagging this comment on the issue.
Signed-off-by: Sachin Kale <kalsac@amazon.com>
91ebb0f to
d11d190
Compare
|
You have failing tests here, do you still want to merge? (feature branch) |
This seem to be like a flaky test. There was a fix provided for it as a part of #2110. As this feature-branch was created before the fix, test is failing. Changes in this PR should not impact any existing tests. |
|
start gradle check |
|
start gradle check |
|
@sachinpkale @dblock I have rebased the feature branch from main and tried triggering a fresh build |
|
start gradle check |
2 similar comments
|
start gradle check |
|
start gradle check |
|
start gradle check try one more time. |
|
Checked the logs of last run. The failing test is: |
|
start gradle check |
|
Does this needs to be backported to |
@peterzhuamazon This would be a 3.0 change. |
…ore (opensearch-project#3102) Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
…ore (opensearch-project#3102) Signed-off-by: Sachin Kale <kalsac@amazon.com> Co-authored-by: Sachin Kale <kalsac@amazon.com>
Description
As described in the #2700, all the read/write operations to remote segment store will happen via
RemoteDirectoryinterface. This change adds the RemoteDirectory interface along with correspondingRemoteIndexInput(extends IndexInput) andRemoteIndexOutput(extends IndexOutput) classes.Issues Resolved
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.