engine: Support reading objects without decoding #2753
Conversation
Force-pushed 0ceb48b to 8654d4f
Codecov Report — Attention: Patch coverage is

```
@@            Coverage Diff             @@
##           master    #2753      +/-  ##
==========================================
+ Coverage   21.89%   22.03%   +0.14%
==========================================
  Files         787      787
  Lines       46643    46763     +120
==========================================
+ Hits        10211    10304      +93
- Misses      35555    35576      +21
- Partials      877      883       +6
```
carpawell
left a comment
BTW, not valuable to me as a separate PR. It's not clear why we need it, the operation params are questionable, and the issue does not describe why this should be done (it appeared earlier than the one we are trying to resolve now).
```go
	return nil, fmt.Errorf("stat object file %q: %w", p, err)
}
sz := fi.Size()
if sz > math.MaxInt {
```
Only from garbage in the storage maybe.
Otherwise the data cannot be fully read.
But how can an int64 be bigger than MaxInt? Do we target such systems? Can we run on them?
On 32-bit systems.

> do we target such systems? can we run on such systems?

Yes, yes.
For me it is no, no. I had to google for examples of such machines (and the years they were produced); also, it seems you would need a file/object bigger than such a machine can have RAM.
> machines (meaning years of their production)

This may be at the OS level.

Anyway, I don't see such a limitation in either the docs or the code.
```go
	b = make([]byte, sz)
}

_, err = io.ReadFull(f, b)
```
You mean return an io.Reader? No need for now; it may be added later.
We have big problems with memory; I thought that extending the storage interface with raw data access should be done mainly for memory optimizations.
> for memory optimizations

It is: as I mentioned in the commit, in some cases we want to skip the decode-encode round.
I mean, reading a full object into memory at once does not look good to me in a PR that tries to beat memory problems.
> PR that tries to beat memory problems

It doesn't. The main commit says what problem it solves: the excessive decode-encode round. I tried to explain it in the commit. This is 100%-needed functionality for any storage IMO: just get what you store as is.
roman-khimov
left a comment
We need better coverage as well.
Force-pushed 8654d4f to 06a5d3b
Increased test coverage. Fixed an old bug on the way.
Force-pushed b306b0d to 5123eda
We need some broader context for this. It looks like the beginning of a custom memory allocator, which just shouldn't be done. We can mmap things, we can stream things, we can pool some chunks (but that doesn't work well for objects of different sizes, unfortunately). If nothing else works, in-place allocation is an option too (which we had for years anyway, just in a bit different form), especially given that it will already be more effective (one flat buffer, no decoding).
OK, let's use `make` only for now.
Force-pushed 5123eda to cf86ba6
Dropped the options.
Previously, the local object storage engine always decoded the requested objects. In cases where it was necessary to obtain a binary object, an excessive decoding-encoding round was performed. Now a `GetBytes` method to read an object's binary form is provided. Closes #1723. Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
Previously, `Shard.Get*` methods returned an unset meta presence flag when an object was requested and found in the metabase. The documentation did not explain why. Callers do not isolate the case of a cache hit (it is not visible from the return value anyway). Overall, this behavior was recognized as incorrect and fixed. Refs 69e1e6c. Signed-off-by: Leonard Lyubich <leonard@morphbits.io>
Force-pushed cf86ba6 to 09e385c
carpawell
left a comment
Let's try it, but IMO we need an object-streaming RPC for a correct optimization.

These are two different topics. Here we cover redundant decoding, while you're talking about smaller bufferization (which is also good to do, but unrelated to the current PR/issue).
Not different to me; the main idea is the same.

I'm talking about lower memory consumption and/or fewer allocations. An RPC that streams an object with partial reading may improve the important metrics by orders of magnitude, while reading the whole object into memory and transferring it as a single message can only make things a few times better.