Remove Specific Atomic Write Method from BlobContainer #43329
original-brownbear wants to merge 2 commits into elastic:master from original-brownbear:remove-non-atomic-write
Conversation
original-brownbear
commented
Jun 18, 2019
- Extracted from Remove BlobContainer APIs that Are not Universally Applicable #42886
- Having both `write` and `writeAtomic` makes no sense. On cloud providers all writes are atomic anyway, and on FS and HDFS the added cost of a move and a directory fsync operation is hardly relevant.
- On the other hand, allowing partial writes (though extremely unlikely in practice) simply has the potential of creating more garbage blobs that aren't even deserializable. This change neatly ensures that every blob with a non-temp name has been fully written and fsynced (in the case of NFS + HDFS).
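To make the mechanism under discussion concrete, here is a minimal sketch of the temp-blob-then-rename pattern the description refers to. This is not the Elasticsearch implementation; the class and method names are hypothetical, and it only illustrates the sequence: write to a `temp-` name, fsync the data, atomically move it to the final name, then best-effort fsync the parent directory so readers never observe a partially written blob.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Illustrative sketch only, not the actual BlobContainer code. */
public final class AtomicBlobWrite {

    public static void writeAtomic(Path dir, String blobName, byte[] data) throws IOException {
        final Path tempBlob = dir.resolve("temp-" + blobName);
        try (FileChannel channel = FileChannel.open(tempBlob,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            channel.write(ByteBuffer.wrap(data));
            channel.force(true); // fsync the blob contents before the rename
        }
        // The atomic rename makes the fully written blob visible under its final name.
        Files.move(tempBlob, dir.resolve(blobName), StandardCopyOption.ATOMIC_MOVE);
        // On POSIX filesystems the rename itself must also be made durable by
        // fsyncing the parent directory; this is not supported on every platform.
        try (FileChannel dirChannel = FileChannel.open(dir, StandardOpenOption.READ)) {
            dirChannel.force(true);
        } catch (IOException e) {
            // best effort: directory fsync may fail e.g. on Windows
        }
    }
}
```

A crash before the `Files.move` leaves only a `temp-` blob behind, which is exactly the "leaking temp- blobs instead of non-temp named blobs" trade-off discussed below.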
Pinging @elastic/es-distributed
I think this now brings the risk of creating more garbage files at the top level, which are not cleaned up.
I don't think that's true. We are currently not cleaning up top-level files anyway; on broken snapshots or deletes we still leak top-level snap-{}.dat blobs etc. All this changes is that we'd be leaking temp- blobs instead of blobs with non-temp names. If anything, I'd say that makes it easier to run cleanup :) The number of blobs definitely does not go up in my opinion. PS: I will open a PR to clean up the top-level situation very shortly anyway :)
You're right.
On HDFS the cost will be higher. Each metadata request can take a few hundred milliseconds to be processed by a NameNode, so this might make snapshotting on HDFS slower. Another problem is that the write-once semantics of blobs will be harder to guarantee. The move operation on FS as well as HDFS does not guarantee the same semantics as CREATE_NEW. The Javadocs for Files.move say that […] This is also why we're doing a best-effort check in […] I'm a bit torn on this. On one hand I think removing the extra method is good. On the other hand, it gives us worse guarantees and slower backups on HDFS.
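The semantics gap mentioned above can be illustrated with a small sketch (hypothetical class and method names, not code from this PR): opening a file with CREATE_NEW gives write-once behaviour by failing when the target already exists, whereas a rename-based write on a POSIX filesystem atomically replaces an existing target, so the write-once guarantee has to be checked separately and only on a best-effort basis.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Illustrative contrast of CREATE_NEW vs rename-based writes. */
public final class WriteOnceSemantics {

    // Write-once: throws FileAlreadyExistsException if the blob already exists.
    public static void writeOnce(Path target, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(target,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(data));
        }
    }

    // Rename-based write: on POSIX filesystems an atomic move replaces an
    // existing target, so a pre-existing blob is silently overwritten
    // rather than rejected.
    public static void writeViaMove(Path temp, Path target, byte[] data) throws IOException {
        Files.write(temp, data);
        Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Running both against the same target shows the difference: the second `writeOnce` call fails, while `writeViaMove` quietly replaces the existing blob.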
Yeah, that's true, but in the end how relevant is this for overall performance in relative terms? We're at most doubling the number of metadata requests (not even that, because we already use atomic writes in a bunch of spots) while keeping everything else constant. Especially if we also get #42791 in (parallelizing snapshots at the file rather than the shard level), I think the effect will not be that relevant from the user's perspective. But even without it, I'd be surprised if metadata requests were the bulk of the snapshot cost at the moment, and if they aren't, the effect isn't that big.
See my other PR on this topic: #42886. I don't think we should work with write-once semantics any more, due to the difficulties they introduce when it comes to S3 (and some NFS implementations).
Closing this, as it will become irrelevant as a result of #46250 and follow-ups (since we'll use the cluster state for consistency of the repository contents), and we can simply remove the atomic-write methods once they become unused.