Skip to content

osd: EC Overwrites#11701

Merged
athanatos merged 41 commits intoceph:masterfrom
athanatos:wip-ec-partial-overwrites
Nov 17, 2016
Merged

osd: EC Overwrites#11701
athanatos merged 41 commits intoceph:masterfrom
athanatos:wip-ec-partial-overwrites

Conversation

@athanatos
Copy link
Contributor

It's passing enough tests now to be worth looking at. For the up-to-the-minute version, see https://github.com/athanatos/ceph/tree/wip-ec-cache.

You'll probably want to start with the updated docs to get an overview. Then, the series has two pieces:

  1. refactor everything to use the new and improved PGTransaction
  2. use the new structure to make overwrites work

@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch from 961dc50 to dd1c5a4 Compare October 31, 2016 17:58
@athanatos athanatos added this to the kraken milestone Oct 31, 2016
@athanatos
Copy link
Contributor Author

qa-suite: ceph/ceph-qa-suite#1227

@athanatos
Copy link
Contributor Author

restest this please

@tchaikov tchaikov self-assigned this Nov 1, 2016
@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch 2 times, most recently from 5134e35 to aa6a273 Compare November 1, 2016 18:10
@athanatos
Copy link
Contributor Author

@athanatos athanatos changed the title DNM: RFC: EC Overwrites EC Overwrites Nov 2, 2016
@athanatos athanatos changed the title EC Overwrites DNM: EC Overwrites Nov 2, 2016
@athanatos
Copy link
Contributor Author

Whoops, I seem to have left a bug in the snap trimmer, fixing.

@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch from aa6a273 to 375d287 Compare November 2, 2016 20:38
}
bool is_fresh_object() const {
return boost::get<Init::None>(&init_type) == nullptr;
}
Copy link
Member

@liewegas liewegas Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole section has me lost. I can't figure out how None, Create, Clone, and Rename are mutually exclusive. Unless None means the object (might) already exist. In which case is_none() vs is_delete() is confusing. What exactly does delete_first mean?

Copy link
Contributor Author

@athanatos athanatos Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_none() && is_delete() indicates that we are deleting an object which already exists and not recreating it. delete_first means that the transaction logically removes the object.

There are really 4 cases:

  1. We are modifying an existing object (is_none() && !is_delete()) -- rollforward (actually, with the current implementation, it'll still use the other rollback machinery as long as there aren't overwrites).
  2. We are removing and then recreating an object (!is_none() && is_delete()) -- stash, rollback
  3. We are removing an object (is_none() && is_delete()) -- stash, rollback
  4. We are creating an object (!is_none() && !is_delete()) -- create (no stash), rollback

Create, Clone, Rename are the three ways we can recreate it. It's confusing, but the ECBackend transaction planning needs this context to figure out how to perform the transaction.

Copy link
Member

@liewegas liewegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bunch of 'dnm's here

src/osd/PGLog.h Outdated
log.head = info.last_update;
log.reset_rollback_info_trimmed_to_riter();

// TODO SAM DNM: rollforward log entries will be problematic since the on-disk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dnm

src/osd/PG.cc Outdated
state_set(PG_STATE_ACTIVATING);
}
if (acting.size() >= pool.info.min_size) {
/* TODOSAM DNM: verify that this will apply before reads are accepted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dnm

@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch 2 times, most recently from 9448ad8 to e5b1d9c Compare November 3, 2016 19:40
if (requires_rollforward(projected_size, i.second)) {
plan.rollforward = true;
} else if (requires_inplace(projected_size, i.second)) {
assert(!requires_rollforward(projected_size, i.second));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant given if/else

src/osd/PGLog.h Outdated
void add(const pg_log_entry_t& e) {
void add(const pg_log_entry_t& e, bool applied = true) {
if (!applied) {
get_can_rollback_to() == head;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/==/=/?

@athanatos
Copy link
Contributor Author

Removed the DNMs and fixed a bug I introduced yesterday.

{
assert(is_active());
assert(is_primary());
if (is_clean() && snap_trimq.empty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!snap_trimq.empty()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah, just did that yesterday.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch 3 times, most recently from 25855e2 to cbf38f7 Compare November 4, 2016 00:02

This works as long as we only need to remember one "owner" for
each extent. To make this work, we'll need to leverage some
invariants guarranteed by higher layers:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guarranteed -> guaranteed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Also, rollforward on activate() and adjust read_log debugging to
account for non-rollforward entries.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
It was hard to reason about the validity of the IndexedLog internal
pointers and iterators during updates, so this patch cleans that up
a bunch.  It also moves responsibility for doing rollbacks into
PGBackend.  Finally, it adds support for the new log entry format.

Signed-off-by: Samuel Just <sjust@redhat.com>
Implements the rmw pipeline and integrates the cache.

HashInfo now maintains a projected size for use during the planning
phase of the pipeline.

(Doesn't build without subsequent patches, not worth stubbing out
the interfaces)

Signed-off-by: Samuel Just <sjust@redhat.com>
…eltas

The RMW pipeline means that we don't start committing an update
immediately, so we can't update the log syncronously with
submit_transaction.  Thus, in order to pipeline writes, PG/ReplicatedPG
will need to project last_update and abstain from updating info
directly (updating info.stats was the only offender).

Signed-off-by: Samuel Just <sjust@redhat.com>
…it order

Without this change, we might submit new log entries for marking objects
unfound in a way that causes replicas to process them out-of-order with
pending writes with lower version numbers.  That would be bad.  Instead,
add an interface to allow an arbitrary callback to be called after any
previously submitted transaction commit, but before any subsequently
submitted operations commit.

Signed-off-by: Samuel Just <sjust@redhat.com>
…ck call

If the read can be completed immediately, objects_read_async will call
the callback syncronously, which will result in ctx being cleaned up.
Clear pending_async_reads before the call.

Signed-off-by: Samuel Just <sjust@redhat.com>
…d pools

Signed-off-by: Samuel Just <sjust@redhat.com>
Previously, we used an empty transaction to indicate when we
were sending the op to a backfill peer which needs the logs,
but can't run the transaction.  I'd like to be able to send
and empty transaction for the rollforward side effect without
it causing the peer to think it missed a backfill op, so
instead, use an explicit flag.  Compatability is handled by
interpretting an old version encoding with an empty transaction
as having the backfill field filled.

Signed-off-by: Samuel Just <sjust@redhat.com>
With the PGBackend changes, it's not necessarily the case that
calling simple_opc_submit syncronously updates the SnapMapper.
Thus, we can't rely on being able to just ask the snap mapper
for the next object immediately (we could well loop on the same
one if ECBackend is flushing the pipeline).  Instead, update
SnapMapper and the SnapTrimmer to grab N at a time.

Additionally, we need to make sure we don't try this again until
all of the previously submitted repops are flushed (a good idea
anyway).  To that end, this patch also refactors the SnapTrimmer
machine to be fully explicit about why it's blocked so we can be
sure that we don't queue an async work item unless we really
want to.

Signed-off-by: Samuel Just <sjust@redhat.com>
Also, clean up some old ones.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
…ed log entries

Log entries don't get added to the log for ECBackend until reads are
done, yet we still want any other requests with the same id to wait.

ReplicatedPG::update_range should consider the projected log as well.

Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
Signed-off-by: Samuel Just <sjust@redhat.com>
@athanatos athanatos force-pushed the wip-ec-partial-overwrites branch from fe957f0 to 0cf383d Compare November 17, 2016 18:43
@athanatos athanatos merged commit 64cc94c into ceph:master Nov 17, 2016
@athanatos athanatos deleted the wip-ec-partial-overwrites branch November 17, 2016 18:54
yuriw pushed a commit to yuriw/ceph that referenced this pull request Apr 17, 2024
Back in PR ceph#11701 when EC Overwrites were added, there was significant churn in the ECTransaction code.  Several asserts were added in generate_transactions, but the code changed several times and it appears some of those asserts don't make sense in the final version of the PR.

This PR removes two of those asserts.  The first doesn't appear to be necessary given how the surrounding conditional block changed.   The second appears to be an incorrect assert that was left over from an earlier commit that no longer should be in place given the logic that was added to handle truncate (which roughly mirrors what we do in ReplicatedBackend).

It's possible the assert for obc and entry are also unnecessary, however I left those in place for now.

Fixes: https://tracker.ceph.com/issues/65509

Signed-off-by: Mark Nelson <mark.nelson@clyso.com>
k0ste pushed a commit to k0ste/ceph that referenced this pull request Aug 10, 2024
Back in PR ceph#11701 when EC Overwrites were added, there was significant churn in the ECTransaction code.  Several asserts were added in generate_transactions, but the code changed several times and it appears some of those asserts don't make sense in the final version of the PR.

This PR removes two of those asserts.  The first doesn't appear to be necessary given how the surrounding conditional block changed.   The second appears to be an incorrect assert that was left over from an earlier commit that no longer should be in place given the logic that was added to handle truncate (which roughly mirrors what we do in ReplicatedBackend).

It's possible the assert for obc and entry are also unnecessary, however I left those in place for now.

Fixes: https://tracker.ceph.com/issues/65509

Signed-off-by: Mark Nelson <mark.nelson@clyso.com>
(cherry picked from commit c6eb35b)
k0ste pushed a commit to k0ste/ceph that referenced this pull request Aug 10, 2024
Back in PR ceph#11701 when EC Overwrites were added, there was significant churn in the ECTransaction code.  Several asserts were added in generate_transactions, but the code changed several times and it appears some of those asserts don't make sense in the final version of the PR.

This PR removes two of those asserts.  The first doesn't appear to be necessary given how the surrounding conditional block changed.   The second appears to be an incorrect assert that was left over from an earlier commit that no longer should be in place given the logic that was added to handle truncate (which roughly mirrors what we do in ReplicatedBackend).

It's possible the assert for obc and entry are also unnecessary, however I left those in place for now.

Fixes: https://tracker.ceph.com/issues/65509

Signed-off-by: Mark Nelson <mark.nelson@clyso.com>
(cherry picked from commit c6eb35b)
k0ste pushed a commit to k0ste/ceph that referenced this pull request Aug 10, 2024
Back in PR ceph#11701 when EC Overwrites were added, there was significant churn in the ECTransaction code.  Several asserts were added in generate_transactions, but the code changed several times and it appears some of those asserts don't make sense in the final version of the PR.

This PR removes two of those asserts.  The first doesn't appear to be necessary given how the surrounding conditional block changed.   The second appears to be an incorrect assert that was left over from an earlier commit that no longer should be in place given the logic that was added to handle truncate (which roughly mirrors what we do in ReplicatedBackend).

It's possible the assert for obc and entry are also unnecessary, however I left those in place for now.

Fixes: https://tracker.ceph.com/issues/65509

Signed-off-by: Mark Nelson <mark.nelson@clyso.com>
(cherry picked from commit c6eb35b)
kshtsk pushed a commit to kshtsk/ceph that referenced this pull request Aug 14, 2024
Back in PR ceph#11701 when EC Overwrites were added, there was significant churn in the ECTransaction code.  Several asserts were added in generate_transactions, but the code changed several times and it appears some of those asserts don't make sense in the final version of the PR.

This PR removes two of those asserts.  The first doesn't appear to be necessary given how the surrounding conditional block changed.   The second appears to be an incorrect assert that was left over from an earlier commit that no longer should be in place given the logic that was added to handle truncate (which roughly mirrors what we do in ReplicatedBackend).

It's possible the assert for obc and entry are also unnecessary, however I left those in place for now.

Fixes: https://tracker.ceph.com/issues/65509

Signed-off-by: Mark Nelson <mark.nelson@clyso.com>
(cherry picked from commit c6eb35b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants