Skip to content

rgw/cloudtier: initial MVP of the feature RestoreObject from cloud#59311

Merged
soumyakoduri merged 3 commits intoceph:mainfrom
soumyakoduri:wip-skoduri-cloud-restore
Oct 3, 2024
Merged

rgw/cloudtier: initial MVP of the feature RestoreObject from cloud#59311
soumyakoduri merged 3 commits intoceph:mainfrom
soumyakoduri:wip-skoduri-cloud-restore

Conversation

@soumyakoduri
Copy link
Copy Markdown
Contributor

@soumyakoduri soumyakoduri commented Aug 19, 2024

This PR covers initial PoC for the feature to restore objects from the cloud-tier on demand. Details below -

Feature doc: https://docs.google.com/document/d/1rzLJAzHK6cLuzJswgoplgugNOFCKi8NrjPOGfgVrIFg/edit
Updated to - src/doc/rgw/cloud-restore.md in this PR
Fixes: https://tracker.ceph.com/issues/67679

Restore CLI parsing
Added restore cli parameters and xml parsing for restore-object command
Current commit has basic cli checks with their corresponding response, cli parameters like Days.

Response elements:
For first and repetitive request 202 Accepted will be corresponding response code.
For CloudRestored status 200 OK will be corresponding response code.
For conflicting requests 409 Conflict corresponding response code.

TODO:

  • Improve response messages and error codes.
  • Fix empty x-amz-restore header in the response
  • For temporary restored objects, if the request is retried, update expiry_date

GET/RESTORE : calling cloud restore APIs to fetch the object from the remote endpoint.
What are all supported :
It allows read-through for cloud-tiered objects
New tier config options user need to set allow_read_through to true and read_through_restore_days more than 1 for this feature to work, also objects with retain_head_object will be available for this feature.
First get request will fail with restoring in progress error, objects are downloaded asynchronously.
The objects restored are temporary.

TODO :

  • implement call backs
  • HEAD should return restore-status in the x-amz-restore header

CloudTier module:

  • Given <bucket,object,cloudtier-storage-class> & optional<days>, fetch the object from the cloud endpoint and copy it to STANDARD storage-class
  • if <days> provided, and
    a) > 0, the restore is marked temporary with expiry date, but
    b) = 0, restore is not done and an error is returned
  • Without <days>, it is considered as permanent restore.
  • For temporary restore, return original cloudtier storage-class for x-amz-storage-class .
  • New attrs are added to store RestoreStatus and RestoreType
  • Use Object Expirer to delete the temporarily restored objects.
  • For temporarily restored objects, set delete_at attr to the expiration time.
    This will add those objects to ObjectExpirer list. Use LC worker thread to scan that list and delete expired objects. By delete here, it means to delete restored object data and reset HEAD object as Cloud-transitioned object as it was before restore.

In addition below changes are done -

  • If temporary, object is still marked RGWObj::CloudTiered and mtime is set same as
    transition time.
  • If permanent, object is marked RGWObj::Main and mtime is set to restore time (now()).
  • rgw_restore_debug_interval option added to set configure restore Days (similar to rgw_lc_debug_interval)

TODO:

  • Recheck on s3cmd attrs and any other attr related changes

Not addressed in this PR:

  • Replication
  • versioned objects
  • Compression/encryption
  • RGW Service restarts
  • Bulk Restore
  • Unit-tests

Signed-off-by: Shreyansh Sancheti ssancheti@ibm.com
Signed-off-by: Jiffin Tony Thottan thottanjiffin@gmail.com
Signed-off-by: Soumya Koduri skoduri@redhat.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

Copy link
Copy Markdown
Contributor

@dang dang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! A few tiny nits.

Comment thread src/rgw/driver/rados/rgw_rados.cc Outdated
Comment thread src/rgw/driver/rados/rgw_sal_rados.cc Outdated
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

Looks great! A few tiny nits.

Thanks Dan. Addressed your comments and also fixed a bug in ObjectExpirer code.

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check arm64

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch from 939b59e to 78c529a Compare August 26, 2024 06:10
thotz
thotz previously requested changes Aug 27, 2024
Comment thread src/doc/rgw/cloud-restore.md
Comment thread src/doc/rgw/cloud-restore.md Outdated
// Read ACLOwner maybe ?
}

ldpp_dout(tier_ctx.dpp, 20) << __func__ << "(): Sucessfully fetched object from cloud bucket:" << dest_bucket << ", object: " << target_obj_name << dendl;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ldpp_dout(tier_ctx.dpp, 20) << __func__ << "(): Sucessfully fetched object from cloud bucket:" << dest_bucket << ", object: " << target_obj_name << dendl;
ldpp_dout(tier_ctx.dpp, 20) << __func__ << "(): Successfully fetched object from cloud bucket:" << dest_bucket << ", object: " << target_obj_name << dendl;

Comment thread src/rgw/driver/rados/rgw_rados.cc
Comment thread src/rgw/driver/rados/rgw_rados.cc Outdated
ret = processor.complete(accounted_size, etag, &mtime, set_mtime,
attrs, rgw::cksum::no_cksum, delete_at , nullptr, nullptr, nullptr,
(rgw_zone_set *)&zone_set, &canceled, rctx, log_op ? rgw::sal::FLAG_LOG_OP : 0);
if (ret < 0) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Comment thread src/rgw/rgw_op.cc
ldpp_dout(dpp, 5) << "restore is not completed yet, please retry again" << dendl;
s->err.message = "restore is not completed yet";
}
return op_ret;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return op_ret;

Comment thread src/rgw/rgw_op.cc
Comment on lines +1053 to +1060
} else { // CloudRestored..return success
return 0;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else { // CloudRestored..return success
return 0;
}
return op_ret;

Comment thread src/rgw/rgw_op.cc
} else {
ldpp_dout(this, 20) << " manifest not found" << dendl;
}
ldpp_dout(this, 20) << "completed restore" << dendl;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the errors are alredy handled in the handle_cloud_object_tier, it will reductant IMO

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess only handle_cloudtier is needed there then.

Comment thread src/rgw/rgw_lc.cc
if (ret < 0) {
ldpp_dout(this, 5) << "RGWLC::process_expire_objects: failed, "
<< " worker ix: " << worker->ix << dendl;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return ret;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be failing LC process for any failure with expiring temp objects here.

Comment thread src/rgw/rgw_rest_s3.cc

if (attr_iter == attrs.end() || restore_status != rgw::sal::RGWRestoreStatus::None) {
s->err.http_ret = 202; //Accepted
dump_header(s, "x-amz-restore", rgw_bl_str(restore_status));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here the header can be empty right, is it right to provide empty values for header?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html#API_HeadObject_ResponseElements

x-amz-restore
If the object is an archived object (an object whose storage class is GLACIER), the response includes this header if either the archive restoration is in progress (see RestoreObject or an archive copy is already restored.

If an archive copy is already restored, the header value indicates when Amazon S3 is scheduled to delete the object copy. For example:

x-amz-restore: ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"

If the object restoration is in progress, the header returns the value ongoing-request="true".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expiry-date should be set for only temporary copies.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay got it, so in case of read_through at the moment it will be temporary objects only.

}

int RadosObject::restore_obj_from_cloud(Bucket* bucket,
rgw::sal::PlacementTier* tier,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yesterday while fixing tier config issue, I was wondering whether we need both tierconf and Placementtier is required for this function

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right..I was wondering the same.. now that tier_config is updated with latest Zonegroup tier info, we can avoid this second argument.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Sep 9, 2024

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch 3 times, most recently from d162331 to 8852379 Compare September 22, 2024 07:02
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test windows

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check arm64

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test windows

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

@mattbenjamin @dang @cbodley .. all the review comments so far have been addressed and code is rebased to latest main.
teuthology results have passed - http://pulpito.front.sepia.ceph.com/soumyakoduri-2024-09-22_06:37:27-rgw-wip-skoduri-cloud-restore-distro-default-smithi

Please review and let us know if this PR can be merged. Thanks!

@soumyakoduri soumyakoduri requested a review from thotz September 22, 2024 18:14
@soumyakoduri soumyakoduri dismissed thotz’s stale review September 22, 2024 18:15

addressed in latest code.

@mattbenjamin
Copy link
Copy Markdown
Contributor

could we maybe squash some of these, keeping summarizing the squashed commits in the commit mesg?

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch from 8852379 to 54873d9 Compare September 25, 2024 04:39
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test api

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

could we maybe squash some of these, keeping summarizing the squashed commits in the commit mesg?

Done. Please review!

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch 2 times, most recently from a9707f6 to acdf239 Compare September 27, 2024 15:47
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test api

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jekins test make check arm64

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

teuthology results: http://pulpito.front.sepia.ceph.com/soumyakoduri-2024-09-27_07:57:10-rgw-wip-skoduri-cloud-restore-distro-default-smithi

One test (rgw:multisite) failed which seems unrelated to this PR. I see that test fail without these changes too.

@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch from acdf239 to a1a0ead Compare September 30, 2024 05:58
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins retest this please

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test api

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check arm64

1 similar comment
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check arm64

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

@mattbenjamin , do you have any further comments on this PR? If not can we merge this PR? There are couple of other WIP issues to be reviewed & fixed which are blocked and waiting for this PR to be merged. If there are n't any major comments, we can open another PR to address them if needed.

Copy link
Copy Markdown
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this PR ready for testing by QE? found cosmetic spacing stuff in esp. first commit in the series, but this is not critical

Comment thread src/rgw/driver/daos/rgw_sal_daos.h Outdated
Comment thread src/rgw/driver/posix/rgw_sal_posix.h Outdated
Comment thread src/rgw/driver/rados/rgw_sal_rados.cc Outdated
@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch from a1a0ead to ca665a4 Compare October 3, 2024 02:34
@soumyakoduri
Copy link
Copy Markdown
Contributor Author

soumyakoduri commented Oct 3, 2024

is this PR ready for testing by QE? found cosmetic spacing stuff in esp. first commit in the series, but this is not critical

yes and its already tested by QE in the downstream.

soumyakoduri and others added 3 commits October 3, 2024 13:58
1)Add functionality to restore cloud-transitioned objects on demand.

Current commit has below -
* Given <bucket,object>, fetch the object from the cloud endpoint.
* if days provided and > 0, the restore is marked temporary with expiry date.
* Without <days>, it is marked as permanent restore.

2)Use ObjectExpirer/delete_at attr to delete temp objects

For temporarily restored objects, set delete_at attr to the expiration time.
This will add those objects to ObjectExpirer list. Use LC worker thread to
scan that list and delete expired objects. By delete here, it means to delete
restored object data and reset HEAD object as Cloud-transitioned object as it
was before restore.

In addition below changes are done -
* If temporary, object is still marked RGWObj::CloudTiered and mtime is set same as
transition time.
* If permanent, object is marked RGWObj::Main and mtime is set to restore time (now()).
* rgw_restore_debug_interval option added to set configure restore Days (similar to rgw_lc_debug_interval)

There is an issue with ObjectExpirer code where in if an object is added
to ObjectExpirer list and is re-written, it is not deleted from the expirer list
and hence the new object may get deleted. Fixed the same and also addressed
minor review comments.

3)Design doc added

4) ObjCategory should be set to CloudTiered only for cloud-transitioned
objects and temporarily restored objects. Permanent copies are to be
treated as regular objects.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
What are all supported :

* It allows read-through for cloud-tiered objects via restore_obj_from_cloud
* New tier config options user need to set allow_read_through to true and
  read_through_restore_days more than 1 for this feature to work, also
  objects with retain_head_object will be available for this feature.
* First get request will fail with restoring in progress error, objects
  are downloaded asynchronously.
* The objects restore are temporary.
* Tested `aws s3api get-object`, `aws s3api head-object` and `aws s3 cp`

In addition send timeout errors for first readthrough request

Also addressed lint warning and other cleanup(review comments)

Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
* For first and repititive request 202 Accepted will be corresponding response code.
* For CloudRestored status 200 OK will be corresponding response code.
* For conflicting requests 409 Conflict corresponding response code.

Also Fixed storage class update while listing objects.

Earlier while restoring object temporarily list-objects (s3api) and
radosgw-admin bucket list didn't have updated storage class. With this
fixed it now has the cloudtier storage class.

Signed-off-by: shreyanshjain7174 <ssanchet@redhat.com>
@soumyakoduri soumyakoduri force-pushed the wip-skoduri-cloud-restore branch from ca665a4 to 1b66bc5 Compare October 3, 2024 08:29
Copy link
Copy Markdown
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much appreciated cleanup!

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check

@soumyakoduri
Copy link
Copy Markdown
Contributor Author

jenkins test make check arm64

@soumyakoduri soumyakoduri merged commit d53ffb1 into ceph:main Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants