cephfs: implement snapdiff via fake .snap subfolder [RFC]#42517
ghost wants to merge 2 commits into ceph:master
@denisb-croit I'll start taking a look at this soon (starting sometime next week). This would be extremely useful for cephfs-mirror. BTW, @mchangir is working on fixing recursive timestamps in CephFS for use with the mirror daemon.
Leaving my initial thoughts since I'm still going through the changes. @denisb-croit Did you consider having a |
I agree that the 'tilde' logic looks a bit ugly. But it provides a major benefit: the ability to use regular file management tools to access the diff.
The immediate use of snap-diff would be the cephfs mirror daemon, which currently has to walk the entire directory tree to figure out changes (based on [mc]time). Snap-diff would be immensely useful to cut down the walk by only traversing those directories that have some update underneath them. Thereby, my proposal for having something like Listing entries via fake subdir is handy for regular fs tools, however, having a nice clean interface via libcephfs would be immensely useful for applications.
Right, though in our case we would prefer to use regular file-system access to SnapDiff. The backend implementation will be refactored to prepare results in the readdir_diff API format and to expose that API.
Yep -- that's where I am coming from.
ACK.
@denisb-croit Would be good to have tests too (whenever you plan to update next).
Never mind -- didn't realize you already had a commit for the test.
vshankar left a comment
@denisb-croit I'll continue to play with this. Feel free to update the PR with our discussion (readdir_diff, etc..).
src/mds/SnapRealm.cc
Outdated
| string pname; | ||
| inodeno_t pino; | ||
| if (n.length() && n[0] == '_') { | ||
| char first_char = n.length() ? n[0] : 0; |
src/mds/SnapRealm.cc
Outdated
| } | ||
|
|
||
| snapid_t SnapRealm::resolve_snapname(std::string_view n, inodeno_t atino, snapid_t first, snapid_t last) | ||
| std::tuple<snapid_t, bool, snapid_t> SnapRealm::resolve_snapname( |
How about returning something like a struct SnapRealInfo?
.. and BTW, you could parse the special diff keyword snapshot name and invoke ->resolve_snapname() with the respective snapshot name(s). That way, the actual parse logic moves out of SnapRealm to the caller (during path traversal and/or readdir_diff).
Good point!
Refactored this part..
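A minimal sketch of the suggestion above, assuming the query syntax from the PR description (`.~diff=<snapA>.~diff=<snapB>`). The names `SnapDiffQuery` and `parse_snapdiff_name` are hypothetical and not taken from the PR; the idea is only that the caller splits the special name and then invokes `resolve_snapname()` once per plain snapshot name. The sketch also assumes that snapshot names never contain the `.~diff=` marker themselves.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: the two snapshot names found in a query.
struct SnapDiffQuery {
  std::string first;
  std::string second;
};

// Parse ".~diff=<a>.~diff=<b>". Returns std::nullopt for any other name,
// so ordinary snapshot names can take the regular lookup path.
std::optional<SnapDiffQuery> parse_snapdiff_name(std::string_view n) {
  constexpr std::string_view marker = ".~diff=";
  if (n.substr(0, marker.size()) != marker) {
    return std::nullopt;  // does not start with the marker
  }
  n.remove_prefix(marker.size());
  const auto pos = n.find(marker);
  if (pos == std::string_view::npos) {
    return std::nullopt;  // only one snapshot name given
  }
  const std::string_view a = n.substr(0, pos);
  const std::string_view b = n.substr(pos + marker.size());
  if (a.empty() || b.empty() || b.find(marker) != std::string_view::npos) {
    return std::nullopt;  // empty name or more than two names
  }
  assert(!a.empty() && !b.empty());
  return SnapDiffQuery{std::string(a), std::string(b)};
}
```

With parsing kept in the caller like this, `SnapRealm` only ever sees plain snapshot names.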
| int r = 0; | ||
| C_SaferCond onfinish("Client::_read_async flock"); | ||
| r = objectcacher->file_read(&in->oset, &in->layout, in->snapid, | ||
| r = objectcacher->file_read(&in->oset, &in->layout, snapid, |
I'm not 100% clear about the usage of listing file contents for these synthetic snap ids? What is the use-case for this?
One can read the content of the file's snapshot listed by a snapdiff request.
Hence, e.g., for new/updated files, backup software can easily get the new content without reading through the regular snapshot dir structure.
For deleted entries that's rather overkill, but it's provided for the sake of uniformity.
Hmm.. Although that's a crafty technique, I'm a bit wary of it. @batrick what do you think?
| std::swap(s1, s2); | ||
| } | ||
| snapid_t res = (s1 & CEPH_SNAPDIFF_ID_MASK) << CEPH_SNAPDIFF_ID_BITS; | ||
| res = res | (s2 & CEPH_SNAPDIFF_ID_MASK); |
Would generating the synthetic snap-id this way be safe (for the mds and/or client)? Could we run into some sort of "id collision" between this (synthetic) and a real snap-id?
Theoretically this can cause such a collision once 2^32 snapshots have been created.
In practice it would take ~136 years to make that many snapshots at a pace of 1 snapshot per second.
Hence this looks safe enough.
On the other hand, it looks like alternative approaches would require dramatic modifications to the code which are not worth the effort IMO.
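To make the collision argument concrete, here is a rough sketch of the packing shown in the diff. The real values of `CEPH_SNAPDIFF_ID_BITS` and `CEPH_SNAPDIFF_ID_MASK` are not visible in the quoted snippet, so the 32-bit split below is an assumption (it matches the 2^32 figure mentioned above), as are all function names and the premise that real snapshot ids are at least 1.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Assumed layout: two 32-bit halves packed into one 64-bit id.
// These constants are illustrative, not the values from the PR.
constexpr unsigned kSnapDiffIdBits = 32;
constexpr std::uint64_t kSnapDiffIdMask =
    (std::uint64_t{1} << kSnapDiffIdBits) - 1;

// Pack two snapshot ids into one synthetic id, smaller id in the high half.
std::uint64_t make_snapdiff_id(std::uint64_t s1, std::uint64_t s2) {
  if (s1 > s2) {
    std::swap(s1, s2);  // argument order must not matter
  }
  assert(s1 <= s2);
  std::uint64_t res = (s1 & kSnapDiffIdMask) << kSnapDiffIdBits;
  res = res | (s2 & kSnapDiffIdMask);
  return res;
}

// Recover the (older, newer) pair from a synthetic id.
std::pair<std::uint64_t, std::uint64_t> split_snapdiff_id(std::uint64_t id) {
  return {id >> kSnapDiffIdBits, id & kSnapDiffIdMask};
}

// A real snapshot id can only be mistaken for a synthetic one once it no
// longer fits in the low half, i.e. after 2^32 ids have been handed out
// (assuming every real id is >= 1, so every synthetic id is >= 2^32).
bool is_unambiguous_real_snapid(std::uint64_t snapid) {
  return snapid <= kSnapDiffIdMask;
}

// Years needed to create 2^32 snapshots at a given steady rate.
double years_until_collision(double snaps_per_second) {
  const double seconds_per_year = 365.25 * 24 * 3600;
  return 4294967296.0 / snaps_per_second / seconds_per_year;
}
```

At one snapshot per second, `years_until_collision(1.0)` comes out slightly above 136, in line with the estimate in the reply.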
src/client/Client.cc
Outdated
| int Client::fill_stat(Inode *in, struct stat *st, frag_info_t *dirstat, nest_info_t *rstat) | ||
| { | ||
| ldout(cct, 10) << __func__ << " on " << in->ino << " snap/dev" << in->snapid | ||
| ldout(cct, 10) << __func__ << " on " << in->ino << " snap/dev " << in->snapid |
| f->close_section(); | ||
| } | ||
|
|
||
| void Server::_readdir_diff( |
This is a large routine and should be split into smaller callable parts.
Will do that when implementing explicit readdir_snapdiff API
Did you miss updating this part in the latest push?
fc26056 to
5a8ec2c
Compare
@denisb-croit Thanks for the update. I'll take a look sometime this week.
This patch makes it possible to obtain the file delta between snapshots (aka Snap Diff) by reading fake 'snapdiff-query-formatted' subfolders under the .snap directory.
Snapdiff subfolders are not visible when reading the .snap folder; one has to build and issue such a "query" manually.
The resulting output (a directory listing) contains just the entries which have been altered (created/updated/removed) in the final snapshot since the initial one. New/updated entries are presented as regular files; the names of removed ones are prefixed with a tilde '~'.
E.g. to compare snapshots named snap1 and snap2 one can issue:
> ls -l /mnt/mycephfs/dir0/.snap/.~diff=snap1.~diff=snap2
which would return something like this:
total 8
-rw-r--r-- 1 root root 3 Jul 19 16:40 b
-rw-r--r-- 1 root root 3 Jul 19 16:40 ~c
drwxr-xr-x 0 root root 0 Jul 19 16:40 ~C
-rw-r--r-- 1 root root 3 Jul 19 16:40 d
-rw-r--r-- 1 root root 3 Jul 19 16:40 f
-rw-r--r-- 1 root root 3 Jul 19 16:40 ~g
drwxr-xr-x 0 root root 0 Jul 19 16:40 ~G
drwxr-xr-x 0 root root 0 Jul 19 16:40 I
-rw-r--r-- 1 root root 3 Jul 19 16:40 k
drwxr-xr-x 0 root root 0 Jul 19 16:40 K
-rw-r--r-- 1 root root 3 Jul 19 16:40 l
drwxr-xr-x 4 root root 12 Jul 19 16:41 L
drwxr-xr-x 2 root root 6 Jul 19 16:40 S
drwxr-xr-x 2 root root 3 Jul 19 16:40 T
or
> ls -l /mnt/mycephfs/dir0/.snap/.~diff=snap1.~diff=snap3
total 7.5K
-rw-r--r-- 1 root root 3 Jul 19 16:40 a
-rw-r--r-- 1 root root 3 Jul 19 16:40 b
-rw-r--r-- 1 root root 3 Jul 19 16:40 ~c
drwxr-xr-x 0 root root 0 Jul 19 16:40 ~C
-rw-r--r-- 1 root root 3 Jul 19 16:40 d
-rw-r--r-- 1 root root 3 Jul 19 16:40 ~f
-rw-r--r-- 1 root root 3 Jul 19 16:40 g
drwxr-xr-x 0 root root 0 Jul 19 16:40 ~G
drwxr-xr-x 0 root root 0 Jul 19 16:41 G
drwxr-xr-x 2 root root 3 Jul 19 16:40 H
drwxr-xr-x 0 root root 0 Jul 19 16:40 I
-rw-r--r-- 1 root root 3 Jul 19 16:40 l
drwxr-xr-x 4 root root 12 Jul 19 16:41 L
drwxr-xr-x 2 root root 6 Jul 19 16:40 S
drwxr-xr-x 2 root root 3 Jul 19 16:40 T
then diving deeper into a subfolder might show:
> ls -l /mnt/mycephfs/dir0/.snap/.~diff=snap1.~diff=snap2/~C
total 1
drwxr-xr-x 0 root root 0 Jul 19 16:40 ~C1
-rw-r--r-- 1 root root 3 Jul 19 16:40 ~cc1
and so on and so forth:
> ls -l /mnt/mycephfs/dir0/.snap/.~diff=snap1.~diff=snap2/~C/~C1
total 1
-rw-r--r-- 1 root root 6 Jul 19 16:40 ~c2
File content reading is also available. It returns the full(!) file content from the target snapshot for new/updated files and the content from the initial snapshot for removed files.
E.g.
> less /mnt/mycephfs/dir0/.snap/.~diff=snap1.~diff=snap2/~C/~C1/~c2
snap1
The order of snapshot names in a snapdiff "query" isn't important: they're sorted according to their ids when processed.
Comparing a snapshot with live data isn't supported. Byte-level "deltas" are not supported.
Signed-off-by: Denis Barahtanov <denis.barahtanov@croit.io>
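As an illustration of how a tool might consume this interface, the sketch below builds the query path and interprets one listed name. Everything here is hypothetical helper code, not part of the patch: `snapdiff_dir`, `DiffEntry` and `classify_entry` are made-up names, and the snapshot directory is assumed to be called `.snap` as in the examples above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Build "<dir>/.snap/.~diff=<a>.~diff=<b>" as used in the examples above.
std::string snapdiff_dir(std::string_view dir, std::string_view snap_a,
                         std::string_view snap_b) {
  std::string p(dir);
  p += "/.snap/.~diff=";
  p += snap_a;
  p += ".~diff=";
  p += snap_b;
  return p;
}

// One entry of a snapdiff listing, with the '~' marker stripped.
struct DiffEntry {
  std::string name;
  bool removed;  // true if the entry was listed with a leading '~'
};

// Interpret a name as returned by readdir() on a snapdiff directory.
// Caveat: an entry whose real name starts with '~' cannot be told apart
// from a removed entry by name alone.
DiffEntry classify_entry(std::string_view listed) {
  assert(!listed.empty());
  if (listed.front() == '~') {
    return {std::string(listed.substr(1)), true};
  }
  return {std::string(listed), false};
}
```

For the first listing above, `classify_entry("~c")` would report `c` as removed, while `classify_entry("b")` would report `b` as created or updated.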
I could not complete the review this week. Sorry! Will finish it up next week.
@denisb-croit Did you miss pushing this change as part of the update or do you plan to do this as a follow-up?
The changes are provided in a different PR, #43328.
Nice. Will take a look and do some tests...
This has been superseded by #43546