dedupe: Skip extents with insufficient duplicates #365

Merged
JackSlateur merged 1 commit into markfasheh:master from ticpu:master on Feb 7, 2025
ticpu (Contributor) commented Feb 3, 2025

Deduplicating my Unity avatar library on bcachefs consistently ended in aborts like this one:

❯ gdb --args duperemove -dr --io-threads=1 --hashfile=".duperemove.hash" *
Gathering file list...
[New Thread 0x7ffff41ff6c0 (LWP 221298)]

process_extents: unable to get extent
file /mnt/bcachefs/windowsvm-share/3DModels/WickerBapper-v2.2/Library/ArtifactDB-lock changed
process_extents: unable to get extent
file /mnt/bcachefs/windowsvm-share/3DModels/WickerBapper-v2.2/Library/SourceAssetDB-lock changed
process_extents: unable to get extent
file /mnt/bcachefs/windowsvm-share/3DModels/WickerBapper-v2.2/Library/SourceAssetDB changed
process_extents: unable to get extent
file /mnt/bcachefs/windowsvm-share/3DModels/WickerBapper-v2.2/Library/ArtifactDB changed
    Files scanned: 4/4 (100.00%)
    Bytes scanned: 25182208/25182208 (100.00%)
    File listing: completed
[Thread 0x7ffff39fe6c0 (LWP 221299) exited]
Hashfile ".duperemove.hash" written
Loading only identical files from hashfile.
Simple read and compare of file data found 5 instances of files that might benefit from deduplication.
Showing 6 identical files of length 2488 with id 196d52ba
Start		Filename
0	"/mnt/bcachefs/windowsvm-share/3DModels/NovaBeastV3.3/Packages/com.poiyomi.toon/_PoiyomiShaders/Scripts/ThryEditor/Resources/thryEditor_link.png.meta"
...
[0x507000008ec0] Skipping - extents are already deduped.
ERROR: run_dedupe.c:287
[stack trace follows]
/usr/lib/libasan.so.8(___interceptor_backtrace+0xa4) [0x7ffff794ca14]
/usr/bin/duperemove(+0x41b54) [0x555555595b54]
/usr/bin/duperemove(+0x3dcda) [0x555555591cda]
/usr/bin/duperemove(+0x3f140) [0x555555593140]
/usr/bin/duperemove(+0x3f4c4) [0x5555555934c4]
/usr/lib/libglib-2.0.so.0(+0x95ca3) [0x7ffff77e9ca3]
/usr/lib/libglib-2.0.so.0(+0x92be6) [0x7ffff77e6be6]
/usr/lib/libasan.so.8(+0x5d10a) [0x7ffff790110a]
/usr/lib/libc.so.6(+0x942ce) [0x7ffff73c32ce]
/usr/lib/libc.so.6(+0x11929c) [0x7ffff744829c]

I added a safety check in push_extents() to stop these intermittent crashes.

This prevents assertion failures in dedupe_extent_list() when the duplicate count drops below 2 between the initial duplicate detection and worker thread processing. I still hit the problem on my setup even with a reduced I/O thread count (--io-threads=1 in the run above). It may be related to having many CPU cores (Ryzen 9 7950X3D, 32 threads).

I'm not sure I fully understand why the problem happens in the first place, so I'll leave that for you to review.

Note: Many of the lines above report that a file has changed, but these folders are archives; nothing in them should be changing.

ticpu commented Feb 3, 2025

A second crash for reference.

Start		Filename
135168	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
331776	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
462848	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
593920	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
856064	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
921600	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
1183744	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
1445888	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
1511424	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
1642496	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
1773568	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
2232320	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
2494464	"Library/Artifacts/fb/fb24a5a57808ecbe83fa1ac2ba092f15"
65536	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
217088	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
282624	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
544768	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
610304	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
741376	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
872448	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
1048576	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
1331200	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
1593344	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
1789952	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
1921024	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
2314240	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
2445312	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
2576384	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
2641920	"Library/Artifacts/91/91ca041887a797282b5e0254d5b4f81c"
Using 1 threads for dedupe phase
[0x507000008ec0] (01/47) Try to dedupe extents with id 07c4ed82
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (02/47) Try to dedupe extents with id 0b94f0ee
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (03/47) Try to dedupe extents with id 54f9a447
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (04/47) Try to dedupe extents with id 5fb7ffff
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (05/47) Try to dedupe extents with id 780f19d4
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (06/47) Try to dedupe extents with id 8045c799
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (07/47) Try to dedupe extents with id ba89c648
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (08/47) Try to dedupe extents with id 04ff94cc
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (09/47) Try to dedupe extents with id 0792d128
[0x507000008ec0] Skipping - extents are already deduped.
[0x507000008ec0] (10/47) Try to dedupe extents with id 376be89e
[0x507000008ec0] Skipping - extents are already deduped.
ERROR: run_dedupe.c:287
[stack trace follows]
/usr/lib/libasan.so.8(___interceptor_backtrace+0xa4) [0x7ffff794ca14]
/usr/bin/duperemove(+0x41b54) [0x555555595b54]
/usr/bin/duperemove(+0x3dcda) [0x555555591cda]
/usr/bin/duperemove(+0x3f140) [0x555555593140]
/usr/bin/duperemove(+0x3f4c4) [0x5555555934c4]
/usr/lib/libglib-2.0.so.0(+0x95ca3) [0x7ffff77e9ca3]
/usr/lib/libglib-2.0.so.0(+0x92be6) [0x7ffff77e6be6]
/usr/lib/libasan.so.8(+0x5d10a) [0x7ffff790110a]
/usr/lib/libc.so.6(+0x942ce) [0x7ffff73c32ce]
/usr/lib/libc.so.6(+0x11929c) [0x7ffff744829c]

Thread 4 "pool" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff1beb6c0 (LWP 221301)]
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	    return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ffff73c53a3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#2  0x00007ffff736c120 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff73534c3 in __GI_abort () at abort.c:79
#4  0x0000555555591ce6 in dedupe_extent_list (dext=0x5080000208a0, fiemap_bytes=0x7ffff0ab4ca0, kern_bytes=0x7ffff0ab4cc0, passno=11) at run_dedupe.c:287
#5  0x0000555555593140 in extent_dedupe_worker (dext=0x5080000208a0, fiemap_bytes=0x7ffff0ab4ca0, kern_bytes=0x7ffff0ab4cc0) at run_dedupe.c:494
#6  0x00005555555934c4 in dedupe_worker (priv=0x5080000208a0, counts=0x7ffff50003c0) at run_dedupe.c:537
#7  0x00007ffff77e9ca3 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/glib/gthreadpool.c:336
#8  0x00007ffff77e6be6 in g_thread_proxy (data=data@entry=0x507000008ec0) at ../glib/glib/gthread.c:892
#9  0x00007ffff790110a in asan_thread_start (arg=0x7ffff6b33000) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_interceptors.cpp:234
#10 0x00007ffff73c32ce in start_thread (arg=<optimized out>) at pthread_create.c:447
#11 0x00007ffff744829c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
(gdb) p *dext
$2 = {de_num_dupes = 1, de_len = 8192, de_hash = "X\002m\226\t4\242D\256\026\277\227\2318\313p", de_score = 8192, de_extents = {next = 0x508000020838, prev = 0x508000020838}, de_extents_root = {rb_node = 0x508000020848}, de_node = {__rb_parent_color = 88510686174945, 
    rb_right = 0x0, rb_left = 0x0}, de_mutex = {p = 0x0, i = {0, 0}}} 

ticpu commented Feb 3, 2025

Oh... that might be #341

Add safety check in push_extents() to prevent queuing work items
with less than 2 duplicates. This prevents assertion failures in
dedupe_extent_list() when the duplicate count drops below 2 between
the initial duplicate detection and worker thread processing.

Fixes: markfasheh#341

JackSlateur (Collaborator) commented

Thank you for your contribution

@JackSlateur JackSlateur merged commit e5cfa34 into markfasheh:master Feb 7, 2025