erasure-code: restore jerasure BlaumRoth default w #2556
1 commit merged into giant from unknown repository
Changing from W=7 to W=6 by default for the BlaumRoth technique is correct but introduces a regression. The content that was encoded with the previous version cannot be read again. Although the prime(w+1) constraint was not obeyed by W=7, the encoded content was useable and should keep being readable. The W=7 remains the default for backward compatibility and an exception to the prime(w+1) check. http://tracker.ceph.com/issues/9572 Fixes: #9572 Signed-off-by: Loic Dachary <loic-201408@dachary.org>
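The rule described in this commit message can be sketched as follows. This is a minimal illustration in Python, not the Ceph C++ code: the names `is_prime`, `LEGACY_DEFAULT_W` and `blaum_roth_w_is_accepted` are hypothetical and only mirror the prose (accept w when w+1 is prime, with w=7 kept as an exception).

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; sufficient for the small w values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Hypothetical constant: the previous default, kept so that content
# already encoded with it stays readable.
LEGACY_DEFAULT_W = 7


def blaum_roth_w_is_accepted(w: int) -> bool:
    """Accept w when w + 1 is prime, or when w is the legacy default."""
    if w == LEGACY_DEFAULT_W:
        return True  # exception to the prime(w+1) check
    return is_prime(w + 1)
```

With this sketch, w=6 passes because 7 is prime, w=7 passes only through the exception, and w=8 is rejected because 9 is not prime.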
|
@apeters1971 I overlooked backward compatibility in the previous fix enforcing BlaumRoth constraints. Does this backward compatibility fix look ok to you ? |
|
Hi Loic, so you are sure that encoding and decoding work with w=7? What was the Jerasure library doing in this case? Using another matrix? |
|
@apeters1971 I'm sure it works: content is encoded and decoded as one would expect. I'm not sure what the prime(w+1) constraint enforces but it does cause this problem. I'm also tempted to not overthink this one beyond the default because it is extremely unlikely that anyone is using it for real. And I would be amazed if anyone does use it with a custom w. Do you think I'm too lazy ? :-) |
|
@apeters1971 Sorry to insist : this is blocking http://tracker.ceph.com/issues/9420 but I would not want to commit something that you don't feel comfortable with. Thanks for your understanding :-) |
|
Hi Loic, I agree that I imagine nobody ever used that and you can certainly keep it like that for backward compatibility. But if there is a restriction on w+1 to be prime the reason for that is probably that it does not work if it is not prime. So you shouldn't have this as a default! |
|
@apeters1971 apparently it works for w=7 despite the fact that w+1=8 is not prime. I encoded / decoded random content, including recovering from erasure, successfully with w=7. But maybe I was just lucky and other content would have failed. In which case we should declare this a bug and add something in the release notes to say that all blaumroth encoded content is potentially corrupted. It's worth checking if that is a possibility though. |
for w in 7 11 13 17 19 ; do for k in $(seq 2 $w) ; do for m in $(seq 1 $k) ; do for erasures in $(seq 1 $m) ; do ./ceph_erasure_code_benchmark --plugin jerasure --workload decoded --iterations 1 --size 4096 --erasures $erasures --parameter w=$w --parameter k=$k --parameter m=2 --parameter technique=blaum_roth ; done ; done ; done ; done
all check out. Whatever the consequence of the unmatched constraint, it does not cause an encoding/decoding problem. |
|
Hi Loic, Cheers Andreas. From: Loic Dachary [notifications@github.com] for w in 7 11 13 17 19 ; do for k in $(seq 2 all check out. Whatever consequence the unmatched constraint, it does not cause an encoding/decoding problem. — |
|
@apeters1971 you are correct, my mistake. I'll change the benchmark tool to allow non random exploration, that will be convenient in the future. |
|
for w in 7 11 13 17 19 ; do for k in $(seq 2 claims all is well, using the --erasures-generation exhaustive implemented at https://github.com/dachary/ceph/commit/648b7bccc2cab91e7b12889ad60133263f118a82 |
|
It seems dangerous to leave the default as something that is not "supposed" to work. I would rather change the default, break compatibility, and put in an upgrade note about it. If we find that someone is using 7, can we simply ask them to put w=7 in their ec profile or something to make their pool continue to function? |
|
Ok. If Kevin Greenan finds out that using w=7 is harmless, we can leave it. If we're not sure it's probably better to add a note. The probability that someone is using this technique is extremely low anyway. |
|
@apeters1971 I've added a verbose output showing that the recursive implementation that retrieves all combinations of erasures actually works, as shown in http://tracker.ceph.com/issues/9572#note-4
@liewegas it turns out that w=7 is valid for all combinations of k. This is not proven in theory, but a brute force exploration of all erasure scenarios shows that it actually works. My understanding is that it is proven to work in theory for all w where w+1 is prime. That does not mean other values of w do not work, only that showing they work requires brute force exploration instead of a mathematical proof. |
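To illustrate what an exhaustive exploration of erasure scenarios looks like, here is a small Python sketch. It is not the benchmark tool's implementation: `erasure_patterns` is a hypothetical recursive generator, and a single XOR parity chunk (tolerating one erasure) stands in for Blaum-Roth, which is far more involved. The point is only the shape of the loop: enumerate every pattern, erase, decode, compare.

```python
from typing import Iterator, List, Optional, Tuple


def erasure_patterns(chunks: int, erasures: int, start: int = 0,
                     chosen: Tuple[int, ...] = ()) -> Iterator[Tuple[int, ...]]:
    """Recursively yield every way to lose `erasures` chunks out of `chunks`."""
    if len(chosen) == erasures:
        yield chosen
        return
    for i in range(start, chunks):
        yield from erasure_patterns(chunks, erasures, i + 1, chosen + (i,))


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for j, b in enumerate(chunk):
            out[j] ^= b
    return bytes(out)


def encode(data: List[bytes]) -> List[bytes]:
    """Toy stand-in code: k data chunks followed by one XOR parity chunk."""
    return data + [xor_chunks(data)]


def decode(chunks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None) from the others."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("toy XOR code tolerates a single erasure")
    repaired = list(chunks)
    if missing:
        present = [c for c in chunks if c is not None]
        repaired[missing[0]] = xor_chunks(present)
    return repaired


def exhaustive_check(k: int, max_erasures: int = 1) -> int:
    """Erase every possible pattern, decode, and count the patterns verified."""
    encoded = encode([bytes([i + 1] * 4) for i in range(k)])
    checked = 0
    for e in range(1, max_erasures + 1):
        for pattern in erasure_patterns(len(encoded), e):
            damaged = [None if i in pattern else c
                       for i, c in enumerate(encoded)]
            assert decode(damaged) == encoded
            checked += 1
    return checked
```

Unlike random sampling, this visits each pattern exactly once, so a passing run covers every erasure scenario for the chosen parameters rather than a lucky subset.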
|
@liewegas I think it is safe to leave w=7 as an exception to the rule. |
|
Sounds okay to me! On Mon, 29 Sep 2014, Loic Dachary wrote:
|
erasure-code: restore jerasure BlaumRoth default w Reviewed-by: Sage Weil <sage@redhat.com>
Changing from W=7 to W=6 by default for the BlaumRoth technique is
correct but introduces a regression. The content that was encoded with
the previous version cannot be read again. Although the prime(w+1)
constraint was not obeyed by W=7, the encoded content was useable and
should keep being readable.
The W=7 remains the default for backward compatibility and an exception
to the prime(w+1) check.
http://tracker.ceph.com/issues/9572 Fixes: #9572
Signed-off-by: Loic Dachary <loic-201408@dachary.org>