Skip to content

erasure-code: Set reed_sol_van to be the default technique for Jerasure if none is specified.#61674

Merged
NitzanMordhai merged 1 commit intoceph:mainfrom
connorfawcett:wip-ec-default-fixes
Feb 26, 2025
Merged

erasure-code: Set reed_sol_van to be the default technique for Jerasure if none is specified.#61674
NitzanMordhai merged 1 commit intoceph:mainfrom
connorfawcett:wip-ec-default-fixes

Conversation

@connorfawcett
Copy link
Contributor

This PR will set the default technique for Jerasure to be reed_sol_van if none is specified as per the documentation. This used to work for jerasure because the profile would inherit the options from the osd_pool_default_erasure_code_profile conf file setting which set the default plugin to jerasure, which meant that the erasure code profile was successfully created. When the conf file setting is changed to isal this stops working because it is invalid to create a jerasure profile without specifying a technique.

This PR also makes a small change to the erasure_code_plugin_exists function, this issue highlighted the function does not properly cases where the plugin exists but the command fails for some other reason e.g. missing technique parameter. I have added a special case for this, although it should no longer be exercised now the ErasureCodePluginJerasure.cc change.

Fixes: https://tracker.ceph.com/issues/69758
Fixes: https://tracker.ceph.com/issues/69762

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@connorfawcett connorfawcett requested a review from a team as a code owner February 6, 2025 15:05
@connorfawcett
Copy link
Contributor Author

connorfawcett commented Feb 10, 2025

Testing for ErasureCodePluginJerasure.cc:

ceph osd erasure-code-profile set TESTPROFILE plugin=jerasure technique= k= m=
Succeeds with no error

Also tested the ceph-helpers.sh function with a erasure-code-profile set technique= call and observed the following:

erasure_code_plugin_exists isa
Returns 0 and prints 'Error ENOENT: technique= is not a valid coding technique. Choose one of the following: reed_sol_van,cauchy'

erasure_code_plugin_exists jerasure
Returns 0, no error printed due to ErasureCodePluginJerasure.cc change

erasure_code_plugin_exists jbob
Returns 1

@bill-scales
Copy link
Contributor

Changes look good, but this unit test is failing because we have made the code match the documentation and default to reed_sol_van!

src/test/erasure-code/TestErasureCodePluginJerasure.cc

TEST(ErasureCodePlugin, factory)
{
  ErasureCodePluginRegistry &instance = ErasureCodePluginRegistry::instance();
  ErasureCodeProfile profile;
  {
    ErasureCodeInterfaceRef erasure_code;
    EXPECT_FALSE(erasure_code);
    EXPECT_EQ(-ENOENT, instance.factory("jerasure",
                                        g_conf().get_val<std::string>("erasure_code_dir"),
                                        profile,
                                        &erasure_code, &cerr));
    EXPECT_FALSE(erasure_code);
  }

We should fix the test so it provides an invalid technique for this first test adding something like profile["technique"] = "doesnotexist"; before calling factory. Should also probably move ErasureCodeProfile profile; inside the {

…re if none is specified.

Also make the erasure_code_plugin_exists function in ceph-helpers.sh differentiate
between a non-existent plugin and an initialization failure.

Fixes: https://tracker.ceph.com/issues/69758
Fixes: https://tracker.ceph.com/issues/69762

Signed-off-by: Connor Fawcett <connorfa@uk.ibm.com>
@connorfawcett
Copy link
Contributor Author

connorfawcett commented Feb 11, 2025

Hopefully will see green now.

[root@ceph build]# bin/unittest_erasure_code_plugin_jerasure
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from ErasureCodePlugin
[ RUN ] ErasureCodePlugin.factory
load: jerasure technique=doesnotexist is not a valid coding technique. Choose one of the following: reed_sol_van, reed_sol_r6_op, cauchy_orig, cauchy_good, liberation, blaum_roth, liber8tion[ OK ] ErasureCodePlugin.factory (6 ms)
[----------] 1 test from ErasureCodePlugin (6 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (6 ms total)
[ PASSED ] 1 test.

@ljflores
Copy link
Member

jenkins retest this please

@connorfawcett
Copy link
Contributor Author

jenkins test windows

@connorfawcett
Copy link
Contributor Author

@ljflores all the checks have passed on this one but I saw you've added the needs-qa label, shall I hold off on merging it?

@ljflores
Copy link
Member

ljflores commented Feb 17, 2025

Hey @connorfawcett yeah, the "needs-qa" tag means that it is in the queue to go into integration testing. We'll add it in the next batch.

I added you to the Slack channel where integration testing is coordinated. We often do integration testing in batches to conserve resources. However, you are also welcome to test your PR yourself. The main thing is that the PR should pass integration testing before being merged, whether it goes through the batched pipeline or your own.

Here's the doc page that summarizes how to schedule integration tests if you're curious: https://docs.ceph.com/en/latest/dev/developer_guide/testing_integration_tests/tests-integration-testing-teuthology-intro/

And here's a page describing the "batched" process and how we review/approve the results prior to merging: https://tracker.ceph.com/projects/rados/wiki/QA_ANALYSIS

@NitzanMordhai
Copy link
Contributor

@NitzanMordhai NitzanMordhai merged commit 8066068 into ceph:main Feb 26, 2025
13 checks passed
@connorfawcett connorfawcett deleted the wip-ec-default-fixes branch February 27, 2025 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants