Skip to content

qa/cephfs: lua to respect missing kernel in yaml#66092

Merged
kshtsk merged 1 commit intoceph:mainfrom
kshtsk:wip-fix-lua-kernel-check
Feb 10, 2026
Merged

qa/cephfs: lua to respect missing kernel in yaml#66092
kshtsk merged 1 commit intoceph:mainfrom
kshtsk:wip-fix-lua-kernel-check

Conversation

@kshtsk
Copy link
Contributor

@kshtsk kshtsk commented Oct 30, 2025

When teuthology-suite is called with '-k none' option, which is valid, there is no kernel record in job config created.

However at some test cases the lua premerge dies with exception:
KeyError: 'kernel'

and teuthology-suite quits unexpectedly without scheduling any jobs.

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

You must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.

@github-actions github-actions bot added the tests label Oct 30, 2025
@kshtsk kshtsk force-pushed the wip-fix-lua-kernel-check branch from 3f159c9 to acfb74a Compare October 30, 2025 12:06
@kshtsk kshtsk requested a review from batrick October 30, 2025 12:07
py_attrgetter(yaml.kernel).pop('koji_task', nil)
py_attrgetter(yaml.kernel).pop('rpm', nil)
py_attrgetter(yaml.kernel).pop('sha1', nil)
py_attrgetter(yaml.kernel).pop('tag', nil)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick Wonder if the kernel itself should be popped out from yaml too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I didn't do that because it broke the teuthology task, IIRC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick Wonder if the kernel itself should be popped out from yaml too?

Circling back: no. The reason is because the kernel config at the time this postmerge script runs will look something like (without -k):

with k-testing:

kernel:
  branch: distro
  ...
  client:
    branch: testing

or with k-stock:

kernel:
  branch: distro
  ...
  client:
    branch: distro

The point of this postmerge script is to avoid installing/updating the kernel on non-client nodes. It took me a while to remember this so please update the comment so it's clear in the future.

@kshtsk
Copy link
Contributor Author

kshtsk commented Oct 30, 2025

jenkins test docs

1 similar comment
@kshtsk
Copy link
Contributor Author

kshtsk commented Oct 30, 2025

jenkins test docs

@kshtsk
Copy link
Contributor Author

kshtsk commented Oct 30, 2025

the docs fails because of https://tracker.ceph.com/issues/73645

@kshtsk kshtsk requested a review from zmc October 30, 2025 22:59
Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When teuthology-suite is called with '-k none' option, which is valid, there is no kernel record in job config created.

... why is -k none valid? Should it not be "default" (unset) or some real branch?

py_attrgetter(yaml.kernel).pop('koji_task', nil)
py_attrgetter(yaml.kernel).pop('rpm', nil)
py_attrgetter(yaml.kernel).pop('sha1', nil)
py_attrgetter(yaml.kernel).pop('tag', nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I didn't do that because it broke the teuthology task, IIRC.

@kshtsk
Copy link
Contributor Author

kshtsk commented Oct 31, 2025

When teuthology-suite is called with '-k none' option, which is valid, there is no kernel record in job config created.

... why is -k none valid? Should it not be "default" (unset) or some real branch?

because by default -k is set to distro. So the none value is used to disable kernel task which is useless when used with stock cloud images, or the right version of kernel is preinstalled.

@batrick
Copy link
Member

batrick commented Oct 31, 2025

When teuthology-suite is called with '-k none' option, which is valid, there is no kernel record in job config created.

... why is -k none valid? Should it not be "default" (unset) or some real branch?

because by default -k is set to distro. So the none value is used to disable kernel task which is useless when used with stock cloud images, or the right version of kernel is preinstalled.

That's useful and in fact what this fragment does.

@kshtsk
Copy link
Contributor Author

kshtsk commented Nov 1, 2025

When teuthology-suite is called with '-k none' option, which is valid, there is no kernel record in job config created.

... why is -k none valid? Should it not be "default" (unset) or some real branch?

because by default -k is set to distro. So the none value is used to disable kernel task which is useless when used with stock cloud images, or the right version of kernel is preinstalled.

That's useful and in fact what this fragment does.

I meant, the kernel task is useless in that case, but disabling of it is useful of course :-D

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment that the new check is specifically for -k none.

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other premerge/postmerge scripts that need to be updated to be tolerant of yaml.kernel == nil.

# f() { teuthology-suite -t main --machine-type smithi --email pdonnell@redhat.com --suite fs -p 75 --force-priority --subset $((RANDOM % 1024))/1024 --ceph wip-pdonnell-testing-20251126.180742-debug --filter-out upgrade,ubuntu --filter k-testing -k none --limit 1 --verbose  ; } ; which teuthology-suite && ((sleep 0h && f) || (sleep 1h && f) || (sleep 1h && f) || (sleep 1h && f) || (sleep 90m && f))
...
2025-12-05 18:19:05,050.050 DEBUG:teuthology.suite.merge:skipping config fs/upgrade/mds_upgrade_sequence/{bluestore-bitmap conf/{client mds mgr mon osd} distro fail_fs/yes kernel overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/no 3-inline/yes 4-verify} 2-client/kclient 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}} due to postmerge filter
2025-12-05 18:19:05,050.050 DEBUG:teuthology.suite.merge:merging config workload/{0-distro begin/{0-install 1-cephadm 2-logrotate 3-kernel} clusters/1a11s-mds-1c-client-3node conf/{client mds mgr mon osd} mount/kclient/{base/{mount-syntax/{v1} mount overrides/{distro/testing/k-testing ms-die-on-skipped}} ms_mode/legacy wsync/no} objectstore-ec/bluestore-comp-ec-root omap_limit/10 overrides/{cephsqlite-timeout frag ignorelist_health ignorelist_wrongly_marked_down osd-asserts pg_health session_timeout} ranks/multi/{balancer/automatic export-check n/3 replication/default} standby-replay tasks/{0-fscrypt/{with-fscrypt-dummy} 0-subvolume/{with-namespace-isolated} 0-use-snaprealm/{subvol} 1-check-counter 2-scrub/yes 3-snaps/yes 4-flush/no 5-quiesce/no 6-workunit/suites/pjd}}
2025-12-05 18:19:05,052.052 DEBUG:teuthology.suite.merge:premerge script running:
log.debug("base kernel %s", base_config.kernel)
local kernel = base_config.kernel
if kernel.branch ~= "distro" then
  log.debug("overriding testing kernel with %s", kernel)
  yaml_fragment.kernel.client = kernel
end

2025-12-05 18:19:05,052.052 DEBUG:teuthology.suite.merge:base kernel None
Traceback (most recent call last):
  File "/cephfs/home/pdonnell/teuthology/virtualenv/bin/teuthology-suite", line 7, in <module>
    sys.exit(main())
  File "/cephfs/home/pdonnell/teuthology/scripts/suite.py", line 232, in main
    return teuthology.suite.main(args)
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/__init__.py", line 140, in main
    run.prepare_and_schedule()
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/run.py", line 467, in prepare_and_schedule
    num_jobs = self.schedule_suite()
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/run.py", line 636, in schedule_suite
    configs = list(config_merge(configs, **config_merge_kwargs))
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/merge.py", line 157, in config_merge
    if not script():
  File "lupa/lua54.pyx", line 946, in lupa.lua54._LuaObject.__call__
  File "lupa/lua54.pyx", line 1918, in lupa.lua54.call_lua
  File "lupa/lua54.pyx", line 1945, in lupa.lua54.execute_lua_call
  File "lupa/lua54.pyx", line 1826, in lupa.lua54.raise_lua_error
lupa.lua54.LuaError: [string "teuthology"]:3: attempt to index a nil value (local 'kernel')
stack traceback:
        [C]: in ?
^C

@kshtsk
Copy link
Contributor Author

kshtsk commented Dec 5, 2025

There are other premerge/postmerge scripts that need to be updated to be tolerant of yaml.kernel == nil.

# f() { teuthology-suite -t main --machine-type smithi --email pdonnell@redhat.com --suite fs -p 75 --force-priority --subset $((RANDOM % 1024))/1024 --ceph wip-pdonnell-testing-20251126.180742-debug --filter-out upgrade,ubuntu --filter k-testing -k none --limit 1 --verbose  ; } ; which teuthology-suite && ((sleep 0h && f) || (sleep 1h && f) || (sleep 1h && f) || (sleep 1h && f) || (sleep 90m && f))
...
2025-12-05 18:19:05,050.050 DEBUG:teuthology.suite.merge:skipping config fs/upgrade/mds_upgrade_sequence/{bluestore-bitmap conf/{client mds mgr mon osd} distro fail_fs/yes kernel overrides/{ignorelist_health ignorelist_upgrade ignorelist_wrongly_marked_down pg-warn pg_health syntax} roles tasks/{0-from/squid 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/no 3-inline/yes 4-verify} 2-client/kclient 3-upgrade-mgr-staggered 4-config-upgrade/{fail_fs} 5-upgrade-with-workload 6-verify}} due to postmerge filter
2025-12-05 18:19:05,050.050 DEBUG:teuthology.suite.merge:merging config workload/{0-distro begin/{0-install 1-cephadm 2-logrotate 3-kernel} clusters/1a11s-mds-1c-client-3node conf/{client mds mgr mon osd} mount/kclient/{base/{mount-syntax/{v1} mount overrides/{distro/testing/k-testing ms-die-on-skipped}} ms_mode/legacy wsync/no} objectstore-ec/bluestore-comp-ec-root omap_limit/10 overrides/{cephsqlite-timeout frag ignorelist_health ignorelist_wrongly_marked_down osd-asserts pg_health session_timeout} ranks/multi/{balancer/automatic export-check n/3 replication/default} standby-replay tasks/{0-fscrypt/{with-fscrypt-dummy} 0-subvolume/{with-namespace-isolated} 0-use-snaprealm/{subvol} 1-check-counter 2-scrub/yes 3-snaps/yes 4-flush/no 5-quiesce/no 6-workunit/suites/pjd}}
2025-12-05 18:19:05,052.052 DEBUG:teuthology.suite.merge:premerge script running:
log.debug("base kernel %s", base_config.kernel)
local kernel = base_config.kernel
if kernel.branch ~= "distro" then
  log.debug("overriding testing kernel with %s", kernel)
  yaml_fragment.kernel.client = kernel
end

2025-12-05 18:19:05,052.052 DEBUG:teuthology.suite.merge:base kernel None
Traceback (most recent call last):
  File "/cephfs/home/pdonnell/teuthology/virtualenv/bin/teuthology-suite", line 7, in <module>
    sys.exit(main())
  File "/cephfs/home/pdonnell/teuthology/scripts/suite.py", line 232, in main
    return teuthology.suite.main(args)
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/__init__.py", line 140, in main
    run.prepare_and_schedule()
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/run.py", line 467, in prepare_and_schedule
    num_jobs = self.schedule_suite()
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/run.py", line 636, in schedule_suite
    configs = list(config_merge(configs, **config_merge_kwargs))
  File "/cephfs/home/pdonnell/teuthology/teuthology/suite/merge.py", line 157, in config_merge
    if not script():
  File "lupa/lua54.pyx", line 946, in lupa.lua54._LuaObject.__call__
  File "lupa/lua54.pyx", line 1918, in lupa.lua54.call_lua
  File "lupa/lua54.pyx", line 1945, in lupa.lua54.execute_lua_call
  File "lupa/lua54.pyx", line 1826, in lupa.lua54.raise_lua_error
lupa.lua54.LuaError: [string "teuthology"]:3: attempt to index a nil value (local 'kernel')
stack traceback:
        [C]: in ?
^C

It is just a link to the same file, isn't it?

% ls -la fs/workload/begin/3-kernel.yaml 
lrwxr-xr-x  1 kyr  staff  30 Oct 20 14:10 fs/workload/begin/3-kernel.yaml -> .qa/cephfs/begin/3-kernel.yaml

@kshtsk
Copy link
Contributor Author

kshtsk commented Dec 5, 2025

Ah... it ./cephfs/mount/kclient/overrides/distro/testing/k-testing.yaml

@kshtsk
Copy link
Contributor Author

kshtsk commented Dec 5, 2025

@batrick would the following logic be correct in this case:

  premerge: |
    log.debug("base kernel %s", base_config.kernel)
    local kernel = base_config.kernel
    if kernel == nil then
      yaml_fragment.kernel = nil
    elseif kernel.branch ~= "distro" then
      log.debug("overriding testing kernel with %s", kernel)
      yaml_fragment.kernel.client = kernel
    end

@batrick
Copy link
Member

batrick commented Dec 6, 2025

@batrick would the following logic be correct in this case:

  premerge: |
    log.debug("base kernel %s", base_config.kernel)
    local kernel = base_config.kernel
    if kernel == nil then
      yaml_fragment.kernel = nil
    elseif kernel.branch ~= "distro" then
      log.debug("overriding testing kernel with %s", kernel)
      yaml_fragment.kernel.client = kernel
    end

Yes, I think so. Please ensure it can schedule before requesting a new review.

@kshtsk
Copy link
Contributor Author

kshtsk commented Dec 9, 2025

Tried this command:
teuthology-suite -v -t main --machine-type smithi --email kyr@clyso.com --suite fs -p 75 --force-priority --ceph main --suite-repo https://github.com/kshtsk/ceph --suite-branch wip-fix-lua-kernel-check --sha1 7085921d5477a7797f4912c5cf10d5046fb10d9e --filter-out upgrade,ubuntu --filter k-testing -k none --subset 1/1024 --seed 777 --limit 2
The run scheduled is https://pulpito.ceph.com/kyr-2025-12-09_13:47:36-fs-main-none-default-smithi/

And:
teuthology-suite -v -t main --machine-type smithi --email kyr@clyso.com --suite fs -p 75 --force-priority --ceph main --suite-repo https://github.com/kshtsk/ceph --suite-branch wip-fix-lua-kernel-check --sha1 7085921d5477a7797f4912c5cf10d5046fb10d9e --filter-out upgrade,ubuntu --filter k-testing --subset 1/1024 --seed 777 --limit 2
with run https://pulpito.ceph.com/kyr-2025-12-09_13:49:27-fs-main-distro-default-smithi/.

@kshtsk
Copy link
Contributor Author

kshtsk commented Jan 13, 2026

@batrick Can we merge this already or we still need some special testing?

@batrick
Copy link
Member

batrick commented Jan 13, 2026

@batrick Can we merge this already or we still need some special testing?

Needs qa'd by CephFS. cc @vshankar @rishabh-d-dave @mchangir

@vshankar
Copy link
Contributor

@batrick Can we merge this already or we still need some special testing?

I would at least like to see if the test jobs are chosen as expected.

@kshtsk
Copy link
Contributor Author

kshtsk commented Jan 14, 2026

@batrick Can we merge this already or we still need some special testing?

I would at least like to see if the test jobs are chosen as expected.

How can I help?

@vshankar
Copy link
Contributor

@batrick Can we merge this already or we still need some special testing?

I would at least like to see if the test jobs are chosen as expected.

How can I help?

I will get to testing this next week, but if you are willing to expedite, then scheduling a fs suite run (even with `--dry-run) and checking if the kclient jobs are picked up would suffice.

@kshtsk
Copy link
Contributor Author

kshtsk commented Jan 19, 2026

Okay, seems like running with -k none against main fs:thrash

Gives another KeyError:

  File "lupa/lua54.pyx", line 921, in lupa.lua54._LuaObject.__call__
  File "lupa/lua54.pyx", line 1886, in lupa.lua54.call_lua
  File "lupa/lua54.pyx", line 1912, in lupa.lua54.execute_lua_call
  File "lupa/lua54.pyx", line 391, in lupa.lua54.LuaRuntime.reraise_on_exception
  File "lupa/lua54.pyx", line 2220, in lupa.lua54.py_object_getindex_with_gil
  File "lupa/lua54.pyx", line 2182, in lupa.lua54.getitem_for_lua
KeyError: 'branch'

Happens because of suite overrides kernel config over here:
https://github.com/ceph/ceph/blob/main/qa/cephfs/mount/kclient/overrides/distro/stock/k-stock.yaml

kernel:
  client:
    sha1: distro
ktype: distro

and similar:
https://github.com/ceph/ceph/blob/main/qa/cephfs/mount/kclient/overrides/distro/testing/k-testing.yaml

I can add another patch:

--- a/qa/cephfs/begin/3-kernel.yaml
+++ b/qa/cephfs/begin/3-kernel.yaml
@@ -12,7 +12,7 @@ teuthology:
     - |
       local kernel = py_attrgetter(yaml).get('kernel')
       if kernel ~= nil then
-        local branch = yaml.kernel.branch
+        local branch = py_attrgetter(yaml.kernel).get('branch')
         if branch and not yaml.kernel.branch:find "-all$" then
           log.debug("removing default kernel specification: %s", yaml.kernel)
           py_attrgetter(yaml.kernel).pop('branch', nil)

But I wonder do we need to keep kernel.client.sha1, I am not sure that we need to deploy specific kernel when we query -k none, but in this case, we need to check kernel.branch for presence when we add kernel task.
@batrick what do you think?

@kshtsk kshtsk force-pushed the wip-fix-lua-kernel-check branch from 7bf01d6 to cc2143d Compare January 19, 2026 17:16
@kshtsk
Copy link
Contributor Author

kshtsk commented Jan 20, 2026

@vshankar The run scheduled by teuthology crontabjob https://pulpito.ceph.com/teuthology-2025-12-01_20:24:04-fs-main-distro-default-smithi/ has 345 jobs.
If I use seed and subset from this run the --dry-run produces same amount of jobs:

teuthology-suite -v --suite fs --suite-repo https://github.com/kshtsk/ceph --suite-branch wip-fix-lua-kernel-check --arch x86_64 --subset 212/512 --seed 5746 -k none --ceph main --dry-run
.
.
.
2026-01-20 14:28:55,528.528 INFO:teuthology.suite.run:15/360 jobs were filtered out.
2026-01-20 14:28:55,528.528 INFO:teuthology.suite.run:Scheduled 345 jobs in total.

Same result without providing -k none option.

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, seems like running with -k none against main fs:thrash

Gives another KeyError:

  File "lupa/lua54.pyx", line 921, in lupa.lua54._LuaObject.__call__
  File "lupa/lua54.pyx", line 1886, in lupa.lua54.call_lua
  File "lupa/lua54.pyx", line 1912, in lupa.lua54.execute_lua_call
  File "lupa/lua54.pyx", line 391, in lupa.lua54.LuaRuntime.reraise_on_exception
  File "lupa/lua54.pyx", line 2220, in lupa.lua54.py_object_getindex_with_gil
  File "lupa/lua54.pyx", line 2182, in lupa.lua54.getitem_for_lua
KeyError: 'branch'

Happens because of suite overrides kernel config over here: https://github.com/ceph/ceph/blob/main/qa/cephfs/mount/kclient/overrides/distro/stock/k-stock.yaml

kernel:
  client:
    sha1: distro
ktype: distro

and similar: https://github.com/ceph/ceph/blob/main/qa/cephfs/mount/kclient/overrides/distro/testing/k-testing.yaml

I can add another patch:

--- a/qa/cephfs/begin/3-kernel.yaml
+++ b/qa/cephfs/begin/3-kernel.yaml
@@ -12,7 +12,7 @@ teuthology:
     - |
       local kernel = py_attrgetter(yaml).get('kernel')
       if kernel ~= nil then
-        local branch = yaml.kernel.branch
+        local branch = py_attrgetter(yaml.kernel).get('branch')
         if branch and not yaml.kernel.branch:find "-all$" then
           log.debug("removing default kernel specification: %s", yaml.kernel)
           py_attrgetter(yaml.kernel).pop('branch', nil)

I think the right solution is to update the k-stock and k-testing fragments to also set the branch to distro.

But I wonder do we need to keep kernel.client.sha1, I am not sure that we need to deploy specific kernel when we query -k none, but in this case, we need to check kernel.branch for presence when we add kernel task. @batrick what do you think?

Even if -k none is specified, the k-testing and k-stock yamls should set the kernel for client nodes.

if kernel.branch ~= "distro" then
if kernel == nil then
yaml_fragment.kernel = nil
elseif kernel.branch ~= "distro" then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe this is correct. This code exists to allow -k wip-kernel-branch to override the default of testing for this YAML fragment. If the scheduler specifies -k none, we still want to install the testing kernel for this fragment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, how is this supposed to work, usually, -k none is provided when there is no corresponding kernel build to avoid kernel task failing. Shouldn't we exclude this test case from the schedule list?

Copy link
Contributor Author

@kshtsk kshtsk Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyways, I just use this condition:

if kernel ~= nil and kernel.branch ~= "distro" then
.
.
.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, how is this supposed to work, usually, -k none is provided when there is no corresponding kernel build...

I don't know of a real situation where that would be true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may happen when a suite is scheduled against a build repo built outside of sepia lab.

py_attrgetter(yaml.kernel).pop('koji_task', nil)
py_attrgetter(yaml.kernel).pop('rpm', nil)
py_attrgetter(yaml.kernel).pop('sha1', nil)
py_attrgetter(yaml.kernel).pop('tag', nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick Wonder if the kernel itself should be popped out from yaml too?

Circling back: no. The reason is because the kernel config at the time this postmerge script runs will look something like (without -k):

with k-testing:

kernel:
  branch: distro
  ...
  client:
    branch: testing

or with k-stock:

kernel:
  branch: distro
  ...
  client:
    branch: distro

The point of this postmerge script is to avoid installing/updating the kernel on non-client nodes. It took me a while to remember this so please update the comment so it's clear in the future.

@kshtsk kshtsk force-pushed the wip-fix-lua-kernel-check branch from cc2143d to 2354fbd Compare January 27, 2026 21:45
When teuthology-suite is called with '-k none' option, which is valid,
there is no kernel record in job config created.

However at some test cases the lua premerge dies with exception:
  KeyError: 'kernel'
and when branch is not set for '-k none' and kernel client is
overridden:
  KeyError: 'branch'
so teuthology-suite quits unexpectedly without scheduling any jobs.

Fixes: https://tracker.ceph.com/issues/73676
Signed-off-by: Kyr Shatskyy <kyrylo.shatskyy@clyso.com>
@kshtsk kshtsk force-pushed the wip-fix-lua-kernel-check branch from 2354fbd to b410701 Compare February 2, 2026 12:47
@kshtsk
Copy link
Contributor Author

kshtsk commented Feb 2, 2026

Hey @batrick could you please take another look, thanks.

@kshtsk
Copy link
Contributor Author

kshtsk commented Feb 10, 2026

@vshankar are we good to merge it? I'd love to move forward with backports

@vshankar
Copy link
Contributor

@vshankar are we good to merge it? I'd love to move forward with backports

I'm fine with merging this. Please feel free to merge it or I will merge it by EOD.

@kshtsk kshtsk merged commit af57cb7 into ceph:main Feb 10, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants