Skip to content

Make custom remote control commands available in CLI#8489

Merged
thedrow merged 4 commits intocelery:mainfrom
tobinus:fix-cmd-cli-for-custom-cmds
Feb 28, 2024
Merged

Make custom remote control commands available in CLI#8489
thedrow merged 4 commits intocelery:mainfrom
tobinus:fix-cmd-cli-for-custom-cmds

Conversation

@tobinus
Copy link
Contributor

@tobinus tobinus commented Sep 7, 2023

Fixes #6608

Description

The celery inspect COMMAND and celery control COMMAND CLI subcommands let users run remote commands against any currently running worker. However, you were limited to running the remote control commands that were pre-defined with Celery. Any custom inspect or control commands specified by the user would be rejected.

The reason for this is the fact that the Click configuration is done at import-time. The user's custom inspect and control commands had yet to be registered by the time the top-level code in the celery/bin/control.py module was evaluated.

This pull request:

  • Moves the validation of the remote control command name from Click to our own code, at which point the user's custom remote control commands have been registered
  • Adds a --list option for seeing all available remote control commands of the relevant type, along with their signature and help summary
  • Adds unit tests for verifying that the inspect and control CLI subcommands accept custom remote control command names and the --list option

I have not been able to find a way to run the tests successfully on my computer, so I'll need to use the GitHub Actions workflow to help me test.

@tobinus
Copy link
Contributor Author

tobinus commented Sep 7, 2023

@auvipy Here's a draft pull request for fixing #6608, as per your comment there. (It may be a bit more finalized than "proof of concept", though, haha)

@Nusnus Nusnus self-requested a review September 7, 2023 15:01
@Nusnus
Copy link
Member

Nusnus commented Sep 7, 2023

@tobinus

I have not been able to find a way to run the tests successfully on my computer, so I'll need to use the GitHub Actions workflow to help me test.

What troubles did you have running the tests locally?

@tobinus
Copy link
Contributor Author

tobinus commented Sep 8, 2023

@Nusnus

What troubles did you have running the tests locally?

Thank you for asking! I couldn't get the Docker image to build, using Podman. It fails on this step due to OSError: [Errno 18] Invalid cross-device link: '/home/developer/.pyenv/versions/3.8.18/lib/python3.8/site-packages/pip/' -> '/home/developer/.pyenv/versions/3.8.18/lib/python3.8/site-packages/~ip'. As far as I can tell, this is a kernel bug or a Podman bug, so I'm stuck. I did find a bug report in Podman, but that should have been fixed already.

EDIT: I fixed the problem described below.

As for running them locally: I thought there was something wrong with my setup, since the tests just end up hanging at t/unit/bin/test_worker.py::test_cli. But they do that in GitHub Actions as well. I have no idea why that is, I haven't touched that file at all. The tests I wrote all pass when I run them alone, but some of them fail when running them alongside the other tests, so there seems to be some issues with state carrying over between tests.

@tobinus
Copy link
Contributor Author

tobinus commented Sep 8, 2023

The worker in t/unit/bin/test_worker::test_cli ended up ignoring the configured broker settings when the test_control.py had run beforehand. I think this is caused by celery modifying os.environ when it is given the command line option --broker memory://. I fixed this issue by resetting os.environ after the new tests.

Now on to the problem of why additional command line parameters are being sent when I run all the tests. It is possible to work around this problem by adding **kwargs as a parameter for control, just like inspect already has, but I'd like to understand the problem a bit better.

@tobinus
Copy link
Contributor Author

tobinus commented Sep 8, 2023

Ok, I see the problem. The t/unit/app/test_preload_cli.py test uses app.user_options['preload']. When it does, Celery adds those options to the function objects themselves:

    # User options
    worker.params.extend(ctx.obj.app.user_options.get('worker', []))
    beat.params.extend(ctx.obj.app.user_options.get('beat', []))
    events.params.extend(ctx.obj.app.user_options.get('events', []))

    for command in celery.commands.values():
        command.params.extend(ctx.obj.app.user_options.get('preload', []))

worker.params, beat.params, events.params and *.params are all global state, and are not reset between the tests. So the issue I'm seeing is two-fold:

  1. The preload test makes changes to global state that carry over to all later tests. I have an idea of how to fix that. I've fixed this.
  2. If the user specifies preload options, subcommands may end up not working if they're not equipped with **kwargs. So the test failure may showcase a real bug?

I could probably just add **kwargs to control's argument list, but it feels a bit fragile to require that all Click subcommands should accept **kwargs.

@tobinus
Copy link
Contributor Author

tobinus commented Sep 8, 2023

The tests should run now. However, if you define your own app.user_options['preload'], you won't be able to run celery control ... since it does not have **kwargs in its argument list. Sometime next week, I'll look into which solution is better:

  1. Modify celery.bin.base.handle_preload_options so that it doesn't pass preload options along to the decorated function, eliminating the need for **kwargs (except for the commands that accept user options outside of the preload options)
  2. Keep the requirement that all Celery subcommands must accept **kwargs, and add it to any subcommands currently missing it (like control)

I'm currently leaning towards solution 1.

@codecov
Copy link

codecov bot commented Sep 8, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.16% 🎉

Comparison is base (b6a5bdb) 87.44% compared to head (79266ae) 87.60%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8489      +/-   ##
==========================================
+ Coverage   87.44%   87.60%   +0.16%     
==========================================
  Files         148      148              
  Lines       18491    18522      +31     
  Branches     3156     3164       +8     
==========================================
+ Hits        16169    16226      +57     
+ Misses       2033     2005      -28     
- Partials      289      291       +2     
Flag Coverage Δ
unittests 87.57% <100.00%> (+0.16%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
celery/bin/control.py 73.50% <100.00%> (+33.96%) ⬆️

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@auvipy auvipy self-requested a review September 9, 2023 07:08
Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for handling this and refactoring the old tests. My primary concern here is if we are changing any existing behavior or functionality to implement the new one or fix anything, we need to be very careful about that. Might also need to consider splitting the relevant changes if needed. I will need some time to thoroughly check and review it.

@dejlek
Copy link
Contributor

dejlek commented Jan 17, 2024

This is superimportant for me as my custom inspect/control commands are not working since Celery moved to Click... :(

@auvipy
Copy link
Member

auvipy commented Jan 17, 2024

We are considering it for 5.4 release. Can you try and review it.

@dejlek
Copy link
Contributor

dejlek commented Jan 22, 2024

My primary concern here is if we are changing any existing behavior or functionality to implement the new one or fix anything, we need to be very careful about that.

I wish you did have this concern when decision was made to switch to Click...

@Nusnus
Copy link
Member

Nusnus commented Jan 22, 2024

My primary concern here is if we are changing any existing behavior or functionality to implement the new one or fix anything, we need to be very careful about that.

I wish you did have this concern when decision was made to switch to Click...

Cutting away the context does not remove it. He was referring to the changes of this PR, so it's a bit misleading my friend.

The change to Click had its winning arguments at the time, so the most useful attitude now is to just keep moving forward.

We do heavily consider meaningful changes but we can't please everyone at the same time so it's better to adjust the expectations instead.

@auvipy
Copy link
Member

auvipy commented Jan 23, 2024

My primary concern here is if we are changing any existing behavior or functionality to implement the new one or fix anything, we need to be very careful about that.

I wish you did have this concern when decision was made to switch to Click...

well, yes we have lot of issues in the implementation TBH, but design wise it was a good move. actually we need to work on making it parity with old features so. so i request having bit more patience and also if you please collaborate in that, that will be really helpful. but thanks for raising the issue.

@thedrow thedrow merged commit 582e169 into celery:main Feb 28, 2024
@fekete42
Copy link

fekete42 commented Mar 8, 2024

Hi,

I've faced this issue recently. I think I managed to find a relatively simple solution. I think the root cause of the problem is that options obtained from Panel.meta are extracted when the module celery.bin.control is loaded, not when it's used.

My idea would be to define a new type, which returns the proper custom control commands dynamically:

from click.types import ParamType

class PanelMetaChoiceType(ParamType):
    """Panel meta option."""

    name = "panel_meta_choice"

    def convert(self, value, param, ctx):
        return click.Choice([name for name, info in Panel.meta.items()
                             if info.type == 'inspect' and info.visible]).convert(value, param, ctx)

PANEL_META_CHOICE_TYPE = PanelMetaChoiceType()

....

@click.command(cls=CeleryCommand,
               context_settings={'allow_extra_args': True})
@click.argument("action", type=PANEL_META_CHOICE_TYPE )
@click.option('-t',
              '--timeout',
              cls=CeleryOption,
              type=float,
              default=1.0,
              help_group='Remote Control Options',
              help='Timeout in seconds waiting for reply.')
@click.option('-d',
              '--destination',
              cls=CeleryOption,
              type=COMMA_SEPARATED_LIST,
              help_group='Remote Control Options',
              help='Comma separated list of destination node names.')
@click.option('-j',
              '--json',
              cls=CeleryOption,
              is_flag=True,
              help_group='Remote Control Options',
              help='Use json as output format.')
@click.pass_context
@handle_preload_options
def inspect(ctx, action, timeout, destination, json, **kwargs):
    """Inspect the worker at runtime.
...
...

I hope you still use this idea useful even if this came a bit late.

Best regards,
Attila

@auvipy
Copy link
Member

auvipy commented Mar 12, 2024

@fekete42 can you open a draft PR, attila?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Celery 5 custom inspect commands doesn't work in the CLI

5 participants