Skip to content

add typing, choices and constraints support#63

Merged
evgeni merged 5 commits intomasterfrom
data-types
Apr 25, 2025
Merged

add typing, choices and constraints support#63
evgeni merged 5 commits intomasterfrom
data-types

Conversation

@evgeni
Copy link
Member

@evgeni evgeni commented Apr 9, 2025

No description provided.

@evgeni evgeni force-pushed the data-types branch 2 times, most recently from eee1717 to f59574d Compare April 9, 2025 07:29
@evgeni
Copy link
Member Author

evgeni commented Apr 9, 2025

For "if A passed then also B needs to be passed" I thought the easiest would be to add a new metadata field constraints and then have something like this:

constraints:
  required_together:
    - [A, B]
    - [C, D]
  required_one_of:
    - [E, F]
  mutually_exclusive:
    - [X, Y]

And the validation is then wired up when argparse is done parsing:

args = parser.parse_args(cliargs)

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks rather straight forward.

In Puppet we often use Integer[0] or String[1]. Have you thought about that? Would we create special types, like PositiveInteger and NonEmptyString?

Path type, accepting anything that looks like a path
"""
def validate(self, string):
return string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use PurePath to validate and normalize?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, possible.

don't look too closely at the individual implementations, there are mostly there to show "it works" ;)

@evgeni
Copy link
Member Author

evgeni commented Apr 9, 2025

In Puppet we often use Integer[0] or String[1]. Have you thought about that? Would we create special types, like PositiveInteger and NonEmptyString?

This is indeed something still on my in-brain todo list. Special types for these cases seem fine.

I wonder if we also want to have things like String[1,64] (aka: can't be longer than 64 chars), but that'd be harder to implement -- the dynamic "64" part that is.

@ekohl
Copy link
Member

ekohl commented Apr 9, 2025

I wonder if we also want to have things like String[1,64] (aka: can't be longer than 64 chars), but that'd be harder to implement -- the dynamic "64" part that is.

It's rare for strings:

$ rg 'String\[\d[,\d]' */manifests
puppet-foreman_proxy_content/manifests/init.pp
103:  Optional[String[50]] $pulpcore_django_secret_key = undef,

puppet-foreman/manifests/settings_fragment.pp
13:  String[2, 2] $order,

puppet-certs/manifests/foreman_proxy.pp
25:  String[2,2] $country = $certs::country,

puppet-certs/manifests/ca.pp
7:  String[2,2] $country = $certs::country,

puppet-certs/manifests/foreman.pp
11:  String[2,2] $country = $certs::country,

puppet-certs/manifests/candlepin.pp
15:  String[2,2] $country = $certs::country,

puppet-certs/manifests/apache.pp
57:  String[2,2] $country = $certs::country,

puppet-certs/manifests/puppet.pp
11:  String[2,2] $country = $certs::country,

puppet-certs/manifests/init.pp
77:  String[2,2] $country = 'US',

It's more common for integers, but even those are rare:

$ rg 'Integer\[\d+,' */manifestspuppet-dns/manifests/logging/channel.pp
37:  Integer[51, 59] $order                             = 51,

puppet-dns/manifests/logging/category.pp
10:  Integer[51, 59] $order = 55,

puppet-foreman_proxy_content/manifests/init.pp
112:  Optional[Integer[1,100]] $pulpcore_import_workers_percent = undef,

puppet-foreman_proxy/manifests/init.pp
376:  Integer[0, 255] $dhcp_load_split = 255,

puppet-dhcp/manifests/failover.pp
36:  Variant[Integer[0, 65535], String] $port = 519,

puppet-puppet/manifests/init.pp
606:  Optional[Integer[0,23]] $run_hour = undef,
607:  Variant[Integer[0,59], Array[Integer[0,59]], Undef] $run_minute = undef,

puppet-puppet/manifests/agent/service/cron.pp
5:  Optional[Integer[0,23]] $hour                                = undef,
6:  Variant[Integer[0,59], Array[Integer[0,59]], Undef] $minute  = undef,

puppet-puppet/manifests/agent/service/systemd.pp
5:  Optional[Integer[0,23]] $hour                                = undef,
6:  Variant[Integer[0,59], Array[Integer[0,59]], Undef] $minute  = undef,

puppet-foreman/manifests/dynflow/pool.pp
31:    Integer[1, $instances].each |$n| {

puppet-foreman/manifests/repos/yum.pp
9:  Variant[Integer[0, 99], Enum['absent']] $priority = 'absent',

puppet-foreman/manifests/plugin/supervisory_authority.pp
35:  Integer[0,5]                 $log_level             = 1,

puppet-pulpcore/manifests/init.pp
270:  Optional[Integer[1,100]] $import_workers_percent = undef,

puppet-pulpcore/manifests/service.pp
42:  Integer[1, $pulpcore::worker_count].each |$n| {

Various are internal implementation details, one really should be Stdlib::Port. But there certainly are a few here.

@evgeni
Copy link
Member Author

evgeni commented Apr 9, 2025

Variant[Integer[0, 99], Enum['absent']] $priority = 'absent',

yikes :)

@ekohl
Copy link
Member

ekohl commented Apr 9, 2025

Let's not manage yum repos in this installer, shall we? ;)

@evgeni
Copy link
Member Author

evgeni commented Apr 9, 2025

Yepp. From yesterdays "perforce's apt key expired" discussion in the vox channel:

<Willem Basson> does theforeman-puppet have a solution for using replacing the key?
< Zhenech> yes, by not having any repo-setup backed in at all ;-)

@evgeni evgeni force-pushed the data-types branch 2 times, most recently from 7e61977 to dd1a46d Compare April 9, 2025 14:29
@evgeni evgeni force-pushed the data-types branch 2 times, most recently from ba343eb to a351b41 Compare April 10, 2025 05:47
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty to push a commit that uses a registry. Feel free to drop it if you don't like it.

@evgeni
Copy link
Member Author

evgeni commented Apr 10, 2025

I am not sure I like the registry -- the real one is in argparse anyway, so it seems double-registring… on the other hand, if it makes thing easier…

@ekohl
Copy link
Member

ekohl commented Apr 10, 2025

I am not sure I like the registry -- the real one is in argparse anyway, so it seems double-registring… on the other hand, if it makes thing easier…

I think it can make testing easier

on multiple lines

with an explicit newline
variables:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the Ansible variables being defined or the input parameters? If they are intended to be the latter, is it possible to change this nomenclature? I'm feeling the overuse of the term variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ansible vars. the parameter name is identical, unless you override it, like here:

mapped:
parameter: --explicit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you thinking about parameters that would map to multiple variables? Best example I could think of is something like --database-host where eventually it needs to map to foreman_database_host, candlepin_database_host, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, and it translates _ in vars to - in params, by default:

obsah/obsah/__init__.py

Lines 120 to 123 in 0d403fd

try:
parameter = options['parameter']
except KeyError:
parameter = '--{}'.format(name.removeprefix(namespace).replace('_', '-'))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you thinking about parameters that would map to multiple variables? Best example I could think of is something like --database-host where eventually it needs to map to foreman_database_host, candlepin_database_host, etc.

I'd thought of doing a variable/jinja layer. So the user gets --database-host that is translated to database_host and the others are defined as project_database_host: "{{ database_host }}"

(we have this aliasing in kafo too, we just don't use it)

choices:
- journal
- file
constraints:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice and intuitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constraint won't handle when a parameter's value determines mutually exclusive or required together.

I put this into the prototype (https://github.com/theforeman/foreman-quadlet/pull/131/files#diff-43fef5593cf5b8e6bad9b75f2bc286a354420a4d9891ff6c521d7192236c16f0R42) and that got me thinking about the database parameters. When we want to configure an external database we need to require the user specify at least database host. If we go with a parameter named --database-mode where the value of it determines the state, then we need some way to say that two parameters are required together but only if the value of that parameter is X. Similarly to be able to say if --database-mode is internal, then the user should not be able to set any of the other database parameters except maybe password.

Copy link
Member Author

@evgeni evgeni Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Ansible (where I took the naming and the syntax from), there is required_if, that does what you want. We certainly could implement that too. Then you'd write something like

required_if:
  - ['database_mode', 'external', ['database_host', 'database_port', …]]

There is no reverse of this in Ansible (as in, if database_mode is internal, you can't forbid setting host and port). But we could do something own (forbidden_if?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented required_if and forbidden_if.

@evgeni evgeni force-pushed the data-types branch 3 times, most recently from b792c70 to baf71e2 Compare April 11, 2025 06:34
Comment on lines +74 to +78
for include in data.get('include', []):
include_path = os.path.join(self.application_config.playbooks_path(), include, self.application_config.metadata_name())
include_data = self._load_metadata_file(include_path)
data['variables'] = data.get('variables', {}) | include_data.get('variables', {})
data['constraints'] = data.get('constraints', {}) | include_data.get('constraints', {})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am abusing this PR for more experiments than just "typing", but here we are…

@ehelms wanted to have a way to have a composable CLI, that allows things like:

# mycli step1
# mycli step2
# mycli step3

but also

# mycli justdoit (runs steps 1, 2 and 3 under the hood)

today, to implement that, you'd create step1, step2 and step3 as individual playbook+metadata sets, but for justdoit you'd need to duplicate all the metadata -- especially all the variable definitions. (the same is of course true for playbooks, but these usually have a single role invocation anyway)

duplicating things sounds dangerous (what if we forget to expose a new var in one of the copies…)

the alternative is to provide "includes" for the metadata, in a similar fashion how you can include playbooks and roles, which these lines implement.

it intentionally doesn't do recursive inclusion and right now the included data wins over the original definition (if any).

is that too crazy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would need to see an example of a command and then an includes command. Is it justdoit including the sub-commands? or the other way around?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so assuming we deploy foreman, we have three steps: certs, foreman, proxy, you'd have something like this:

% tree
.
└── playbooks
    ├── certs
    │   ├── certs.yaml
    │   └── obsah.metadata.yaml
    ├── foreman
    │   ├── foreman.yaml
    │   └── obsah.metadata.yaml
    ├── install
    │   ├── install.yaml
    │   └── obsah.metadata.yaml
    └── proxy
        ├── obsah.metadata.yaml
        └── proxy.yaml

% for f in  playbooks/*/*yaml; do echo "=== ${f} ==="; cat ${f}; done
=== playbooks/certs/certs.yaml ===
- hosts: localhost
  roles:
    - certs
=== playbooks/certs/obsah.metadata.yaml ===
help: certs generation
variables:
  …
=== playbooks/foreman/foreman.yaml ===
- hosts: localhost
  roles:
    - foreman
=== playbooks/foreman/obsah.metadata.yaml ===
help: install foreman
variables:
  …
=== playbooks/install/install.yaml ===
- hosts: localhost
  roles:
    - certs
    - foreman
    - proxy
=== playbooks/install/obsah.metadata.yaml ===
help: perform all installation steps
include:
  - certs
  - foreman
  - proxy
=== playbooks/proxy/obsah.metadata.yaml ===
help: install proxy
variables:
  …
=== playbooks/proxy/proxy.yaml ===
- hosts: localhost
  roles:
    - proxy

so, the "justdoit" (here "install") includes the others

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That reads reasonable to me. This is only composing metadata right? Would this mean everything needs to be encapsulated into roles, which means meta roles? I am thinking about how https://github.com/theforeman/foreman-quadlet/pull/119/files#diff-3413150b8dd202866a8852de783336a13f83155768b86fc4f353dcae2ae7f4df got structured where the playbook is split across the functions and has variable definitions in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building on that thought, how would you think about handling cases where the complexity of what's happening feels wrong to duplicate in two playbooks? For example, duplicating this logic for handling certificates feels error prone:

---
- name: Generate default certificates
  hosts:
    - quadlet
  become: true
  vars_files:
    - "../../vars/{{ certificate_source }}_certificates.yml"
  roles:
    - role: certificates
      when: "certificate_source == 'default'"
    - role: foreman_installer_certs
      when: "certificate_source == 'installer'"
    - role: certificate_checks
      vars:
        certificate_checks_certificate: "{{ server_certificate }}"
        certificate_checks_key: "{{ server_key }}"
        certificate_checks_ca: "{{ ca_certificate }}"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need meta roles, no, you can still have a "hand full" in each playbook, overlapping.
Or we try import_playbook in install.yml (I didn't yet).

As for the complex example:

  • I'd put the vars in a separate vars file that maps input names to role names (esp when we have multiple roles using the same input (think ext-db as my primary example)
  • the when is more complicated -- probably import_playbook is the more correct answer here

@ehelms
Copy link
Member

ehelms commented Apr 17, 2025

Have you thought about default values? An example I was thinking through was where it the right place to define a default database password:

  1. Ansible level (e.g. https://github.com/theforeman/foreman-quadlet/pull/134/files#diff-3263dcfdc300cc5d099f21680fce29190e77d616f9e2f7df1b881f6edc6bc26fR2)
  2. Default value at the CLI level
  3. Require user/automation/tooling to have to provide it as an input

I feel like the Ansible level will require an "answer file" like object to exist to prevent overriding a users input.
The default value at the CLI level means if a parameter is not supplied, it will use the default but otherwise not override any previous input, maybe?
The third option means there will be more things that are always required but tooling can wrap that. This can make the automation/testing/happy path harder but the actions more predictable. As in, nothing overrides a value unless it is given as input.

@evgeni
Copy link
Member Author

evgeni commented Apr 22, 2025

Have you thought about default values? An example I was thinking through was where it the right place to define a default database password:

I have not, no. There is a task "look into this" on our todo list, but it hasn't been touched yet.

@ehelms
Copy link
Member

ehelms commented Apr 24, 2025

Three other thoughts that came to mind:

  • I think we talked about grouping before and that would be useful thinking about database parameters.
  • I wonder if we should go with a similar help and full help model here. This can be figured out later.

Can we simplify the output to make the help more streamlined (this is cosmetic)? Looking at the example, I am not sure I understand the point of CANDLEPIN_DATABASE_NAME showing up in the help and forcing it to be multiline.

  --candlepin-database-name CANDLEPIN_DATABASE_NAME
                        Name of the Candlepin database.
  --candlepin-database-password CANDLEPIN_DATABASE_PASSWORD
                        Password for the Candlepin database.
  --candlepin-database-user CANDLEPIN_DATABASE_USER
                        User of the Candlepin database.

I would expect more of a:

  --candlepin-database-name        Name of the Candlepin database.
  --candlepin-database-password    Password for the Candlepin database.
  --candlepin-database-user        User of the Candlepin database.

@evgeni
Copy link
Member Author

evgeni commented Apr 24, 2025

Three other thoughts that came to mind:

  • I think we talked about grouping before and that would be useful thinking about database parameters.

We did, but in a different context ("all of the params in the group need to be set", "one of the params in the group needs to be set"), and ended not implementing it as the "naive" constraints worked quite well.
I read your question more like visual grouping?

  • I wonder if we should go with a similar help and full help model here. This can be figured out later.

I always hated that part of the installer cli :)

Can we simplify the output to make the help more streamlined (this is cosmetic)? Looking at the example, I am not sure I understand the point of CANDLEPIN_DATABASE_NAME showing up in the help and forcing it to be multiline.

  --candlepin-database-name CANDLEPIN_DATABASE_NAME
                        Name of the Candlepin database.
  --candlepin-database-password CANDLEPIN_DATABASE_PASSWORD
                        Password for the Candlepin database.
  --candlepin-database-user CANDLEPIN_DATABASE_USER
                        User of the Candlepin database.

I would expect more of a:

  --candlepin-database-name        Name of the Candlepin database.
  --candlepin-database-password    Password for the Candlepin database.
  --candlepin-database-user        User of the Candlepin database.

I can look whether this can be suppressed, but that's actually help/manpage convention to denote params that need a value (vs flag style things).

@evgeni evgeni force-pushed the data-types branch 2 times, most recently from d0f8fc3 to 5e6cc1f Compare April 24, 2025 09:48
@ehelms
Copy link
Member

ehelms commented Apr 24, 2025

I read your question more like visual grouping?

Yes.

  • I wonder if we should go with a similar help and full help model here. This can be figured out later.

I always hated that part of the installer cli :)

Fair - I just wonder about how we do not overwhelm but we can cross that bridge once we see the full extent of options.

I can look whether this can be suppressed, but that's actually help/manpage convention to denote params that need a value (vs flag style things).

Ahh interesting. Just poking around I see a few different styles for this:

  --keytab=FILE         specify a Kerberos keytab to use
  --principal=PRINCIPAL
                        specify a Kerberos principal to use
  -c <command>		Execute <command> after loading the first file
   -S <session>		Source file <session> after loading the first file

@ehelms ehelms mentioned this pull request Apr 24, 2025
@evgeni evgeni force-pushed the data-types branch 2 times, most recently from 4399f4b to cced844 Compare April 25, 2025 10:58
@evgeni evgeni changed the title add simple typing support add typing, choices and constraints support Apr 25, 2025
@evgeni evgeni marked this pull request as ready for review April 25, 2025 11:02
@evgeni evgeni merged commit 89dc194 into master Apr 25, 2025
13 checks passed
@evgeni evgeni deleted the data-types branch April 25, 2025 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants