Conversation

@johha (Contributor) commented Feb 22, 2022

@johha johha force-pushed the job-prio branch 3 times, most recently from 6f3c000 to f1d2555 Compare February 25, 2022 11:08
@johha johha marked this pull request as ready for review February 28, 2022 08:06
@philippthun philippthun self-assigned this Feb 28, 2022
@johha johha requested a review from philippthun March 2, 2022 10:11
johha added 2 commits March 3, 2022 10:29
With this change, the job priority can be overridden in the cloud controller
config, so operators are able to prioritize e.g. cf push over service broker
operations.

It is implemented via an optional 'priorities' config parameter, so it does not
change the current behavior/priorities. The parameter is a hash that maps a
job's display_name to its overridden priority (an integer). An example
configuration could look like this:
  jobs:
    global:
      timeout_in_seconds: 14400
    priorities:
      space.apply_manifest: -10
      buildpack.delete: 10

The priority can also be a negative integer, which is supported by the
delayed_job gem. Negative values are now also handled correctly in
`deprioritize_job()`.
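Based on the doubling behavior described in this PR, a minimal sketch of the deprioritization logic might look like the following; the method body is an illustration of the described behavior, not ccng's actual implementation:

```ruby
# Illustrative sketch (not ccng's actual code) of deprioritizing a job:
# the priority value is doubled on each failure, and negative (prioritized)
# values are first clamped back to 0 so the job rejoins the default range.
def deprioritize_job(priority)
  return 0 if priority.negative? # a prioritized job loses its head start first
  return 1 if priority.zero?     # doubling 0 would pin the job at 0 forever
  priority * 2
end
```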

The following display names currently exist in ccng:
"service_binding.delete"
"organization.delete"
"space.delete"
"service_instance.delete"
"service_key.delete"
"service_key.delete"
"service_key.delete"
"app_model.delete"
"buildpack.delete"
"domain.delete"
"droplet_model.delete"
"droplet_model.delete"
"quota_definition.delete"
"packages_model.delete"
"role.delete"
"route.delete"
"security_group.delete"
"service_broker.delete"
"space_quota_definition.delete"
"user.delete"
"space.apply_manifest"
"admin.clear_buildpack_cache"
"service_instance.create"
"service_bindings.create"
"buildpack.upload"
"space.delete_unmapped_routes"
"service_keys.delete"
"service_instance.update"
"service_route_bindings.create"
"service_route_bindings.delete"
"service_keys.create"
"droplet.upload"
"service_bindings.delete"
"service_broker.catalog.synchronize"
"service_broker.update"
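The example configuration above can be exercised with a short sketch; `priority_for` and the way the config hash is consulted here are assumptions for illustration, not ccng's actual API:

```ruby
require 'yaml'

# Parse a config fragment shaped like the example above.
config = YAML.safe_load(<<~YAML)
  jobs:
    global:
      timeout_in_seconds: 14400
    priorities:
      space.apply_manifest: -10
      buildpack.delete: 10
YAML

# Hypothetical helper: look up a job's priority override by display_name,
# falling back to the default when no override is configured.
def priority_for(config, display_name, default: 0)
  overrides = config.dig('jobs', 'priorities') || {}
  overrides.fetch(display_name, default)
end
```

Jobs without an entry in `priorities` keep their default, which matches the claim that the optional parameter does not change current behavior.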
@sethboyles (Member) commented:
Any idea if jobs with 'higher' priority (e.g. priority 0 or 1) could indefinitely prevent jobs with lower priority (e.g. a job with priority 10) from running? Is that something you all have tested or are concerned about? We read a little bit of the delayed_job README but didn't see a definitive answer there.

@johha (Contributor, Author) commented Mar 4, 2022

> Any idea if jobs with 'higher' priority (e.g. priority 0 or 1) could indefinitely prevent jobs with lower priority (e.g. a job with priority 10) from running? Is that something you all have tested or are concerned about? We read a little bit of the delayed_job README but didn't see a definitive answer there.

Since the query includes a run_at filter, most jobs are filtered out anyway, so this doesn't seem to be an issue:

```sql
SELECT * FROM "delayed_jobs"
WHERE ((("run_at" <= '2022-03-04 07:36:03.671035+0000') AND ("locked_at" IS NULL))
       OR ("locked_at" < '2022-03-04 03:36:03.671035+0000')
       OR ("locked_by" = 'cc_global_worker.cc-worker.10.6'))
  AND ("failed_at" IS NULL)
  AND ("queue" IN ('cc-generic', 'app_usage_events', 'audit_events', 'failed_jobs',
                   'service_usage_events', 'completed_tasks', 'expired_blob_cleanup',
                   'expired_resource_cleanup', 'expired_orphaned_blob_cleanup',
                   'orphaned_blobs_cleanup', 'pollable_job_cleanup', 'pending_droplets',
                   'pending_builds', 'prune_completed_deployments', 'prune_completed_builds',
                   'prune_excess_app_revisions', 'request_counts_cleanup'))
ORDER BY "priority" ASC, "run_at" ASC
LIMIT 1 FOR UPDATE
```

Also, if a job has multiple attempts, its priority decreases (the value is doubled on each failure: 0 → 1 → 2 → 4 → ...) and its run_at increases exponentially:

`time + (attempts**4) + 5`

This means that we already have jobs with lower priorities today, and those are still picked up eventually.
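The reschedule window quoted above can be sketched as follows; the helper name is illustrative, the formula is the one quoted in this thread:

```ruby
# Next run time after a failed attempt: now + attempts**4 + 5 seconds,
# i.e. the exponential backoff quoted above.
def next_run_at(now, attempts)
  now + (attempts**4) + 5
end

# e.g. attempt 1 retries after 6s, attempt 2 after 21s, attempt 3 after 86s
```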

However, our plan is to use negative priorities (meaning more important) for certain jobs like cf push, so that we avoid changing the current priorities and defaults too much. If a job with a negative priority fails (and has retries left), its next priorities would be 0, 1, 2, ... like for all other jobs.
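Putting the two behaviors together: a job configured with a negative priority that keeps failing rejoins the default sequence after its first retry. The doubling rule below restates the behavior described in this thread and is an assumption, not ccng's actual code:

```ruby
# Assumed deprioritization rule: clamp negatives to 0, bump 0 to 1,
# otherwise double the value.
deprioritize = lambda do |priority|
  next 0 if priority.negative?
  next 1 if priority.zero?
  priority * 2
end

# Priority progression of a job that starts at -10 and fails repeatedly.
sequence = [-10]
4.times { sequence << deprioritize.call(sequence.last) }
# sequence is now [-10, 0, 1, 2, 4]
```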

@philippthun philippthun merged commit ea7029a into cloudfoundry:main Mar 7, 2022
will-gant pushed a commit to sap-contributions/cloud_controller_ng that referenced this pull request Dec 16, 2022