I'm trying to add some custom metrics using the perform_job.good_job notification, but I'm finding that it's not obvious how to distinguish between successful completions and discards, and I wonder if this is an oversight?
The code in question is this:
|
ActiveSupport::Notifications.instrument("perform_job.good_job", { job: self, execution: execution, process_id: current_thread.process_id, thread_name: current_thread.thread_name }) do |instrument_payload| |
|
value = ActiveJob::Base.execute(active_job_data) |
|
|
|
if value.is_a?(Exception) |
|
handled_error = value |
|
value = nil |
|
end |
|
handled_error ||= current_thread.error_on_retry || current_thread.error_on_discard |
|
|
|
error_event = if handled_error == current_thread.error_on_discard |
|
:discarded |
|
elsif handled_error == current_thread.error_on_retry |
|
:retried |
|
elsif handled_error == current_thread.error_on_retry_stopped |
|
:retry_stopped |
|
elsif handled_error |
|
:handled |
|
end |
|
|
|
instrument_payload.merge!( |
|
value: value, |
|
handled_error: handled_error, |
|
retried: current_thread.retried_job.present?, |
|
error_event: error_event |
|
) |
|
ExecutionResult.new(value: value, handled_error: handled_error, error_event: error_event, retried_job: current_thread.retried_job) |
Specifically the issue is here (annotated ##):
## If the job execution succeeds, `handled_error` is `nil`, since so is
## `current_thread.error_on_retry || current_thread.error_on_discard`.
handled_error ||= current_thread.error_on_retry || current_thread.error_on_discard
error_event = if handled_error == current_thread.error_on_discard ## nil == nil
:discarded ## error_event = :discarded
elsif handled_error == current_thread.error_on_retry
:retried
elsif handled_error == current_thread.error_on_retry_stopped
:retry_stopped
elsif handled_error
:handled
end
due to the logic here, error_event ends up :discarded even if there was no error, and this is what ends up on payload[:error_event]:
(byebug) event.payload
{:job=>#<GoodJob::Job id: "32da31bc-10e5-422f-9458-8ca96830baf6", queue_name: "default", priority: 0, serialized_params: {"job_id"=>"32da31bc-10e5-422f-9458-8ca96830baf6", "locale"=>"en", "priority"=>nil, "timezone"=>"UTC", "arguments"=>[5], "job_class"=>"TestJob", "executions"=>0, "queue_name"=>"default", "enqueued_at"=>"2025-02-21T17:57:22Z", "provider_job_id"=>nil, "exception_executions"=>{}}, scheduled_at: "2025-02-21 17:57:22.786207000 +0000", performed_at: "2025-02-21 17:57:22.821524000 +0000", finished_at: nil, error: nil, created_at: "2025-02-21 17:57:22.786207000 +0000", updated_at: "2025-02-21 17:57:22.830189000 +0000", active_job_id: "32da31bc-10e5-422f-9458-8ca96830baf6", concurrency_key: nil, cron_key: nil, cron_at: nil, batch_id: nil, batch_callback_id: nil, executions_count: 1, job_class: "TestJob", error_event: nil, labels: nil, locked_by_id: "b49f31a7-9a91-4642-a2c1-ef6db94f3305", locked_at: "2025-02-21 17:57:22.821696000 +0000">, :execution=>#<GoodJob::Execution id: "a568dbe9-a7a0-426f-991e-7952076362c2", created_at: "2025-02-21 17:57:22.821524000 +0000", updated_at: "2025-02-21 17:57:22.822639000 +0000", active_job_id: "32da31bc-10e5-422f-9458-8ca96830baf6", job_class: "TestJob", queue_name: "default", serialized_params: {"job_id"=>"32da31bc-10e5-422f-9458-8ca96830baf6", "locale"=>"en", "priority"=>nil, "timezone"=>"UTC", "arguments"=>[5], "job_class"=>"TestJob", "executions"=>0, "queue_name"=>"default", "enqueued_at"=>"2025-02-21T17:57:22Z", "provider_job_id"=>nil, "exception_executions"=>{}}, scheduled_at: "2025-02-21 17:57:22.786207000 +0000", finished_at: nil, error: nil, error_event: nil, error_backtrace: nil, process_id: "b49f31a7-9a91-4642-a2c1-ef6db94f3305", duration: nil>, :process_id=>32948, :thread_name=>"GoodJob::Scheduler(queues=* max_threads=5)-thread-1", :value=>nil, :handled_error=>nil, :retried=>false, :error_event=>:discarded}
As a work-around you can instead check that handled_error (and unhandled_error?) are nil (which is what the following code in fact does) but this is not super intuitive.
Calling the helper methods on payload[:job] or payload[:execution] doesn't help either, since the fields haven't been updated yet.
Have you a recommendation?
I'm trying to add some custom metrics using the
perform_job.good_jobnotification, but I'm finding that it's not obvious how to distinguish between successful completions and discards, and I wonder if this is an oversight?The code in question is this:
good_job/app/models/good_job/job.rb
Lines 628 to 653 in b557525
Specifically the issue is here (annotated
##):due to the logic here,
error_eventends up:discardedeven if there was no error, and this is what ends up onpayload[:error_event]:As a work-around you can instead check that
handled_error(andunhandled_error?) arenil(which is what the following code in fact does) but this is not super intuitive.Calling the helper methods on
payload[:job]orpayload[:execution]doesn't help either, since the fields haven't been updated yet.Have you a recommendation?