fix(azure): Improve error handling in ingest pipeline on_failure #17176

Merged
andrewkroh merged 6 commits into elastic:main from andrewkroh:azure/fix/pipeline_error-on-failure
Mar 10, 2026

Conversation

Member

@andrewkroh andrewkroh commented Jan 30, 2026

Proposed commit message

Ensures consistent error handling across all Azure ingest pipelines when
processor or pipeline failures occur. Sets event.kind to pipeline_error,
adds the preserve_original_event tag, and handles processor-level failures
that don't trigger pipeline-level on_failure handlers.

This ensures that failed events are properly classified and have their
original data preserved for troubleshooting, regardless of whether the
failure occurs at the processor or pipeline level.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

@andrewkroh andrewkroh force-pushed the azure/fix/pipeline_error-on-failure branch from 44525e2 to 46a32cc January 30, 2026 16:32
@andrewkroh andrewkroh added the bugfix and Integration:azure labels Jan 30, 2026
@andrewkroh andrewkroh marked this pull request as ready for review January 30, 2026 16:36
@andrewkroh andrewkroh requested review from a team as code owners January 30, 2026 16:36
Update ingest pipeline on_failure handlers to set event.kind to
pipeline_error per best practices. This change updates the 23
pipelines that were missing this processor. The graphactivitylogs and
signinlogs default.yml pipelines already had it and were not modified.

The error.message format was also updated to follow the guidance from
https://github.com/elastic/integrations/wiki/Fleet-Package-Code-Review-Comments#pipeline-on_failure-handler-must-set-errormessage
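
For illustration, a pipeline-level handler following this commit's description might look roughly like the sketch below. It is assembled from the text above rather than copied from the diff: the processor order and the exact message template are assumptions. The `_ingest.on_failure_*` metadata fields are the ones Elasticsearch exposes inside on_failure blocks.

```yaml
# Hypothetical pipeline-level on_failure block for a data stream's default.yml.
on_failure:
  # Classify the document as a pipeline failure.
  - set:
      field: event.kind
      value: pipeline_error
  # Record which processor failed. The {{#...}} section renders the
  # "with tag" clause only when the failing processor has a tag set.
  - append:
      field: error.message
      value: >-
        Processor '{{{ _ingest.on_failure_processor_type }}}'
        {{#_ingest.on_failure_processor_tag}}with tag '{{{ _ingest.on_failure_processor_tag }}}'
        {{/_ingest.on_failure_processor_tag}}failed with message '{{{ _ingest.on_failure_message }}}'
```

Because the value is a folded scalar, the three lines join into a single sentence, and an untagged processor yields "Processor '...' failed with message '...'" with no dangling tag clause.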
@andrewkroh andrewkroh force-pushed the azure/fix/pipeline_error-on-failure branch from 46a32cc to 1dda33b January 30, 2026 16:58
@andrewkroh andrewkroh added the Team:Obs-InfraObs, Team:Security-Service Integrations, and Team:obs-ds-hosted-services labels Jan 30, 2026
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

Contributor

@zmoog zmoog left a comment

LGTM, thanks @andrewkroh.

@elastic-vault-github-plugin-prod

elastic-vault-github-plugin-prod bot commented Jan 30, 2026

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

Contributor

@efd6 efd6 left a comment

Do we want to set the "preserve_original_event" tag here too?

All Azure ingest pipeline on_failure handlers now append the
preserve_original_event tag. This ensures the original event data
is retained when pipeline processing fails, which is essential for
troubleshooting and debugging issues.

When processor-level on_failure handlers catch errors, they append to
error.message but processing continues without triggering the
pipeline-level on_failure handler. This left documents without
event.kind set to pipeline_error and without the preserve_original_event
tag.

Added conditional processors before the pipeline-level on_failure handlers
in five data streams. When error.message exists, they set event.kind to
pipeline_error and append the preserve_original_event tag. This ensures
consistent error handling whether failures occur at the processor or
pipeline level.

Affected data streams: auditlogs, firewall_logs, graphactivitylogs,
platformlogs, springcloudlogs.
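
A minimal sketch of the pattern described in this commit, assuming a hypothetical json processor with its own on_failure handler. The processor, its field names, and its message text are illustrative only. The two trailing processors are the conditional checks, placed last in the processors list, just before the pipeline-level on_failure key.

```yaml
processors:
  # Hypothetical processor with a processor-level handler. On failure it
  # appends to error.message and the pipeline keeps running, so the
  # pipeline-level on_failure block is never triggered.
  - json:
      field: message
      target_field: json
      on_failure:
        - append:
            field: error.message
            value: 'hypothetical: failed to parse message as JSON'

  # ... remaining processors ...

  # Conditional checks: if any earlier handler recorded an error,
  # classify the event and keep the original data.
  - set:
      field: event.kind
      value: pipeline_error
      if: ctx.error?.message != null
  - append:
      field: tags
      value: preserve_original_event
      allow_duplicates: false
      if: ctx.error?.message != null
```

The null-safe `ctx.error?.message` check avoids a Painless error on documents that have no error object at all.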
@andrewkroh andrewkroh changed the title from "fix(azure): Add event.kind pipeline_error to on_failure handlers" to "fix(azure): Improve error handling in ingest pipeline on_failure" Feb 2, 2026
Contributor

@github-actions github-actions bot left a comment

The handler in firewall_logs/default.yml previously included a remove processor to clean up intermediate fields (json, _conf, message). This PR removes that cleanup, which would leave temporary fields in error documents. The application_gateway/default.yml file correctly preserves its equivalent remove processor - the same should be done for firewall_logs.


The on_failure handler previously included a remove processor to clean up
intermediate fields (json, _conf, message). This processor was
accidentally removed during the error handling improvements. Restore it
to ensure these temporary fields are cleaned up even when pipeline
failures occur.
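
As a rough sketch of the restored cleanup, assuming it sits at the start of the firewall_logs on_failure block (the position and the ignore_missing setting are assumptions; only the three field names come from the text above):

```yaml
on_failure:
  # Drop intermediate fields so they do not leak into error documents.
  - remove:
      field:
        - json
        - _conf
        - message
      ignore_missing: true
  # ... event.kind, tags, and error.message processors follow ...
```

Setting `ignore_missing: true` keeps the remove processor from failing when the pipeline errors out before some of these fields have been created.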
@andrewkroh
Member Author

> Do we want to set the "preserve_original_event" tag here too?

👍

I added a section about this to https://github.com/elastic/integrations/wiki/Fleet-Package-Code-Review-Comments#pipeline-on_failure-handler-must-preserve-original-event

Contributor

@github-actions github-actions bot left a comment

The PR correctly improves error handling across Azure ingest pipelines by:

  1. Adding event.kind: pipeline_error to all on_failure handlers - this properly classifies failed events
  2. Adding preserve_original_event tag - ensures original data is retained for troubleshooting
  3. Adding conditional checks before pipeline-level on_failure handlers in files with processor-level error handlers (auditlogs, firewall_logs, graphactivitylogs, platformlogs, springcloudlogs) - these correctly handle the case where processor-level on_failure sets error.message but processing continues
  4. Improving error.message format with mustache conditionals to cleanly handle optional processor tags

The implementation is consistent across all 27 modified files, and the conditional checks are appropriately placed only where needed (after processor-level handlers that set error.message, before the pipeline-level on_failure).


@andrewkroh andrewkroh enabled auto-merge (squash) February 2, 2026 02:59
@andrewkroh
Member Author

Requires review from @elastic/obs-infraobs-integrations.

Contributor

@muthu-mps muthu-mps left a comment

LGTM!

Bumped PR version from 1.35.2 to 1.36.1 to follow the 1.36.0 release
that landed on main.
@elasticmachine

💚 Build Succeeded

@andrewkroh andrewkroh merged commit 7aa1ee0 into elastic:main Mar 10, 2026
9 checks passed
@elastic-vault-github-plugin-prod

Package azure - 1.36.1 containing this change is available at https://epr.elastic.co/package/azure/1.36.1/

Labels

  • bugfix (Pull request that fixes a bug issue)
  • Integration:azure (Azure Logs)
  • Team:obs-ds-hosted-services (Observability Hosted Services team [elastic/obs-ds-hosted-services])
  • Team:Obs-InfraObs (Observability Infrastructure Monitoring team [elastic/obs-infraobs-integrations])
  • Team:Security-Service Integrations (Security Service Integrations team [elastic/security-service-integrations])

5 participants