feat: incident.io Notifier#4372

Merged
gotjosh merged 6 commits into prometheus:main from incident-io:rorymalcolm/incidentio-notifier
Sep 22, 2025

Conversation

@rorymalcolm
Contributor

@rorymalcolm rorymalcolm commented Apr 27, 2025

Closes #4367

  • Adds the technical implementation, and tests, for the incident.io notifier

  • Configured through the following config:

```yaml
receivers:
  - name: 'incidentio-notifications'
    incidentio_configs:
      - url: '$alert_source_url'
        alert_source_token: '$alert_source_token'
```
@rorymalcolm rorymalcolm changed the title from "- Adds the technical implementation, and tests, for the incident.io n…" to "incident.io Notifier" on Apr 27, 2025
@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch 2 times, most recently from 98ed08f to 416cac1 on April 27, 2025 at 22:09
@rorymalcolm rorymalcolm changed the title from "incident.io Notifier" to "feat: incident.io Notifier" on Apr 27, 2025
@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch 3 times, most recently from 118a310 to 4bc00a7 on April 28, 2025 at 09:49
```go
}

// IncidentioConfig configures notifications via incident.io.
type IncidentioConfig struct {
```

What do you think about adding a Metadata field to this struct? Something similar to the Details field in the OpsgenieConfig struct.
This would enable users to define additional data (see the incident.io API definition for reference).

@grobinson-grafana
Collaborator

grobinson-grafana commented May 3, 2025

Looks good, a couple comments. It also needs docs here and here 👍

@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch 3 times, most recently from 17d6a6f to 255befe on May 7, 2025 at 07:06
@rorymalcolm
Contributor Author

Looks good, a couple comments. It also needs docs here and here 👍

All done - I think? 🙏

@grobinson-grafana
Collaborator

You have some lint failures in notify/incidentio/incidentio_test.go

@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch 2 times, most recently from 8f16fcd to bbc39ce on May 9, 2025 at 16:14
@grobinson-grafana
Collaborator

You still have some failing tests I'm afraid https://github.com/prometheus/alertmanager/actions/runs/14933150149/job/41982648209?pr=4372

--- FAIL: TestIncidentIORetry (0.01s)
    incidentio_test.go:48: 
        	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:48
        	Error:      	Received unexpected error:
        	            	one of alert_source_token or alert_source_token_file must be configured
        	Test:       	TestIncidentIORetry
--- FAIL: TestIncidentIORedactedURL (0.01s)
    incidentio_test.go:69: 
        	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:69
        	Error:      	Received unexpected error:
        	            	one of alert_source_token or alert_source_token_file must be configured
        	Test:       	TestIncidentIORedactedURL
--- FAIL: TestIncidentIOURLFromFile (0.01s)
    incidentio_test.go:91: 
        	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:91
        	Error:      	Received unexpected error:
        	            	one of alert_source_token or alert_source_token_file must be configured
        	Test:       	TestIncidentIOURLFromFile
--- FAIL: TestIncidentIONotify (0.01s)
    incidentio_test.go:143: 
        	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:143
        	Error:      	Received unexpected error:
        	            	one of alert_source_token or alert_source_token_file must be configured
        	Test:       	TestIncidentIONotify
--- FAIL: TestIncidentIORetryScenarios (0.03s)
    --- FAIL: TestIncidentIORetryScenarios/success_response (0.01s)
        incidentio_test.go:223: 
            	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:223
            	Error:      	Received unexpected error:
            	            	one of alert_source_token or alert_source_token_file must be configured
            	Test:       	TestIncidentIORetryScenarios/success_response
    --- FAIL: TestIncidentIORetryScenarios/rate_limit_response (0.01s)
        incidentio_test.go:223: 
            	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:223
            	Error:      	Received unexpected error:
            	            	one of alert_source_token or alert_source_token_file must be configured
            	Test:       	TestIncidentIORetryScenarios/rate_limit_response
    --- FAIL: TestIncidentIORetryScenarios/server_error_response (0.01s)
        incidentio_test.go:223: 
            	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:223
            	Error:      	Received unexpected error:
            	            	one of alert_source_token or alert_source_token_file must be configured
            	Test:       	TestIncidentIORetryScenarios/server_error_response
    --- FAIL: TestIncidentIORetryScenarios/client_error_response (0.01s)
        incidentio_test.go:223: 
            	Error Trace:	/__w/alertmanager/alertmanager/notify/incidentio/incidentio_test.go:223
            	Error:      	Received unexpected error:
            	            	one of alert_source_token or alert_source_token_file must be configured
            	Test:       	TestIncidentIORetryScenarios/client_error_response

@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch from bbc39ce to 07604e3 on May 11, 2025 at 21:03
@rorymalcolm
Contributor Author

You still have some failing tests I'm afraid https://github.com/prometheus/alertmanager/actions/runs/14933150149/job/41982648209?pr=4372


Ah - apols; fixed!

@rorymalcolm
Contributor Author

Hey @grobinson-grafana - just confirming this is good to go?

@grobinson-grafana
Collaborator

I haven't had time to review the latest changes, but I hope to be able to do that over the weekend. If the docs are looking good and the code and tests still look good, I'll merge it then.

@axdotl

axdotl commented Jun 10, 2025

Hello @grobinson-grafana - any chance to get this merged soon? Looking forward to making use of it.
Thanks in advance!

@ankitdh7

Any chance to get this merged soon? It will be a great improvement over the webhook.

@grobinson-grafana
Collaborator

Just a heads up: we have an issue with Circle CI being broken for Alertmanager, and we are figuring out with other Prometheus contributors how to fix it.

@rorymalcolm rorymalcolm force-pushed the rorymalcolm/incidentio-notifier branch from 7f60f95 to 34eac10 on July 14, 2025 at 08:50
@grobinson-grafana
Collaborator

Hi! Our CI is fixed, so will do a final review and then merge this. Thanks for your patience, I know this has been open since April.

Collaborator

@grobinson-grafana grobinson-grafana left a comment

Fixed the lint error, but I think there is a bug where it doesn't inherit the global HTTP config from the configuration file, and some of the truncation doesn't work.

```go
}

// encodeMessage encodes the message and truncates content if it exceeds maxPayloadSize.
func (n *Notifier) encodeMessage(msg *Message) (bytes.Buffer, error) {
```
Collaborator

I think the step-based reduction is clever, but I am a bit worried about the number of times we marshal the message when reducing its final size. JSON marshaling in Go is expensive in terms of both CPU and allocations. I appreciate that in the worst case it's effectively O(log N) marshals, as you halve the number of alerts each time, but I'm also thinking about this from the perspective of multi-tenant projects like Cortex and Mimir that might have 1000s or 10,000s of tenants "sharing" an Alertmanager. It's hard to say this is fine without any data to prove it.

Have you tried keeping a running estimate of the size instead, to avoid re-marshaling? By contrast, adding ints is one of the fastest operations a CPU can do, so it would make a massive difference in overhead. If you want to account for the byte overhead of the JSON syntax (braces, commas, etc.), you can estimate it as 5% of the final payload size.

Contributor

I had a chat with Rory and decided to just go for a much simpler implementation here: if the full message is too large, drop all but the first alert and try that.

I've written up the full reasoning in the commit, but I think this is both more efficient (generally 1 encoding, occasionally 2), and a better experience for the humans involved: no strange mangled alerts coming through, just some additional info dropped.
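The simplified approach described above can be sketched roughly as follows. The Message shape, function signature, and error messages are assumptions for illustration, not the merged code:

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// maxPayloadSize is the 512kB limit discussed in the thread.
const maxPayloadSize = 512 * 1024

// Message is a simplified stand-in for the notifier's payload type.
type Message struct {
	Alerts []map[string]string `json:"alerts"`
}

// encodeMessage tries at most two encodings: the full message, and if that is
// too large, the same message with all but the first alert dropped.
func encodeMessage(msg *Message) (bytes.Buffer, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(msg); err != nil {
		return buf, err
	}
	if buf.Len() <= maxPayloadSize {
		return buf, nil
	}
	if len(msg.Alerts) <= 1 {
		return buf, errors.New("message exceeds limit and cannot be truncated further")
	}
	// Too large: keep only the first alert and re-encode (second and final attempt).
	truncated := &Message{Alerts: msg.Alerts[:1]}
	buf.Reset()
	if err := json.NewEncoder(&buf).Encode(truncated); err != nil {
		return buf, err
	}
	if buf.Len() > maxPayloadSize {
		return buf, errors.New("message too large even after truncation")
	}
	return buf, nil
}

func main() {
	msg := &Message{Alerts: []map[string]string{{"alertname": "HighCPU"}}}
	buf, err := encodeMessage(msg)
	fmt.Println(err == nil, buf.Len() > 0)
}
```

In the common case this costs a single JSON encoding; only oversized groups pay for a second one.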

- Adds the technical implementation, and tests, for the incident.io notifier

- Configured through the following config:

```yaml
receivers:
  - name: 'incidentio-notifications'
    incidentio_configs:
      - url: '$alert_source_url'
        alert_source_token: '$alert_source_token'
```

- Add documentation for the incidentio_config

Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
@isaacseymour isaacseymour force-pushed the rorymalcolm/incidentio-notifier branch from d0be1dc to 0973fd3 on August 6, 2025 at 16:48
@isaacseymour
Contributor

Hey @grobinson-grafana - I believe this should be ready to go at this point!

Rather than carefully trying to shrink the size of the payload to fit
the 512kB limit, just try two encodings:
1. The full original message; and
2. Remove all but the first alert in the group and send that

For most configurations, each message creates a single alert in
incident.io, with the details of the alerts which made up that group
contained within being useful but not essential.

This means the code is a lot simpler, and does at most 2 JSON encodings
for each message (although in general it should be pretty rare to need
more than 1!)

Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
@isaacseymour isaacseymour force-pushed the rorymalcolm/incidentio-notifier branch from 0973fd3 to 0d86b5b on August 7, 2025 at 08:20
@isaacseymour
Contributor

👋 Hey @grobinson-grafana, would love to get this merged if you have time to review! 🙏

@paulyehorov

Hey folks,
Just stumbled upon this - we really, really need this. We recently migrated to incident.io on-call and are missing some very useful features due to being limited to webhook usage.

@gotjosh
Member

gotjosh commented Sep 22, 2025

I'm not sure I follow what's going on - but I'm not able to add commits to your pull request directly, despite having "Maintainers are allowed to edit this pull request" enabled.

To speed things up, I've made the changes myself (as some of them are a bit pedantic, e.g. ending comments with dots (.)).

I have attached the diff of both commits I've made - I'd appreciate it if you could incorporate them and let me know your thoughts.

linter-fix.diff.txt
patch.diff.txt

Most of the changes are purely cosmetic but do let me know if you have any questions on them - if you can please incorporate them I'll be happy to merge this.

Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
@isaacseymour isaacseymour force-pushed the rorymalcolm/incidentio-notifier branch from 583bc7e to c22bd50 on September 22, 2025 at 15:27
@isaacseymour
Contributor

That is indeed mysterious: it seems like GitHub allows you to merge main in, but not push changes.

I've applied your patches in c22bd50. Thank you so much @gotjosh!

Signed-off-by: gotjosh <josue.abreu@gmail.com>
@gotjosh
Member

gotjosh commented Sep 22, 2025

That is indeed mysterious: it seems like GitHub allows you to merge main in, but not push changes.

It does allow me to do anything through the UI. I'm not sure why it's not allowing me via other means.

Anyway, I've double-checked @grobinson-grafana's comments and they all seem to be addressed now, so this LGTM.

Member

@gotjosh gotjosh left a comment

LGTM

@gotjosh
Member

gotjosh commented Sep 22, 2025

As for the release, I'm aiming to set some time aside next week to review a few PRs, get them in and then do the release with @grobinson-grafana.

@gotjosh gotjosh merged commit 28d3f86 into prometheus:main Sep 22, 2025
11 checks passed
@gotjosh
Member

gotjosh commented Sep 22, 2025

Thank you very much for your contribution @isaacseymour and @rorymalcolm ❤️

Development

Successfully merging this pull request may close these issues: Support an incident.io notifier