Conversation
```go
// IncidentioConfig configures notifications via incident.io.
type IncidentioConfig struct {
```
What do you think about adding a field Metadata to this struct? Something similar to the Details field in the OpsgenieConfig struct.
This would enable users to define additional data (see incident.io API definition for reference).
You have some lint failures.
You still have some failing tests I'm afraid: https://github.com/prometheus/alertmanager/actions/runs/14933150149/job/41982648209?pr=4372
Ah - apols; fixed!

Hey @grobinson-grafana - just confirming this is good to go?
I haven't had time to review the latest changes, but I hope to be able to do that over the weekend. If the docs are looking good and code + tests still look good, I hope to merge this over the weekend.

Hello @grobinson-grafana - any chance to get this merged soon? Looking forward to making use of it.

Any chance to get this merged soon? It will be a great improvement over webhook.
Just a heads up: we have an issue with CircleCI being broken for Alertmanager, and we are figuring out with other Prometheus contributors how to fix it.
Hi! Our CI is fixed, so will do a final review and then merge this. Thanks for your patience, I know this has been open since April.
grobinson-grafana
left a comment
Fixed the lint error, but I think there is a bug where it doesn't inherit the global HTTP config from the configuration file, and some of the truncation doesn't work.
```go
// encodeMessage encodes the message and truncates content if it exceeds maxPayloadSize.
func (n *Notifier) encodeMessage(msg *Message) (bytes.Buffer, error) {
```
I think the step-based reduction is clever, but I am a bit worried about the number of times we marshal the message when reducing its final size. JSON marshalling in Go is expensive from the perspective of both CPU and allocations. I appreciate that in the worst case it's effectively O(log N) marshals, as you halve the number of alerts each time, but I'm also thinking about this from the perspective of multi-tenant projects like Cortex and Mimir that might have 1000s or 10,000s of tenants "sharing" an Alertmanager. It's hard to say this is fine without any data to prove it.
Have you tried keeping a running estimate of the size instead, to avoid re-marshalling? In contrast, adding ints is one of the fastest operations you can do on a CPU, so it will make a massive difference in overhead. If you want to take into account the byte overhead of the JSON syntax (braces, commas, etc.), you can estimate this as 5% of the final payload size.
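A rough sketch of that running-estimate idea, under assumed types (`Alert`, `estimateSize`, and the exact overhead factor are illustrative, not the notifier's actual code): marshal each alert once up front, then reducing the alert set only needs integer subtraction rather than another `json.Marshal` of the whole message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Alert is a simplified stand-in for the notifier's alert payload.
type Alert struct {
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// estimateSize marshals each alert exactly once and sums the byte counts,
// then adds ~5% to approximate JSON syntax overhead (braces, commas, and
// top-level field names), per the suggestion above.
func estimateSize(alerts []Alert) int {
	total := 0
	for _, a := range alerts {
		b, err := json.Marshal(a)
		if err != nil {
			continue // skip unmarshalable alerts in this sketch
		}
		total += len(b)
	}
	return total + total/20 // +5% overhead estimate
}

func main() {
	alerts := []Alert{
		{Labels: map[string]string{"alertname": "HighCPU"}},
		{Labels: map[string]string{"alertname": "DiskFull"}},
	}
	fmt.Println("estimated payload bytes:", estimateSize(alerts))
}
```

With the per-alert sizes cached, dropping half the alerts on each reduction step costs only subtraction, never a re-marshal of the remaining message.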
I had a chat with Rory and decided to just go for a much simpler implementation here: if the full message is too large, drop all but the first alert and try that.
I've written up the full reasoning in the commit, but I think this is both more efficient (generally 1 encoding, occasionally 2), and a better experience for the humans involved: no strange mangled alerts coming through, just some additional info dropped.
- Adds the technical implementation, and tests, for the incident.io notifier
- Configured through the following config:
```yaml
receivers:
- name: 'incidentio-notifications'
incidentio_configs:
- url: '$alert_source_url'
alert_source_token: '$alert_source_token'
```
- Adds documentation for the incidentio_config
Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
Hey @grobinson-grafana - I believe this should be ready to go at this point!
Rather than carefully trying to shrink the size of the payload to fit the 512kB limit, just try two encodings:

1. The full original message; and
2. Remove all but the first alert in the group and send that

For most configurations, each message creates a single alert in incident.io, with the details of the alerts which made up that group contained within being useful but not essential. This means the code is a lot simpler, and does at most 2 JSON encodings for each message (although in general it should be pretty rare to need more than 1!)

Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
👋 Hey @grobinson-grafana, would love to get this merged if you have time to review! 🙏
Hey folks,
I'm not sure I follow what's going on - but I'm not able to add commits to your pull request directly despite having

To speed things up, I've made the changes myself (as some of them are a bit pedantic, e.g. ending comments with dots (.)). I have attached the diff of both commits I've made - I'd appreciate it if you could incorporate them and let me know your thoughts.

linter-fix.diff.txt

Most of the changes are purely cosmetic, but do let me know if you have any questions on them - if you can please incorporate them, I'll be happy to merge this.
Signed-off-by: Isaac Seymour <i.seymour@oxon.org>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
It does allow me to do anything through the UI; I'm not sure why it's not allowing me via other means. Anyways, I've double-checked @grobinson-grafana's comments and they all seem to be addressed now, so this LGTM.
As for the release, I'm aiming to set some time aside next week to review a few PRs, get them in, and then do the release with @grobinson-grafana.

Thank you very much for your contribution @isaacseymour and @rorymalcolm ❤️
Closes #4367
Adds the technical implementation, and tests, for the incident.io notifier
Configured through the following config: