Add observability alerts for chargeback integration#16205
Add observability alerts for chargeback integration#16205JohannesMahne merged 6 commits intojohannes-chargeback-wipfrom
Conversation
- Add two ES|QL alerting rules: detect new chargeback groups and detect deployments missing usage data - Add comprehensive documentation for alert setup and configuration - Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support - Add transform startup and monitoring instructions to README
There was a problem hiding this comment.
Pull request overview
This PR adds observability alerting capabilities to the chargeback integration and updates documentation to clarify setup requirements and procedures. The changes enhance monitoring of chargeback data quality and provide comprehensive guidance for users setting up the integration.
Key changes:
- Addition of two ES|QL alert rule templates for detecting new chargeback groups and missing usage data
- Comprehensive documentation updates including transform startup instructions, deployment group configuration, and alert setup procedures
- Elasticsearch version requirement update from 8.18.0+ to 9.2.0+ for smart lookup join support
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| packages/chargeback/manifest.yml | Version bump to 0.2.7 |
| packages/chargeback/elasticsearch/transform/**/transform.yml | Pipeline version updates to match new package version |
| packages/chargeback/docs/README.md | Major documentation expansion with alert templates, transform instructions, and clarified requirements |
| packages/chargeback/changelog.yml | Changelog entry for version 0.2.7 with corrected historical entry types |
| packages/chargeback/_dev/build/docs/README.md | Generated documentation file mirroring README.md changes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sholzhauer-es
left a comment
There was a problem hiding this comment.
LGTM, good improvements to the readme!
| All configuration values can be updated via the configuration lookup index: | ||
|
|
||
| ``` | ||
| POST chargeback_conf_lookup/_update/config |
There was a problem hiding this comment.
This only works if there is a single record. If you have multiple the doc id used needs to change.
There was a problem hiding this comment.
Good point. will update it.
💔 Build Failed
Failed CI StepsHistory
|
#17167) * WIP: early chargeback code for review * Working config integration - 0.0.2 * Version 0.0.3: working from Stack monitoring data * Fixed query for one visualisation * Update instructions * Working with the correct alias * Changes to transforms * Bug fix: Fix sorting on visualisation. * Update setup instructions * 0.1.0: Adding ECU value (normalised cost). * Bug: Aligned fields returned to field names used in visualisation * Fixing bug: aligning esql returned field names with field names used in lens * move to packages * not starting transforms on integration installation * Update version number * Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palate. Add ECU rate to the dashboard. * Update sequence and comments on pre-setup to promote ES integration * Consistent naming of datastream. Add LIMIT 5000 to ESQL top query to cater for large organisations. * Add correct code owner * Delete wrong test files * Updated the directory structure to remove superfluous directory * Rem reference to sample logs and logos * Switch off dynamic mappings for the results of the transforms - we know exactly what the output be. * Removed agent folders in data stream, as it is not used. * Updated the readme file to refer to integration, rather than module. Also added explanation about the rest of the config. * Re-add image * Formatting * NOT WORKING: settings index.mode: lookup is not supported * Fixing the control error in the dashboard by adding a data view. * Updated to push back usage data transform to ES Integration * Updated readme * Update transfrom version numbers * Swap the use of deployment_id or deployment name to a concatenation of both, to make it easier to identify the deployment in the dashboard. * Make use of the new elastic-package version, which will create the lookup index automatically when installing the package. * Update version number * Updated pre-setup, and version number * Adding casting to double for division to avoid null instead of very small numbers. elastic/elasticsearch-chargeback#50 * Update version * Allowing for setting converion rate per time window * fixing pipeline versions * adding pipeline stuff * correcting version * [Chargeback] Dashboard control and Dataview (#16153) * dashboard control * updating version + DV * SKU based chargeback (#16182) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * Chargeback Integration: Extract deployment group from Billing tags (#16185) * Add deployment_group extraction from ESS Billing tags - Extract chargeback_group tag value to deployment_group field in billing pipeline - Add deployment_group to billing_cluster_cost transform group_by - Add deployment_group field definition - Fix transforms to use elasticsearch.cluster.name without .keyword - Update changelog for v0.2.4 * Add deployment_group extraction using runtime mappings from ESS Billing tags * Update dashboard with deployment_group filter and definitions * Bump version to 0.2.5 for deployment_group feature after merging SKU/cost_type changes * known [bug](elastic/elasticsearch-chargeback#60) from 0.2.4 * Fixing bug introduced in 0.2.4 (#16192) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * merge * dashboard * setting version * Add observability alerts for chargeback integration (#16205) * Add observability alerts for chargeback integration - Add two ES|QL alerting rules: detect new chargeback groups and detect deployments missing usage data - Add comprehensive documentation for alert setup and configuration - Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support - Add transform startup and monitoring instructions to README * Update changelog with PR #16205 * Remove wrong information * Update chargeback README documentation * Improve observability alert action message formatting * Clarify configuration update vs add new period documentation * Fix mustache template escaping in alert actions documentation * [Chargeback] Alerting rule (#16229) * Add alerting rule templates and enable auto-start for all transforms - Add 3 Kibana alerting rule templates: - Transform health monitoring for all Chargeback transforms - New chargeback group detection - Deployment with missing usage data detection - Enable auto-start for all transforms (start: true in manifests) - Update transform pipeline references to version 0.2.8 - Add performance warning about initial transform execution - Update README with alerting documentation - Bump package version to 0.2.8 * Fix: Revert transform frequencies back to 60m * Update PR number in changelog * Chargeback css (#16326) * WIP: early chargeback code for review * Working config integration - 0.0.2 * Version 0.0.3: working from Stack monitoring data * Fixed query for one visualisation * Update instructions * Working with the correct alias * Changes to transforms * Bug fix: Fix sorting on visualisation. * Update setup instructions * 0.1.0: Adding ECU value (normalised cost). * Bug: Aligned fields returned to field names used in visualisation * Fixing bug: aligning esql returned field names with field names used in lens * move to packages * not starting transforms on integration installation * Update version number * Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palate. Add ECU rate to the dashboard. * Update sequence and comments on pre-setup to promote ES integration * Consistent naming of datastream. Add LIMIT 5000 to ESQL top query to cater for large organisations. * Add correct code owner * Delete wrong test files * Updated the directory structure to remove superfluous directory * Rem reference to sample logs and logos * Switch off dynamic mappings for the results of the transforms - we know exactly what the output be. * Removed agent folders in data stream, as it is not used. * Updated the readme file to refer to integration, rather than module. Also added explanation about the rest of the config. * Re-add image * Formatting * NOT WORKING: settings index.mode: lookup is not supported * Fixing the control error in the dashboard by adding a data view. * Updated to push back usage data transform to ES Integration * Updated readme * Update transfrom version numbers * Swap the use of deployment_id or deployment name to a concatenation of both, to make it easier to identify the deployment in the dashboard. * Make use of the new elastic-package version, which will create the lookup index automatically when installing the package. * Update version number * Updated pre-setup, and version number * Adding casting to double for division to avoid null instead of very small numbers. elastic/elasticsearch-chargeback#50 * Update version * Allowing for setting converion rate per time window * fixing pipeline versions * adding pipeline stuff * correcting version * [Chargeback] Dashboard control and Dataview (#16153) * dashboard control * updating version + DV * SKU based chargeback (#16182) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * Chargeback Integration: Extract deployment group from Billing tags (#16185) * Add deployment_group extraction from ESS Billing tags - Extract chargeback_group tag value to deployment_group field in billing pipeline - Add deployment_group to billing_cluster_cost transform group_by - Add deployment_group field definition - Fix transforms to use elasticsearch.cluster.name without .keyword - Update changelog for v0.2.4 * Add deployment_group extraction using runtime mappings from ESS Billing tags * Update dashboard with deployment_group filter and definitions * Bump version to 0.2.5 for deployment_group feature after merging SKU/cost_type changes * known [bug](elastic/elasticsearch-chargeback#60) from 0.2.4 * wip on css * adding "local" cluster for ones without remote clusters --------- Co-authored-by: Johannes Mahne <johannes.mahne@elastic.co> * Fix: Correct PR number for CSS changes in changelog (0.2.9) * Fix visualizations not loading by adding TO_DOUBLE type conversion - Add TO_DOUBLE() wrapper to all division operations in ESQL queries - Prevents integer division from returning zero - Fixes tier_sum_indexing_time / deployment_sum_indexing_time - Fixes tier_sum_query_time / deployment_sum_query_time - Fixes tier_sum_data_set_store_size / deployment_sum_data_set_store_size - Fixes tier_sum_store_size / deployment_sum_store_size - Bump version to 0.2.10 Fixes: elastic/elasticsearch-chargeback#69 * Add automated chargeback_conf_lookup index creation via transform - Add bootstrap transform that creates chargeback_conf_lookup index with default config - Uses cluster_deployment_contribution_lookup as source - Sets default values: ECU rate 0.85 EUR, weights 20/20/40 - Date range: 2010-01-01 (ES birthdate) to 2046-12-31 - Eliminates need for manual index creation * Bump transform pipeline versions to 0.2.10 - Update pipeline references from 0.2.9 to 0.2.10 - Revert billing_cluster_cost sync field to event.ingested (was temporarily @timestamp) * Removed now redundant pre-setup. * Update PR number in changelog, and recover billing cost sync time field. * Add On-Premises Billing integration v0.1.0 Initial release of the On-Premises Billing integration that generates ESS Billing-compatible metrics for on-premises, ECE, and ECK deployments, enabling the Chargeback integration to work in non-cloud environments. Features: - Fixed daily ECU model per deployment - Bootstrap transform for auto-discovery of deployments - Configurable deployment_tags for Chargeback grouping - Manual enrich policy setup for deployment configuration - Output to metrics-ess_billing.billing-onprem index Requires manual post-installation setup (enrich policy, pipeline, transform start). * Add mERU (milli-ERU) cost model to on-prem billing integration Introduces milli-ERU as the internal cost unit (1 ERU = 1000 mERU) to handle fractional ERU allocations cleanly. This allows deployments ranging from 0.25 ERU to 10+ ERU without awkward decimals or large rate multipliers. Changes: - Add organization-level config (license cost, total ERUs, ERU-to-RAM ratio) - Support per-deployment config via direct ERU input OR RAM-based calculation - Update config_bootstrap transform for mERU defaults - Comprehensive documentation on gathering config data and calculations WIP: Preliminary implementation - needs testing * On-Prem Billing: 0.2.0 ERU config, deployment_tags, docs and transform alignment * Update packages/chargeback/elasticsearch/transform/chargeback_conf_lookup/fields/base-fields.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update packages/chargeback/elasticsearch/transform/cluster_deployment_contribution/transform.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update packages/chargeback/elasticsearch/transform/cluster_tier_contribution/transform.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update packages/chargeback/elasticsearch/transform/cluster_tier_and_ds_contribution/transform.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update packages/chargeback/elasticsearch/transform/cluster_datastream_contribution/transform.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Stijn Holzhauer <stijn.holzhauer@elastic.co> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…, deployment groups (#17170) * WIP: early chargeback code for review * Working config integration - 0.0.2 * Version 0.0.3: working from Stack monitoring data * Fixed query for one visualisation * Update instructions * Working with the correct alias * Changes to transforms * Bug fix: Fix sorting on visualisation. * Update setup instructions * 0.1.0: Adding ECU value (normalised cost). * Bug: Aligned fields returned to field names used in visualisation * Fixing bug: aligning esql returned field names with field names used in lens * move to packages * not starting transforms on integration installation * Update version number * Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palate. Add ECU rate to the dashboard. * Update sequence and comments on pre-setup to promote ES integration * Consistent naming of datastream. Add LIMIT 5000 to ESQL top query to cater for large organisations. * Add correct code owner * Delete wrong test files * Updated the directory structure to remove superfluous directory * Rem reference to sample logs and logos * Switch off dynamic mappings for the results of the transforms - we know exactly what the output be. * Removed agent folders in data stream, as it is not used. * Updated the readme file to refer to integration, rather than module. Also added explanation about the rest of the config. * Re-add image * Formatting * NOT WORKING: settings index.mode: lookup is not supported * Fixing the control error in the dashboard by adding a data view. * Updated to push back usage data transform to ES Integration * Updated readme * Update transfrom version numbers * Swap the use of deployment_id or deployment name to a concatenation of both, to make it easier to identify the deployment in the dashboard. * Make use of the new elastic-package version, which will create the lookup index automatically when installing the package. * Update version number * Updated pre-setup, and version number * Adding casting to double for division to avoid null instead of very small numbers. elastic/elasticsearch-chargeback#50 * Update version * Allowing for setting converion rate per time window * fixing pipeline versions * adding pipeline stuff * correcting version * [Chargeback] Dashboard control and Dataview (#16153) * dashboard control * updating version + DV * SKU based chargeback (#16182) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * Chargeback Integration: Extract deployment group from Billing tags (#16185) * Add deployment_group extraction from ESS Billing tags - Extract chargeback_group tag value to deployment_group field in billing pipeline - Add deployment_group to billing_cluster_cost transform group_by - Add deployment_group field definition - Fix transforms to use elasticsearch.cluster.name without .keyword - Update changelog for v0.2.4 * Add deployment_group extraction using runtime mappings from ESS Billing tags * Update dashboard with deployment_group filter and definitions * Bump version to 0.2.5 for deployment_group feature after merging SKU/cost_type changes * known [bug](elastic/elasticsearch-chargeback#60) from 0.2.4 * Fixing bug introduced in 0.2.4 (#16192) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * merge * dashboard * setting version * Add observability alerts for chargeback integration (#16205) * Add observability alerts for chargeback integration - Add two ES|QL alerting rules: detect new chargeback groups and detect deployments missing usage data - Add comprehensive documentation for alert setup and configuration - Update Elasticsearch version requirement to 9.2.0+ for smart lookup join support - Add transform startup and monitoring instructions to README * Update changelog with PR #16205 * Remove wrong information * Update chargeback README documentation * Improve observability alert action message formatting * Clarify configuration update vs add new period documentation * Fix mustache template escaping in alert actions documentation * [Chargeback] Alerting rule (#16229) * Add alerting rule templates and enable auto-start for all transforms - Add 3 Kibana alerting rule templates: - Transform health monitoring for all Chargeback transforms - New chargeback group detection - Deployment with missing usage data detection - Enable auto-start for all transforms (start: true in manifests) - Update transform pipeline references to version 0.2.8 - Add performance warning about initial transform execution - Update README with alerting documentation - Bump package version to 0.2.8 * Fix: Revert transform frequencies back to 60m * Update PR number in changelog * Chargeback css (#16326) * WIP: early chargeback code for review * Working config integration - 0.0.2 * Version 0.0.3: working from Stack monitoring data * Fixed query for one visualisation * Update instructions * Working with the correct alias * Changes to transforms * Bug fix: Fix sorting on visualisation. * Update setup instructions * 0.1.0: Adding ECU value (normalised cost). * Bug: Aligned fields returned to field names used in visualisation * Fixing bug: aligning esql returned field names with field names used in lens * move to packages * not starting transforms on integration installation * Update version number * Made sure the colour palette is predictable by using the eui_amsterdam_color_blind palate. Add ECU rate to the dashboard. * Update sequence and comments on pre-setup to promote ES integration * Consistent naming of datastream. Add LIMIT 5000 to ESQL top query to cater for large organisations. * Add correct code owner * Delete wrong test files * Updated the directory structure to remove superfluous directory * Rem reference to sample logs and logos * Switch off dynamic mappings for the results of the transforms - we know exactly what the output be. * Removed agent folders in data stream, as it is not used. * Updated the readme file to refer to integration, rather than module. Also added explanation about the rest of the config. * Re-add image * Formatting * NOT WORKING: settings index.mode: lookup is not supported * Fixing the control error in the dashboard by adding a data view. * Updated to push back usage data transform to ES Integration * Updated readme * Update transfrom version numbers * Swap the use of deployment_id or deployment name to a concatenation of both, to make it easier to identify the deployment in the dashboard. * Make use of the new elastic-package version, which will create the lookup index automatically when installing the package. * Update version number * Updated pre-setup, and version number * Adding casting to double for division to avoid null instead of very small numbers. elastic/elasticsearch-chargeback#50 * Update version * Allowing for setting converion rate per time window * fixing pipeline versions * adding pipeline stuff * correcting version * [Chargeback] Dashboard control and Dataview (#16153) * dashboard control * updating version + DV * SKU based chargeback (#16182) * adding sku and cost_type to billing data for node granularity * working on sku with pipeline to do parsing * downplaying version * transform * Chargeback Integration: Extract deployment group from Billing tags (#16185) * Add deployment_group extraction from ESS Billing tags - Extract chargeback_group tag value to deployment_group field in billing pipeline - Add deployment_group to billing_cluster_cost transform group_by - Add deployment_group field definition - Fix transforms to use elasticsearch.cluster.name without .keyword - Update changelog for v0.2.4 * Add deployment_group extraction using runtime mappings from ESS Billing tags * Update dashboard with deployment_group filter and definitions * Bump version to 0.2.5 for deployment_group feature after merging SKU/cost_type changes * known [bug](elastic/elasticsearch-chargeback#60) from 0.2.4 * wip on css * adding "local" cluster for ones without remote clusters --------- Co-authored-by: Johannes Mahne <johannes.mahne@elastic.co> * Fix: Correct PR number for CSS changes in changelog (0.2.9) * [Chargeback] Fix chargeback visualizations and add automated config lookup (#16936) * Fix visualizations not loading by adding TO_DOUBLE type conversion - Add TO_DOUBLE() wrapper to all division operations in ESQL queries - Prevents integer division from returning zero - Fixes tier_sum_indexing_time / deployment_sum_indexing_time - Fixes tier_sum_query_time / deployment_sum_query_time - Fixes tier_sum_data_set_store_size / deployment_sum_data_set_store_size - Fixes tier_sum_store_size / deployment_sum_store_size - Bump version to 0.2.10 Fixes: elastic/elasticsearch-chargeback#69 * Add automated chargeback_conf_lookup index creation via transform - Add bootstrap transform that creates chargeback_conf_lookup index with default config - Uses cluster_deployment_contribution_lookup as source - Sets default values: ECU rate 0.85 EUR, weights 20/20/40 - Date range: 2010-01-01 (ES birthdate) to 2046-12-31 - Eliminates need for manual index creation * Bump transform pipeline versions to 0.2.10 - Update pipeline references from 0.2.9 to 0.2.10 - Revert billing_cluster_cost sync field to event.ingested (was temporarily @timestamp) * Removed now redundant pre-setup. * Update PR number in changelog, and recover billing cost sync time field. * Update packages/chargeback/elasticsearch/transform/chargeback_conf_lookup/fields/base-fields.yml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix chargeback_conf_lookup transform source index dependency Change source from cluster_deployment_contribution_lookup to metrics-ess_billing.billing-* to fix transform loading in fresh setups. The transform uses runtime_mappings to generate all config values, so the source index content doesn't matter - it only needs any document to trigger. Billing data is guaranteed to exist in chargeback deployments. Fixes feedback from @elastic-abhi review. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Chargeback dashboard: ECU/ERU wording in Configuration Information section (#17168) * Chargeback 0.3.0: chargeable units (ECU/ERU), bump manifest and transform pipeline/versions * Apply suggestion from @Copilot Spell correction Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Spell correction Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Spell correction Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Spell correction Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Stijn Holzhauer <stijn.holzhauer@elastic.co> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Abhi <252589011+elastic-abhi@users.noreply.github.com>
Proposed commit message
Checklist
changelog.ymlfile.Author's Checklist
How to test this PR locally
Related issues
Screenshots