Summary
Sub-issue of #255799.
Our SLOs are detecting the following error:
Can not create edge >rabbitmq/rmdata.Web.Common.Dto:PluginProtocol~rmdata.Web.Service with nonexistant source >rabbitmq/rmdata.Web.Common.Dto:PluginProtocol
Investigation
Error details
| Field | Value |
| --- | --- |
| Error type | Error |
| Error label | PageFatalReactError |
| Page | /app/apm/service-map |
| Route pattern | /service-map |
| Service | kibana-frontend v9.4.0 |
| Origin | apm plugin chunk (apm.chunk.5984.js, functions Yo, restore, new Gm, add) |
| Environment | Production (Serverless Observability, Azure northeurope) |
| Browser | Chrome 145 (Windows) |
| Feature flag | flag_apm_serviceMapUseReactFlow: false (using legacy Cytoscape-based service map) |
Root cause analysis
The APM service map page crashes when the graph library (Cytoscape.js) attempts to add an edge between two nodes, but the source node does not exist in the graph. The error message is explicit: it cannot create an edge from >rabbitmq/rmdata.Web.Common.Dto:PluginProtocol to rmdata.Web.Service because the source node >rabbitmq/rmdata.Web.Common.Dto:PluginProtocol has not been added to the graph.
This is a data consistency issue between the service map API response and the graph construction logic. The backend returns a list of nodes (services and external dependencies) and a list of edges (connections between them). When an edge references a node ID that is not present in the node list, Cytoscape.js throws an unrecoverable error.
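To make the inconsistency concrete, here is a minimal sketch of a check that reports edges whose endpoints are missing from the node list. The `NodeElement` and `EdgeElement` types are simplified stand-ins for Cytoscape element definitions, `findDanglingEdges` is a hypothetical helper rather than existing APM code, and the sample data (a grouped `>rabbitmq` node next to an edge that points at the specific queue) is an assumed shape based on the error message, not a captured API response.

```typescript
// Simplified stand-ins for Cytoscape element definitions (assumption:
// the real APM types carry more fields than these).
interface NodeElement {
  data: { id: string };
}

interface EdgeElement {
  data: { id: string; source: string; target: string };
}

interface DanglingEdge {
  edgeId: string;
  missing: Array<'source' | 'target'>;
}

// Hypothetical helper: lists every edge that references a node ID
// which does not appear in the node list.
function findDanglingEdges(nodes: NodeElement[], edges: EdgeElement[]): DanglingEdge[] {
  const nodeIds = new Set(nodes.map((node) => node.data.id));
  const dangling: DanglingEdge[] = [];

  for (const edge of edges) {
    const missing: Array<'source' | 'target'> = [];
    if (!nodeIds.has(edge.data.source)) missing.push('source');
    if (!nodeIds.has(edge.data.target)) missing.push('target');
    if (missing.length > 0) {
      dangling.push({ edgeId: edge.data.id, missing });
    }
  }
  return dangling;
}

// Assumed sample data: only the service and a grouped parent node exist,
// while the edge points at the specific queue ID from the error message.
const nodes: NodeElement[] = [
  { data: { id: 'rmdata.Web.Service' } },
  { data: { id: '>rabbitmq' } },
];
const edges: EdgeElement[] = [
  {
    data: {
      id: '>rabbitmq/rmdata.Web.Common.Dto:PluginProtocol~rmdata.Web.Service',
      source: '>rabbitmq/rmdata.Web.Common.Dto:PluginProtocol',
      target: 'rmdata.Web.Service',
    },
  },
];

const report = findDanglingEdges(nodes, edges);
console.log(report);
```

Running a check like this against the real API response for the affected project would show whether the orphaned reference already exists in the backend payload or is introduced by the frontend transformation.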
The problematic node ID contains special characters -- slashes (/), dots (.), and colons (:) -- from a RabbitMQ queue/exchange name (rmdata.Web.Common.Dto:PluginProtocol). This suggests the node ID generation or matching logic may be mishandling complex dependency names in one of three ways:
- Filtering out the node but keeping the edge: The backend or frontend filters/deduplicates external dependency nodes differently from how it generates edges, leaving orphaned edge references.
- ID mismatch due to encoding/escaping: The node ID and the edge's source reference are generated through different code paths that encode special characters differently, so they don't match.
- External dependency grouping inconsistency: External dependencies (like RabbitMQ queues) are grouped under a parent node (e.g., >rabbitmq) on the node side, but the edge references the specific sub-resource ID that was not materialized as a separate node.
The user was viewing the service map with ENVIRONMENT_ALL across a 15-day range (rangeFrom=now-15d), which increases the likelihood of surfacing complex inter-service topologies with messaging queues and unusual dependency names.
Stacktrace analysis
| Frame | File | Function | Role |
| --- | --- | --- | --- |
| 1 | apm.chunk.5984.js | Yo | APM service map code that triggers the graph error -- likely the edge validation/error throw |
| 2 | apm.chunk.5984.js | restore | Cytoscape.js restore method -- restores elements to the graph after batch construction |
| 3 | apm.chunk.5984.js | new Gm | Cytoscape.js graph model constructor -- validates edges during graph initialization |
| 4 | apm.chunk.5984.js | add | Cytoscape.js add method -- adds elements (nodes + edges) to the graph |
| 5 | apm.chunk.5984.js | <anonymous> | APM component that calls cy.add(elements) to populate the service map |
| 6-10 | kbn-ui-shared-deps-npm.dll.js | Pl → tc → Fu → ii → Pu | React render/commit cycle and effect scheduling |
All five APM frames are in the same chunk (apm.chunk.5984.js), which bundles the Cytoscape.js library together with the service map component. The call flow is: React effect (frame 5) calls cy.add() (frame 4) which constructs graph models (frame 3), restores elements (frame 2), and validates edge sources (frame 1), throwing when the source node is missing.
Suspect areas (prioritized)
| Priority | Location | Reason |
| --- | --- | --- |
| 1 | Service map element transformation (nodes/edges generation) | The code that transforms the /api/apm/service-map API response into Cytoscape elements likely produces edges whose source/target IDs don't match the generated node IDs, especially for external dependencies with complex names containing /, ., and :. |
| 2 | Service map API backend (node/edge generation) | The backend may return edge data referencing granular external dependency sub-resources (e.g., specific RabbitMQ queues) while only returning a grouped parent node (e.g., >rabbitmq), creating a mismatch. |
| 3 | External dependency node ID encoding | The > prefix convention for external dependencies combined with the / and : characters in the queue name may cause inconsistent ID generation between the node-creation and edge-creation code paths. |
Key files
x-pack/solutions/observability/plugins/apm/public/components/app/service_map/ -- service map React component and Cytoscape integration
x-pack/solutions/observability/plugins/apm/public/components/app/service_map/cytoscape_options.ts -- Cytoscape configuration and element handling
x-pack/solutions/observability/plugins/apm/server/routes/service_map/ -- backend route that returns nodes and edges for the service map
x-pack/solutions/observability/plugins/apm/common/service_map.ts -- shared types and ID generation utilities for service map elements
Suggested fixes
- Validate edges before adding to graph: Before calling cy.add(elements), filter out any edges whose source or target node ID is not present in the node set. Log a warning for discarded edges instead of crashing the page.
- Ensure node/edge ID consistency: Audit the code paths that generate node IDs and edge source/target references for external dependencies, ensuring they produce identical strings for the same dependency regardless of special characters.
- Add missing dependency nodes as fallback: If an edge references a node that doesn't exist, auto-create a placeholder node for that dependency rather than crashing. This is a defensive approach that preserves the graph topology.
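The first and third suggestions can be sketched together as a single guard that runs before cy.add(elements). This is an illustration under assumptions, not the plugin's actual code: `MapNode`, `MapEdge`, `sanitizeElements`, the `placeholder` marker, and the `strategy` parameter are all hypothetical names, and the element shapes are reduced to the `data.id`, `data.source`, and `data.target` fields.

```typescript
// Hypothetical, simplified element shapes; the real APM types differ.
interface MapNode {
  data: { id: string; placeholder?: boolean };
}

interface MapEdge {
  data: { id: string; source: string; target: string };
}

type Strategy = 'drop' | 'placeholder';

// Returns elements that are safe to add: every edge endpoint is
// guaranteed to exist in the returned node list.
function sanitizeElements(
  nodes: MapNode[],
  edges: MapEdge[],
  strategy: Strategy,
  warn: (message: string) => void = console.warn
): { nodes: MapNode[]; edges: MapEdge[] } {
  const nodeIds = new Set(nodes.map((node) => node.data.id));
  const safeNodes = [...nodes];
  const safeEdges: MapEdge[] = [];

  for (const edge of edges) {
    const missingIds = [edge.data.source, edge.data.target].filter((id) => !nodeIds.has(id));

    if (missingIds.length === 0) {
      safeEdges.push(edge);
      continue;
    }

    if (strategy === 'drop') {
      // Discard the edge and report it instead of letting the graph throw.
      warn(`Dropping edge ${edge.data.id}: missing node(s) ${missingIds.join(', ')}`);
      continue;
    }

    // 'placeholder': create a stub node for each missing endpoint once,
    // so the edge (and the topology it describes) is preserved.
    for (const id of missingIds) {
      if (!nodeIds.has(id)) {
        nodeIds.add(id);
        safeNodes.push({ data: { id, placeholder: true } });
      }
    }
    safeEdges.push(edge);
  }

  return { nodes: safeNodes, edges: safeEdges };
}

// Example input modelled on the reported error message.
const exampleNodes: MapNode[] = [{ data: { id: 'rmdata.Web.Service' } }];
const exampleEdges: MapEdge[] = [
  {
    data: {
      id: '>rabbitmq/rmdata.Web.Common.Dto:PluginProtocol~rmdata.Web.Service',
      source: '>rabbitmq/rmdata.Web.Common.Dto:PluginProtocol',
      target: 'rmdata.Web.Service',
    },
  },
];

const warnings: string[] = [];
const dropped = sanitizeElements(exampleNodes, exampleEdges, 'drop', (m) => warnings.push(m));
const patched = sanitizeElements(exampleNodes, exampleEdges, 'placeholder');
```

Dropping edges is the smaller change but hides a real dependency from the user; placeholder nodes keep the topology visible at the cost of rendering a node with no metadata. Either way, the guard only masks the symptom, so the ID consistency audit is still needed to find where the mismatch originates.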
Reproduction
- Open a serverless observability project with services that communicate via RabbitMQ (or other messaging queues with complex names containing /, ., :)
- Navigate to /app/apm/service-map
- Set environment to ENVIRONMENT_ALL and a broad time range (e.g., 15 days)
- The page crashes when Cytoscape attempts to render edges for messaging queue dependencies whose node IDs don't match
The specific URL from the error document:
/app/apm/service-map?comparisonEnabled=true&environment=ENVIRONMENT_ALL&kuery=&offset=1296000000ms&rangeFrom=now-15d&rangeTo=now&serviceGroup=
Additional context
- The flag_apm_serviceMapUseReactFlow feature flag is false, meaning this user is on the legacy Cytoscape-based service map. If the ReactFlow-based service map handles edge validation differently, this error may not occur when that flag is enabled.
- The transaction type is user-interaction, and error.custom.classes contains euiFieldNumber css-vehtd3-euiFieldNumber-compressed, suggesting the error was triggered after the user interacted with a numeric input field on the page (possibly adjusting the time range or comparison offset), which caused a service map re-render with updated data.
- The same project (rmdata_dev_serverless) and organization also appear in other SLO errors, suggesting this environment has a complex service topology that exercises edge cases in the APM UI.
- The error is tagged PageFatalReactError, meaning it crashes the entire page for the user.
- Observed on Kibana 9.4.0 (git rev d83e0ba465ad) in an Azure northeurope serverless observability project.