Labels: area/wasm, stale (stalebot believes this issue/PR has not been touched recently)
Description
Title: choose a naming scheme for per-Wasm VM metrics, such as memory_size or stage
Context:
- due to the nature of how Wasm gets integrated into a host application, there will always be multiple independent Wasm modules (or Wasm VMs in terms of envoy-wasm) running inside Envoy
- there are some metrics, such as memory size and stage, that we need to be able to track individually for each Wasm VM
- as a result, we need to choose a naming scheme to identify individual Wasm VMs inside Envoy
Proposal:
- Let's use the following scheme for per-Wasm VM metrics:

  ```
  wasm.vm.<vm_id>.<vm_hash>.placement.<thread name>.instance.<vm instance number>.[<metric>]+

  # and its tagged equivalent
  wasm_vm_instance_[<metric>]+{ envoy_wasm_vm_id="<vm_id>", envoy_wasm_vm_hash="<vm_hash>", envoy_wasm_vm_placement="<thread name>", envoy_wasm_vm_instance="<vm instance number>" }
  ```
  where
  - `<vm_id>` is a (sanitized) value of the `extensions.wasm.v3.VmConfig.vm_id` field, or `"-"` if the field is empty. This user-defined VM id gets included into the metric name for the purpose of correlation with the underlying Envoy config
  - `<vm_hash>` is a hash computed by Envoy while deduplicating VMs. This is the only unique id of a VM we have. Similarly to git, we can include into the metric name only the first N characters of the HEX encoding of the hash
  - `<thread name>` is the name of the thread the VM instance is running on, e.g. `"main_thread"`, `"worker_2"`, etc.
  - `<vm instance number>` is a sequential number to distinguish multiple instances of the same VM deployed on the same thread. There are several reasons for this to happen:
    - in the current envoy-wasm implementation, 2 VM instances get created on the `"main_thread"`
    - in the future, with resource manager in place, a VM instance might get drained and replaced due to hitting the memory limit
    - in the future, with resource manager in place, a VM instance might get drained and replaced to reclaim unused memory (the Wasm spec only allows memory to grow, not to shrink)
  - `[<metric>]+` is a metric name, such as `"memory_size_bytes"` or `"stage.deleting"`
  E.g.,

  ```
  wasm.vm.ext_authz_plugin.d54a291.placement.worker_2.instance.0.memory_size_bytes

  # and its tagged equivalent
  wasm_vm_instance_memory_size_bytes{ envoy_wasm_vm_id="ext_authz_plugin", envoy_wasm_vm_hash="d54a291", envoy_wasm_vm_placement="worker_2", envoy_wasm_vm_instance="0" }

  wasm.vm.-.16c3ab8.placement.main_thread.instance.0.stage.deleting

  # and its tagged equivalent
  wasm_vm_instance_stage{ envoy_wasm_vm_id="-", envoy_wasm_vm_hash="16c3ab8", envoy_wasm_vm_placement="main_thread", envoy_wasm_vm_instance="0", envoy_wasm_vm_stage="deleting" }
  ```
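  A minimal Python sketch of how a host could assemble these per-instance stat names. The `short_vm_hash` helper, the choice of SHA-256, and the sanitization rule are illustrative assumptions; the proposal does not specify Envoy's deduplication hash algorithm or how `vm_id` is sanitized:

  ```python
  import hashlib
  import re


  def short_vm_hash(module_bytes: bytes, length: int = 7) -> str:
      """Illustrative stand-in for the hash Envoy computes while
      deduplicating VMs; like git, keep only the first N hex chars."""
      return hashlib.sha256(module_bytes).hexdigest()[:length]


  def instance_stat_name(vm_id: str, vm_hash: str, thread: str,
                         instance: int, metric: str) -> str:
      # An empty vm_id falls back to "-" per the proposal; sanitize the
      # rest so user input cannot inject extra "." name segments.
      vm_id = re.sub(r"[^a-zA-Z0-9_]", "_", vm_id) or "-"
      return (f"wasm.vm.{vm_id}.{vm_hash}.placement.{thread}"
              f".instance.{instance}.{metric}")


  print(instance_stat_name("ext_authz_plugin", "d54a291",
                           "worker_2", 0, "memory_size_bytes"))
  # wasm.vm.ext_authz_plugin.d54a291.placement.worker_2.instance.0.memory_size_bytes
  ```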
- Metrics about all VM instances on the same thread can be grouped under:

  ```
  wasm.vm.<vm_id>.<vm_hash>.placement.<thread name>.[<metric>]+
  ```
  E.g.,

  ```
  wasm.vm.ext_authz_plugin.d54a291.placement.worker_2.instances.active

  # and its tagged equivalent
  wasm_vm_instances_active{ envoy_wasm_vm_id="ext_authz_plugin", envoy_wasm_vm_hash="d54a291", envoy_wasm_vm_placement="worker_2" }

  wasm.vm.ext_authz_plugin.d54a291.placement.worker_2.instance_restarts.memory_limit

  # and its tagged equivalent
  wasm_vm_instance_restarts{ envoy_wasm_vm_id="ext_authz_plugin", envoy_wasm_vm_hash="d54a291", envoy_wasm_vm_placement="worker_2", envoy_wasm_vm_restart_cause="memory_limit" }
  ```
- Metrics about a VM itself can be grouped under:

  ```
  wasm.vm.<vm_id>.<vm_hash>.[<metric>]+
  ```
  E.g.,

  ```
  wasm.vm.ext_authz_plugin.a9b0e58.runtime.v8

  # and its tagged equivalent
  wasm_vm_runtime{ envoy_wasm_vm_id="ext_authz_plugin", envoy_wasm_vm_hash="a9b0e58", envoy_wasm_vm_runtime="v8" }

  wasm.vm.log_aggregator_plugin.d54a291.type.singleton

  # and its tagged equivalent
  wasm_vm_type{ envoy_wasm_vm_id="log_aggregator_plugin", envoy_wasm_vm_hash="d54a291", envoy_wasm_vm_type="singleton" }
  ```
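  For the tagged equivalents, the same name segments simply become labels. A hedged sketch of rendering such a tagged metric (the Prometheus-style exposition format shown is purely illustrative; Envoy's actual tag extraction is not specified by this proposal):

  ```python
  def tagged_metric(metric: str, **tags: str) -> str:
      """Render a metric with its tags in a Prometheus-style text
      format (illustrative only); tags are sorted for determinism."""
      label_str = ", ".join(f'{k}="{v}"' for k, v in sorted(tags.items()))
      return f"{metric}{{ {label_str} }}"


  print(tagged_metric("wasm_vm_runtime",
                      envoy_wasm_vm_id="ext_authz_plugin",
                      envoy_wasm_vm_hash="a9b0e58",
                      envoy_wasm_vm_runtime="v8"))
  # wasm_vm_runtime{ envoy_wasm_vm_hash="a9b0e58", envoy_wasm_vm_id="ext_authz_plugin", envoy_wasm_vm_runtime="v8" }
  ```

  Note that the flat names form a strict prefix hierarchy (`wasm.vm.<vm_id>.<vm_hash>` is a prefix of both the placement-level and instance-level names), so a stats sink can select everything about one VM with a single prefix match.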