dynamic_modules: add support for user-defined metrics#40661
dynamic_modules: add support for user-defined metrics#40661mathetake merged 8 commits intoenvoyproxy:mainfrom
Conversation
Similar to how it is done in wasm:
- DynamicModuleHttpFilterConfig now holds a Stats::Scope. All metrics
declared in this scope are placed in a custom stat namespace so as to
not collide with envoy-defined metrics. Using
Stats::CustomStatNamespaces, this namespace prefix is stripped from all
stat names before they are emitted, e.g. to Prometheus format.
- Add new ABI "method"s on
envoy_dynamic_module_http_filter_envoy_config_ptr that allow metrics to
be declared in the scope of the filter config. These methods return
opaque pointers to Stats::{Counter,Gauge,Histogram} that allow said
declared metrics to be manipulated. Right now, only the basics of each
metric type are exposed in the ABI. For example, you can't read metric
values, or reset metrics to 0, or...
- Add corresponding wrappers to the Rust SDK.
- Add tests, both against the C++ ABI implementation, and against the
Rust SDK.
Signed-off-by: Ian Kerins <git@isk.haus>
| /// Define a new counter scoped to this filter config with the given name. | ||
| fn new_counter(&self, name: &str) -> EnvoyCounter; |
There was a problem hiding this comment.
I see most other APIs in here returning dyn traits instead of structs directly, but I'm not sure what the point of that is here, if any. So I returned a struct. Let me know if I should use a trait instead.
There was a problem hiding this comment.
Also, should these APIs support tags?
| Stats::StatNameManagedStorage storage(name_view, filter_config->stats_scope_.symbolTable()); | ||
| Stats::StatName stat_name = storage.statName(); | ||
| Stats::Counter* c = &Stats::Utility::counterFromStatNames( | ||
| filter_config->stats_scope_, {filter_config->custom_stat_namespace_, stat_name}); |
There was a problem hiding this comment.
I'd be lying if I said I totally understand how StatName/storage works. I cargo culted this approach from the wasm filter. If anyone knows more than I do about stats, please pay close attention to this and let me know if it makes sense.
| // TODO should we allow callers to specify this? | ||
| Stats::Histogram::Unit::Unspecified); |
There was a problem hiding this comment.
Thoughts? The wasm filter does the same thing as this code, but I don't know why it has to be that way. I personally only export my stats to prometheus, where I'm pretty sure this unit does nothing? Not sure.
There was a problem hiding this comment.
yeah let's leave it as-is now until someone complains
| /// An HTTP filter configuration that implements | ||
| /// [`envoy_proxy_dynamic_modules_rust_sdk::HttpFilterConfig`] to test the stats | ||
| /// related callbacks. | ||
| struct StatsCallbacksFilterConfig { | ||
| streams_total: EnvoyCounter, | ||
| concurrent_streams: EnvoyGauge, | ||
| // It's full of 1s. | ||
| ones: EnvoyHistogram, | ||
| } |
There was a problem hiding this comment.
If this gets merged, we should probably add this kind of thing to the examples in https://github.com/envoyproxy/dynamic-modules-examples.
|
Hmm, clearly I broke something fundamental in https://github.com/envoyproxy/envoy/blob/b0c33ac4a58e65e22fcac0426fe12800dd4a8564/test/extensions/dynamic_modules/dynamic_modules_test.cc but I don't understand it. If anyone can offer pointers... |
Still one (big?) problem in dynamic_modules_test.cc... Signed-off-by: Ian Kerins <git@isk.haus>
e0e252c to
760d93d
Compare
|
Well, it passed in CI this time, which is strange. Still fails locally (macOS)... |
There was a problem hiding this comment.
Thanks for taking a shot at this! High level comments:
- Generally, if you have to return the pointers like the current impl, then you must implement the memory safety ABI (alloc, dealloc, or reference counter inc/dec) as well as the mapping of these safety into Rust's constructs. Currently, say,
EnvoyCountercan be escaped to any life time and that could cause invalid pointer crash very easily. Imagine you store them in a global map. I see there's no lifetime safety is implemented at all vs the existing ABIs are all protected by either Rust constructs or ABI level. - Create a new scope on the config initialization and attach have it in DynamicModuleHttpFilterConfig instead of using the global scope. Otherwise, stats are leaking before/after filter config updates.
- Since we have a filter-config scope like I suggest ^^, the stats's lifetime must be bound to the filter-config object. I would suggest to make "define stats" functions on
EnvoyHttpFilterConfigtrait return "IDs". These IDs are mapped to the real stats pointers stored inDynamicModuleHttpPerRouteFilterConfigobject on the Envoy side as a vector. Since the config initialization happens always before the HTTP stream handling, we can use the vector to store the stats pointers and use the index as an ID that will be returned to the modules via EnvoyHttpFilterConfig trait's method.
tldr is my suggested TODO for you is:
- Delete
envoy_dynamic_module_type_metric_counter_envoy_ptr, etc from abi.h - Rename envoy_dynamic_module_callback_metric_counter_new, etc to envoy_dynamic_module_callback_http_filter_config_define_metric_counter, etc (to be explicit about this being filter config scoped callback). and make them return the said "ID".
- Make the "updater" ABIs to accept
envoy_dynamic_module_type_http_filter_envoy_ptras well as an ID. In the implementation side, you can get the filter_config and from there you can simply check the said vector without locking or map lookup safely.
note: we can expand this to have "define" per request scope but that would end up requiring us to have locking. So, I would like to not have it for now unless there's an elegant solution which I don't have atm.
note2: tags handling, we might want to investigate how other filters are dealing with. We might want to study some ideas how Istio's stats filters are dealing with it. They are adding "per-request" attributes as tags. https://github.com/istio/proxy/blob/131a090f997edab4e4dc59f05ef265d895f7885e/source/extensions/filters/http/istio_stats/istio_stats.cc#L376
|
Okay, thanks, that makes sense to me. I will try it out. |
|
Well, one thing I’m unsure of: when using IDs for stats, how can that result in a different outcome if IDs are stored past their valid lifetimes in the module? An ID could point past the end of the vector, or the vector could have changed such that an ID points at a different metric than it originally did. IDs can be made invalid just like pointers can, right? So what is the benefit? Whether we use IDs or pointers, we have to get the rust lifetimes correct either way, don’t we? |
|
I'm sorry for these tedious questions, but my experience with both C++ and Rust is minimal 🥲. |
|
ok i was thinking we don't need to think about that case. ID is passed together with envoy_dynamic_module_type_http_filter_envoy_ptr which holds the pointer to the underlying Envoy's FIlterConfig object that eventually holds vector. It should be no issue as long as modules are not intentionally passing invalid IDs together with a wrong envoy_dynamic_module_type_http_filter_config_envoy_ptr? The basic assumption of dynamic modules is modules must be trusted. so we don't need to care about these "misbehaving" modules. In fact, we have plenty of codes that can crash envoy if the modules are malicious since modules have the same privilege as any other native filters. |
|
That is totally different than the fact that not exposing raw pointer, like the current impl, completely forbids the Rust code from writing the crash since the "updates" ABI binds the lifetime of the underlying stats objects if you require the ABI callbacks to pass "filter pointers" vs not as in the current impl. |
also this doesn't happen as I described in the review comment ^^. Let me repeat: "define ABIs" are only available during filter config initialization that happens only once before any stream handling. That is why I suggested the use of vector and it's immutable by the time when "updates" are available, which doesn't require lock or whatever. |
|
On second thought; using the raw pointer address as an ID would be fine as long as the "updates" callbacks are bounded within the lifecycles of filter_config as per the suggestion. That way, we won't need to maintain vector at all. e |
- Create a child stats scope bound to DynamicModuleHttpFilterConfig. - Rename ABI functions as suggested. - Freeze stat creation at the end of config initialization. Signed-off-by: Ian Kerins <git@isk.haus>
|
@mathetake I think I've addressed the easy parts of your feedback, but I am struggling to prevent using I have tried annotating many things with lifetimes, trying to make these structs have the same lifetime as EnvoyHttpFilterConfig, but it requires annotating tons of structs, traits, and functions (it'd be a big breaking change to the SDK), and I still can't make it compile in the end. Do you think it's possible to make it work this way, or should I be trying to do something else? |
|
@isker @mathetake Prototyped something like this to keep the Rust references bounded by the lifetime of the filter config and filter: wtzhang23@02f32ee Let me know if this helps :) |
Thanks @wtzhang23, and the idea looks great but I would rather against exposing the entire EnvoyHttpFilterConfig to per-request struct since that would mean exposing the mutable non-thread safe operations (define_* metrics ABI) on to worker threads vs only exposing the safe operations (increments). |
|
My suggestion is simple:
This will prevent at least the invalid pointers crash on the misuse due to the lifetime issue. The only remaining problem is when user passing IDs to the filter not created from the filter config that didn't create the metric/id. I think that would be a negligible as that means the "insane" ID passing happening via global shared memory. Even it that happens, Envoy won't crash by boundary check vs previously in the initial iteration of this PR, one could simply store the metrics pointer in the global cache (which I believe one could do that without thinking too much). wdyt? I think there won't be a perfect solution at the point. so maybe the question is how much we could reduce the opportunity for module devs to write wrong code. I am 100% open to other ideas. /wait |
|
Not sure if it is possible in standard Envoy filters but if it is, I can see a use case where someone would want to update a metric on the Like a metric |
|
Yeah we can add them two in addition to the callbacks above |
|
Won't the ABI as written in wtzhang23@02f32ee still allow null pointers to be dereferenced when you use an index into the vector that's out of bounds? |
|
Oh, maybe not with @mathetake's patch. Okay. I will try to cherry pick this all together. |
|
Another alternative API where you still get a reference to counter, gauge, and histogram when declaring metrics: wtzhang23@b07c1f0. Also includes fixes to the existing test cases (though not full coverage), fixes to the Rust So, addressing
But not
I don't have a strong opinion on using ids vs first exchanging the ids with a pointer then using the pointer directly. I think both will end up safe. Tradeoffs off the top of my head: Pros with using pointer:
Pros with defining callbacks that take the "context" and id as arguments:
There's also the third approach of mixing the two approaches and having define_* return ids while also defining callbacks to exchange them for metric pointers. This is probably better than having define_* return metric pointers directly since we can then get rid of the |
|
thanks, but I would prefer the simpler C interface as well as less methods/struct at Rust layer compared to my suggestion plus your revised version still didn't address my comment about per-request object being able to define metrics concurrently. Let's go with my suggestion for now, and shall we see what improvement can be made based on your effort to unblock the feature at least in a follow up if necessary? We can break ABI at any point as we don't guarantee the compatibility across versions. |
|
For tags specifically, we should study how Istio does for their stats extensions (in native C++) as per my old comment, but we can do that in a follow up. Let's get the basic stats functionality working first
|
|
@isker I think you should be able to just cherry-pick this commit: Edit: wtzhang23@f6a26fa without the whitespace changes |
Signed-off-by: William Zhang <wtzhang23@gmail.com>
Signed-off-by: Ian Kerins <git@isk.haus>
|
I have cherry picked @wtzhang23's commit and added a fix to test compilation, which was failing. That fix reveals that, with this design:
It is not possible to increment counters in Let me know what you think, though. I am not sure it would be a common desire to manipulate counters in |
|
Nevermind what I said, I did not realize there were conflicts with #40864, which I think will fix this problem. |
This reverts commit 1152c32. Signed-off-by: Ian Kerins <git@isk.haus>
|
/wait |
|
yeah sorry will do the review by the end of tomorrow 🙏 |
| } | ||
|
|
||
| Stats::Counter& getCounterById(size_t id) const { | ||
| ASSERT(id < counters_.size()); |
There was a problem hiding this comment.
note: ASSERT is only available on the debug build so this doing not so much as users(module owners) are likely to work with release builds, but yeah harmless.
mathetake
left a comment
There was a problem hiding this comment.
Thank you for your contribution and multiple iterations here! I think this could be a good start for us and at least unlocks basic use cases. We can iterate more on this to improve e.g. tag/label creation as discussed!
|
Thanks to you both for struggling through this with me 🫡. |
Commit Message: dynamic_modules: add support for user-defined metrics
Additional Description: Similar to how it is done in wasm:
- DynamicModuleHttpFilterConfig now holds a Stats::Scope. All metrics
declared in this scope are placed in a custom stat namespace so as to
not collide with envoy-defined metrics. Using
Stats::CustomStatNamespaces, this namespace prefix is stripped from all
stat names before they are emitted, e.g. to Prometheus format.
- Add new ABI "method"s on
envoy_dynamic_module_http_filter_envoy_config_ptr that allow metrics to
be declared in the scope of the filter config. These methods return
opaque pointers to Stats::{Counter,Gauge,Histogram} that allow said
declared metrics to be manipulated. Right now, only the basics of each
metric type are exposed in the ABI. For example, you can't read metric
values, or reset metrics to 0, or...
- Add corresponding wrappers to the Rust SDK.
- Add tests, both against the C++ ABI implementation, and against the
Rust SDK.
Risk Level: Low, in that this is a new feature in the experimental
dynamic modules system.
Testing: Tests for both the C++ implementation in isolation, and in the
rust "integration" tests.
Docs Changes: There are not any docs covering specific dynamic modules
features (yet?), so I've added nothing beyond doc comments.
Release Notes:
Platform Specific Features: N/A
---------
Signed-off-by: Ian Kerins <git@isk.haus>
Signed-off-by: Leonardo da Mata <ldamata@spotify.com>
Commit Message: dynamic_modules: add support for user-defined metrics
Additional Description: Similar to how it is done in wasm:
- DynamicModuleHttpFilterConfig now holds a Stats::Scope. All metrics
declared in this scope are placed in a custom stat namespace so as to
not collide with envoy-defined metrics. Using
Stats::CustomStatNamespaces, this namespace prefix is stripped from all
stat names before they are emitted, e.g. to Prometheus format.
- Add new ABI "method"s on
envoy_dynamic_module_http_filter_envoy_config_ptr that allow metrics to
be declared in the scope of the filter config. These methods return
opaque pointers to Stats::{Counter,Gauge,Histogram} that allow said
declared metrics to be manipulated. Right now, only the basics of each
metric type are exposed in the ABI. For example, you can't read metric
values, or reset metrics to 0, or...
- Add corresponding wrappers to the Rust SDK.
- Add tests, both against the C++ ABI implementation, and against the
Rust SDK.
Risk Level: Low, in that this is a new feature in the experimental
dynamic modules system.
Testing: Tests for both the C++ implementation in isolation, and in the
rust "integration" tests.
Docs Changes: There are not any docs covering specific dynamic modules
features (yet?), so I've added nothing beyond doc comments.
Release Notes:
Platform Specific Features: N/A
---------
Signed-off-by: Ian Kerins <git@isk.haus>
Commit Message: dynamic_modules: add support for user-defined metrics
Additional Description: Similar to how it is done in wasm:
- DynamicModuleHttpFilterConfig now holds a Stats::Scope. All metrics
declared in this scope are placed in a custom stat namespace so as to
not collide with envoy-defined metrics. Using
Stats::CustomStatNamespaces, this namespace prefix is stripped from all
stat names before they are emitted, e.g. to Prometheus format.
- Add new ABI "method"s on
envoy_dynamic_module_http_filter_envoy_config_ptr that allow metrics to
be declared in the scope of the filter config. These methods return
opaque pointers to Stats::{Counter,Gauge,Histogram} that allow said
declared metrics to be manipulated. Right now, only the basics of each
metric type are exposed in the ABI. For example, you can't read metric
values, or reset metrics to 0, or...
- Add corresponding wrappers to the Rust SDK.
- Add tests, both against the C++ ABI implementation, and against the
Rust SDK.
Risk Level: Low, in that this is a new feature in the experimental
dynamic modules system.
Testing: Tests for both the C++ implementation in isolation, and in the
rust "integration" tests.
Docs Changes: There are not any docs covering specific dynamic modules
features (yet?), so I've added nothing beyond doc comments.
Release Notes:
Platform Specific Features: N/A
---------
Signed-off-by: Ian Kerins <git@isk.haus>
Signed-off-by: Misha Badov <mbadov@google.com>
Commit Message: dynamic_modules: add support for user-defined metrics
Additional Description: Similar to how it is done in wasm:
Stats::CustomStatNamespaces, this namespace prefix is stripped from all stat names before they are emitted, e.g. to Prometheus format.
Risk Level: Low, in that this is a new feature in the experimental dynamic modules system.
Testing: Tests for both the C++ implementation in isolation, and in the rust "integration" tests.
Docs Changes: There are not any docs covering specific dynamic modules features (yet?), so I've added nothing beyond doc comments.
Release Notes:
Platform Specific Features: N/A
Ref: #38392.
cc @mathetake