Skip to content
This repository was archived by the owner on Dec 16, 2020. It is now read-only.

implement expression evaluation in Wasm#345

Merged
jplevyak merged 20 commits intoenvoyproxy:masterfrom
kyessenov:expr_api2
Jan 24, 2020
Merged

implement expression evaluation in Wasm#345
jplevyak merged 20 commits intoenvoyproxy:masterfrom
kyessenov:expr_api2

Conversation

@kyessenov
Copy link
Copy Markdown
Contributor

@kyessenov kyessenov commented Dec 10, 2019

Depends on #343
Fixes #176

  • use int64 for passing time and duration
  • support null message as an evaluation result
  • add create/delete expression API in the root context
  • add eval expression API in all contexts
  • add tests for two sample expressions:
    1. "server is " + request.headers["server"]
    2. delegate protobuf creation to host:
envoy.api.v2.core.GrpcService{
  envoy_grpc: envoy.api.v2.core.GrpcService.EnvoyGrpc {
    cluster_name: "test"
  }
}

/assign @jplevyak
/assign @PiotrSikora
/cc @mandarjog

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov kyessenov requested a review from lizan as a code owner December 12, 2019 21:26
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
if (!buf.has_value()) {
return false;
}
if (buf.value()->size() == 0) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code can be de-duplicated btw evaluateMessage and getMessage?
and similar for evaluate and getvalue for string?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sprawled over two headers. We probably need to organize the header better.

// lazily create a builder
if (!builder_) {
google::api::expr::runtime::InterpreterOptions options;
builder_ = google::api::expr::runtime::CreateCelExpressionBuilder(options);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare google::api::expr using "using" to shorten the names?

auto cel_expression_status =
builder_->CreateExpression(&parse_status.ValueOrDie().expr(), &source_info);
if (!cel_expression_status.ok()) {
ENVOY_LOG(info, "wasm expr compile {}: {}", log_prefix(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be an error?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an error in ABI since result is BadArgument. The error message could be copied to the client though, not sure if the client can do anything meaningful with it.

// Evaluator state
Filters::Common::Expr::BuilderPtr builder_{};
uint32_t next_expr_token_ = 0;
std::map<uint32_t, Filters::Common::Expr::ExpressionPtr> expr_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are using absl stuff, we can use absl::flat_hash_map here?
It is said to be more perform-ant then std::map..

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here is very opinionated, so I'm trying to conform to the existing style. Other places use map.

@jplevyak
Copy link
Copy Markdown
Contributor

Why do we need this? This seems like a huge amount of complexity and a big dependency without any gain. Let's defer this until we have a compelling use case.

/wait

@kyessenov
Copy link
Copy Markdown
Contributor Author

@jplevyak we need to provide configurability for Istio stats similar to how Mixer allows changing metrics dimensions via a YAML file. We have two major options:

  1. Configuration is code, e.g. the YAML author now needs to recompile stats extensions with some code changes. I don't think this is viable in the short term, we have pretty rudimentary Wasm author experience, far from a fully automated compilation service in the cluster.
  2. Configuration is data, e.g. the overrides are some text expressions. We can possibly embed a parser into the extensions and move the evaluator into Wasm. These has two problems:
  • high overhead of data transfer (an expression needs the entire context to be transferred)
  • high overhead of double interpretation (it's not clear what the performance of interpreter-in-an-interpreter is going to be like)

The options proposed here is 2) but with host assistance for evaluation. It's leveraging an existing expression runtime for performance reasons.

@kyessenov
Copy link
Copy Markdown
Contributor Author

kyessenov commented Dec 13, 2019

By the way, CEL is intentionally a minimalistic language since the security of its runtime is a priority. Its bounded computational and expressive power are its primary advantage, so it is well-suited for the limited configurability that we need. The compilation service has a massive challenge to ensure security of the generated Wasm binaries.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@PiotrSikora
Copy link
Copy Markdown
Contributor

I chatted with @jplevyak about it, and it's not clear to us what's the benefit of offloading this to the host environment (and making it part of the ABI) instead of simply bundling it in the Wasm module? It sounds that CEL runtime is pretty well suited for being embedded.

@kyessenov
Copy link
Copy Markdown
Contributor Author

kyessenov commented Dec 18, 2019

@PiotrSikora I already mentioned that, but there are several advantages:

  1. Less data transfer. We can write comprehension expression that would allow us save on passing the attribute context. E.g. check for existence in a map by some predicate.
  2. Less complexity in modules. We wouldn't need to provide multiple implementations of CEL to have a consistent user experience in the config. We wouldn't need to compile CEL CPP (it depends on ANTLR, absl, protobuf).
  3. Parsing/evaluation is security sensitive. The host parser/evaluator can be hardened, we can't be sure every module is written with security in mind.

@kyessenov
Copy link
Copy Markdown
Contributor Author

I can imagine a system that allows cross-module communication, and we can delegate parts of ABI to other modules. E.g. make one module invoke another module. Current ABI doesn't support that, but we can evolve ABI chunks out of host functions down the line.

@mandarjog
Copy link
Copy Markdown
Contributor

@PiotrSikora this is I the extending proposal, but making it part of the abi means that every language can use the same bindings.

@jplevyak
Copy link
Copy Markdown
Contributor

@mandarjog @kyessenov

First, WASM very close to native performance, so there is no "interpreter on interpreter" overhead.

Second, CEL is very very heavyweight and expensive to interpret. Compared to the cost of moving data from the host to the VM, CEL will be much more expensive.

If you believe otherwise I suggest that you create a proper benchmark and we can see.

Third, a custom solution is going to be much faster, simpler and easier to maintain. I am willing to design and build a such a system so that we can compare it by benchmark to the CEL system.

Loser buys the beer :)

@kyessenov
Copy link
Copy Markdown
Contributor Author

@jplevyak I am not arguing purely on performance front. There's been minimal work done to optimize CEL performance (no JITting, no bytecode). That's not the point.

Do you imagine we can have an end-to-end pipeline that allows an operator convert a YAML file to a WASM module in the near future without ever touching a compiler?

Any custom language has a huge cost of documenting and maintaining it. Look at what happened to CEXL fork in Istio, and its frozen state. We need a toolchain, a formal grammar, and a committed user (e.g. Google IAM) to make the experience for the operator tolerable.

@kyessenov
Copy link
Copy Markdown
Contributor Author

kyessenov commented Dec 18, 2019

Btw, if you believe WASM offers better performance, we can simply leverage that in CEL by compiling CEL to WASM byte code. I have an experimental compiler, but it's written in go, so can't be embedded into Envoy. It would be possibly to rewrite that to C++ to produce JITted WASM byte code. I don't think we can add any more complicated compiler into Envoy without bloating it.

@jplevyak
Copy link
Copy Markdown
Contributor

CEL is a full on DSL. The purpose of having WASM as a performant extension language is so that we don't need DSLs: we can do the work in the WASM code proper. This is more flexible, easier to upgrade, etc. The ABI surface is and should be minimal. CEL adds a huge ABI surface and massive dependency and the net result is going to be slower. For the use case you are mentioning: specifying metric dimensions I would imagine some few dozen lines of C++ in the WASM module for a non-CEL implementation. In fact, I would imaging that the non-CEL version will be overall smaller and simpler than a CEL implementation, not to mention much faster and more efficient.

@kyessenov
Copy link
Copy Markdown
Contributor Author

John, I don't think we can avoid the use of DSL. I can't imagine a Kubernetes operator writing C++ to change some metric label and then also dealing with the emscipten toolchain. Of course, it's going to be faster to have C++ right now (not using Wasm is also faster). The default happy path should probably use a specialized version without DSL.

Now if we agree we need a DSL, I'm going to argue against making up some one-off DSL once again.

@jplevyak
Copy link
Copy Markdown
Contributor

We don't need a DSL. We can simply interpret the dimension specifiers as data in WASM.

@kyessenov
Copy link
Copy Markdown
Contributor Author

Having a list of specifiers is too restrictive, IMHO. How do I say use HTTP status code or gRPC status code?

@PiotrSikora
Copy link
Copy Markdown
Contributor

PiotrSikora commented Dec 18, 2019

  1. Less data transfer. We can write comprehension expression that would allow us save on passing the attribute context. E.g. check for existence in a map by some predicate.
  2. Less complexity in modules. We wouldn't need to provide multiple implementations of CEL to have a consistent user experience in the config. We wouldn't need to compile CEL CPP (it depends on ANTLR, absl, protobuf).
  3. Parsing/evaluation is security sensitive. The host parser/evaluator can be hardened, we can't be sure every module is written with security in mind.

Neither of those are "good enough" reasons to move this computation (whether it's CEL or @jplevyak's custom solution) to the host environment, instead of having it done inside the VM.

The main motivation behind Wasm filters was the ability to fix/upgrade them on the fly via xDS, without the need to fix/upgrade the Envoy binary. If we keep pushing stuff to Envoy, then we're sacrificing this ability for a small performance gain (like @jplevyak mentioned, Wasm is pretty close to native performance and it's only going to get better over time), which doesn't seem like a good trade-off.

@kyessenov
Copy link
Copy Markdown
Contributor Author

@PiotrSikora do you have a proposal to package shared features outside of the host to be shared between VMs? I have no issue delegating evaluation to another module, but I don't see how to invoke a VM from a VM. How do we register extern functions in the ABI?

FWIW this can be left as an optional extension to ABI. Host has no obligation to implement it, but it surely benefits Istio.

@PiotrSikora
Copy link
Copy Markdown
Contributor

@kyessenov we've discussed this cross-module communications in the past, but it's not something on any immediate roadmap, at least not that I'm aware of. In any case, I'm not sure if cross-module dependencies are desirable, especially in this particular case.

@kyessenov
Copy link
Copy Markdown
Contributor Author

I got curious about performance. I wrote two Wasm versions: one using CEL taking a literal string and calling eval, and another one by hand-writing C++ for the eval function. TL;DR performance difference is negligible between the two:

// Collected with bazel run -c opt --define=wasm=v8 test/extensions/wasm:wasm_speed_test
// Raw evaluate size(plugin_name)
// BM_WasmSimpleCallSpeedTest/V8SpeedTest          483 ns          483 ns      1434823
  std::string property = "plugin_name";
  const char* value_ptr = nullptr;
  size_t value_size = 0;
  auto result = proxy_get_property(property.data(), property.size(), &value_ptr, &value_size);
  if (WasmResult::Ok != result) {
    logError("bad result for getProperty");
  }
  if (value_size != 107) {
    logError("bad size");
  }
  ::free((void*)value_ptr);
// CEL evaluate size(plugin_name)
// BM_WasmSimpleCallSpeedTest/V8SpeedTest          576 ns          576 ns      1219428
  if (token == -1) {
    std::string expr = "size(plugin_name)";
    if (WasmResult::Ok != proxy_expr_create(expr.data(), expr.size(), &token)) {
      logError("bad expression");
    }
  }
  const char* value_ptr = nullptr;
  size_t value_size = 0;
  if (WasmResult::Ok != proxy_expr_eval(token, &value_ptr, &value_size)) {
    logError("bad evaluation");
  }
  if (*reinterpret_cast<const int64_t*>(value_ptr) != 107) {
    logError("bad result");
  }
  ::free((void*)value_ptr);

// Raw evaluate a protobuf marshal for grpc_service
// BM_WasmSimpleCallSpeedTest/V8SpeedTest         1212 ns         1212 ns       548693
 std::string property = "plugin_name";
  const char* value_ptr = nullptr;
  size_t value_size = 0;
  auto result = proxy_get_property(property.data(), property.size(), &value_ptr, &value_size);
  if (WasmResult::Ok != result) {
    logError("bad result for getProperty");
  }
  if (value_size != 107) {
    logError("bad size");
  }
  GrpcService grpc_service;
  grpc_service.mutable_envoy_grpc()->set_cluster_name(std::string(value_ptr, value_size));
  std::string grpc_service_string;
  grpc_service.SerializeToString(&grpc_service_string);
  ::free((void*)value_ptr);

// CEL evaluate a protobuf marshal for grpc service
// BM_WasmSimpleCallSpeedTest/V8SpeedTest         1198 ns         1197 ns       596187
  if (token == -1) {
    std::string expr = "envoy.api.v2.core.GrpcService{ envoy_grpc: envoy.api.v2.core.GrpcService.EnvoyGrpc { cluster_name: plugin_name }}";
    if (WasmResult::Ok != proxy_expr_create(expr.data(), expr.size(), &token)) {
      logError("bad expression");
    }
  }

  const char* value_ptr = nullptr;
  size_t value_size = 0;
  if (WasmResult::Ok != proxy_expr_eval(token, &value_ptr, &value_size)) {
    logError("bad evaluation");
  }
  ::free((void*)value_ptr);


auto token = next_expr_token_++;
for (;;) {
if (!expr_.count(token)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

token is monotonically increasing. Does the map have a different life time than next_expr_token ?
when will this condition occur?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Map is hosted in the root context I think. It's a common pattern in this file to have this token loop.

return WasmResult::NotFound;
}
Protobuf::Arena arena;
auto eval_status = it->second.compiled_expr_->Evaluate(*this, &arena);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs an Activation no? is Context a BaseActivation ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it implements the interface.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>

// Create an expression using a foreign function call.
inline WasmResult exprCreate(StringView expr, uint32_t* token) {
std::string function = "expr_create";
Copy link
Copy Markdown
Contributor

@mandarjog mandarjog Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming: Is this just expr_create or cel_expr_create?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with either. compressor is not using zip in its name, so I thought I'd do the same.

size_t out_size = 0;
auto result = proxy_call_foreign_function(function.data(), function.size(), expr.data(),
expr.size(), &out, &out_size);
if (result != WasmResult::Ok) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to free "out" or wrap it in a WasmDataPtr

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

}

template <typename T>
inline bool evaluate(uint32_t token, T* out) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluateExpression(). "evaluate" is too generic.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

// Evaluate an expression using an expression token.
inline Optional<WasmDataPtr> exprEval(uint32_t token) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move these to an separate header file. Anything built on top of a foreign function should be in its own file, perhaps in a separate directory e.g. contrib.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

using PluginSharedPtr = std::shared_ptr<Plugin>;

// Opaque context object.
class ContextObject {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this StoredObject and we can reuse it for storing things in the Wasm object as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@jplevyak jplevyak merged commit 5a9fbd5 into envoyproxy:master Jan 24, 2020
@PiotrSikora
Copy link
Copy Markdown
Contributor

Erm, this broke the ASan build for real now...

@PiotrSikora
Copy link
Copy Markdown
Contributor

Unbreaking ASan in #382.

bianpengyuan pushed a commit to bianpengyuan/envoy-wasm that referenced this pull request Feb 8, 2020
* expr_api

Signed-off-by: Kuat Yessenov <kuat@google.com>
bianpengyuan pushed a commit to bianpengyuan/envoy-wasm that referenced this pull request Feb 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WASM: Add CEL exec API

5 participants