Skip to content

router: implement new regex_rewrite route action#10050

Merged
lizan merged 16 commits intoenvoyproxy:masterfrom
jphx:regex-rewrite
Feb 26, 2020
Merged

router: implement new regex_rewrite route action#10050
lizan merged 16 commits intoenvoyproxy:masterfrom
jphx:regex-rewrite

Conversation

@jphx
Copy link
Copy Markdown
Contributor

@jphx jphx commented Feb 13, 2020

Description:

Support path rewriting using regular expressions and optionally capture groups. This PR is like #8462, but using the safe regular expression support.

Risk Level: Medium, since it slightly modifies the existing prefix_rewrite code.

Testing: Unit tests are added to test/common/router/config_impl_test.cc, runnable with bazel test //test/common/router:config_impl_test

Docs Changes: Any doc that references prefix_rewrite has been changed to reference regex_rewrite as well, if appropriate.

Release Notes: A bullet is added to docs/root/intro/version_history.rst mentioning the new support.

Notes:

I chose to make regex_rewrite an alternative to prefix_rewrite, with prefix_rewrite taking precedence an error being recognized if both are specified. It would also be possible to support both route actions at the same time, applying, say, prefix_rewrite first, and then regex_rewrite. I didn't implement it this way, though.

@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #10050 was opened by jphx.

see: more, trace.

James Hennessy added 6 commits February 13, 2020 14:04
Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
…king rules

Signed-off-by: James Hennessy <jph@us.ibm.com>
@zuercher zuercher self-assigned this Feb 14, 2020
@cmluciano
Copy link
Copy Markdown
Member

cmluciano commented Feb 14, 2020

xref: #2092

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 14, 2020

Actually, it doesn't fix #2092. That issue is asking for a rewrite while redirecting. This change is for rewriting when forwarding.

@zuercher zuercher assigned mattklein123 and unassigned zuercher Feb 14, 2020
@zuercher
Copy link
Copy Markdown
Member

Assigning @mattklein123 -- I think we should review the API change first and then the code to match.

From my point of view, I think we'll want to add [(udpa.annotations.field_migrate).oneof_promotion = "<oneof name>"] to have prefix_rewrite and regex_rewrite become members of a oneof group (in v4, I guess).

My other thought was that it might make sense to consider whether this rewrite should be separate from RouteMatch or not. On the one hand this allows a simple match (say /foo) which is then rewritten using a regex. The flip side is if you want to regex match and regex rewrite you may have to define the same regex twice in the config (and execute it twice while routing).

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 14, 2020

I thought about whether the regular expression should be the regex_match regular expression. If that's done, it might be possible to match just once instead of twice as in this implementation, though I doubt it if one of the re2 "replace" methods will be used to perform the substitution.

There's also more flexibility in allowing a completely different regular expression, and processing it even if there's a prefix match or a full path match, instead of a regular expression match. Also, the regex_match regular expression doesn't have to match the entire path. It can match and replace just pieces of the path, and match/replace multiple pieces. So in the end, I opted for the flexibility. But as you say, if the user only wants to use it in conjunction with regex_match, they'll be forced to repeat the regular expression, and to anchor it in regex_rewrite if they want it to be applied in the same way.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a high level this LGTM and will defer to @zuercher for review. Thank you for working on this!

XSS
YAML
ZXID
aaa
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zuercher we really need to fix this. If you are up for a tech debt item can you potentially add some type of escape hatch to the spell checker that turns it off and on like we have for other format things? I think there is an issue tracking this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be feasible to skip the spell check for text inside double back ticks?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'm going to work on that next.

James Hennessy added 3 commits February 18, 2020 14:56
This updates the descriptions of prefix_rewrite and regex_rewrite to
say that only one should be specified, and that prefix_rewrite takes
precedence if both are.

It also moved the bullet in version_history.rst to the "router"
section, so they're in alphabetical order.

Signed-off-by: James Hennessy <jph@us.ibm.com>
Also add an annotation to the v3 proto files indicating that prefix_rewrite
and regex_rewrite will be migrated in v4 to a "oneof" choice called
path_rewrite.

Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 19, 2020

Fyi, I merged the master branch into the feature branch to resolve the merge conflict.

Copy link
Copy Markdown
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with us! Some comments below, but mostly looks good.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I guess based on host_rewrite_specifier these should be path_rewrite_specifier. I thought the name in the docs was descriptive.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I notice there's already a path_rewrite_specifier in the RedirectAction section.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine for that to be repeated in different types.

It does strike me that someone will eventually want to use RegexRewrite in RedirectAction.

@mattklein123: should we move RegexWrite so that it's not nested within RouteAction? And if so, the name might need to be a little more specific to paths, like RegexPathRewrite.

(@jphx sorry again for this piecemeal approach to the API review -- the maintainer with the best vision for this is currently unavailable.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a problem. I'm willing to make whatever changes are best for Envoy (within reason!).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to have stalled on this review. Is there anything I can do to help? Are we waiting for the maintainer with the best vision to be available again?

If desired, I can move the RegexRewrite message out of the RouteAction message so that it can be more easily re-used in the RedirectAction message. Or it could be moved out of the file altogether.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we are stalled. @lizan had a comment below. Can you take a look at that and we can move forward?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

His comment was asking if the RegexRewrite message should be moved out of the file altogether and made a generic type. It was my impression he was looking for the opinion of other maintainers, since it is a strategy question. Surely you don't want me to decide the answer to that question.

If you all agree this is what should be done, then I would be happy to make that change.

James Hennessy added 2 commits February 19, 2020 15:45
Signed-off-by: James Hennessy <jph@us.ibm.com>
Signed-off-by: James Hennessy <jph@us.ibm.com>
@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 19, 2020

I'm not sure why the envoy-linux (format) check is failing. It seems to want me to remove the oneof_promotion annotation that we just added. A locally-run

ci/run_envoy_docker.sh ci/do_ci.sh fix_format

command does the same thing to the files in my working copy. Any suggestions?

@zuercher
Copy link
Copy Markdown
Member

Hm. I guess that format error means the upda annotation needs to go in the v2 version as well. That doesn't seem logical to me, but I think it might be a corner case of v2/v3 coexisting. Soon v2 will be frozen and we won't allow changes there anymore and the auto-translation will get turned off.

@zuercher
Copy link
Copy Markdown
Member

Also, please avoid force-pushes in the future. They make tracking incremental PR changes harder and we squash PRs onto master so the dirty commit history in the PR won't be reflected on master.

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 19, 2020

Sorry about the forced pushes. I've used them only to amend commit messages to add the signoff. I seem to have trouble remembering the use the --signoff option on my git commit and git merge commands!

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 19, 2020

I looked into:

I guess that format error means the upda annotation needs to go in the v2 version as well.

but adding the oneof_promotion annotation to the v2 files doesn't seem to help. When I do that, the fix_format operation changes the v3 files to actually put the fields under a oneof.

Earlier, @mattklein123 said:

Actually I think we can't do it right now as we need to do it on v3 and have it show up for v4, which the tooling does not support yet.

So maybe this annotation needs to be added sometime later?

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 20, 2020

I merged again from master to address the coverage check failure, and removed the oneof_promotion annotations for now. I'll open an issue to track adding them back tomorrow.

google.protobuf.BoolValue enabled = 2;
}

// Describe how to match the path and rewrite it using a regular expression
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we extract this type into envoy.type? I think we're likely to reuse it in other places?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through this again, yes, I think it makes sense to move this to a matcher. How about RegexMatchAndSubstitute or something like that?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through this again, yes, I think it makes sense to move this to a matcher. How about RegexMatchAndSubstitute or something like that?

Copy link
Copy Markdown
Contributor Author

@jphx jphx Feb 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I merged from the master branch again to resolve some conflicts and made the requested change, renaming the RegexRewrite message to RegexMatchAndSubstitute and making it a general type so it can potentially be re-used elsewhere.

@mattklein123 mattklein123 self-assigned this Feb 25, 2020
James Hennessy added 2 commits February 25, 2020 14:33
Signed-off-by: James Hennessy <jph@us.ibm.com>
…Substitute.

Signed-off-by: James Hennessy <jph@us.ibm.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM other than a test request. @zuercher @lizan any further comments?

/wait


if (route.route().has_regex_rewrite()) {
if (!prefix_rewrite_.empty()) {
throw EnvoyException("Cannot specify both prefix_rewrite and regex_rewrite");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please a test case for this. You can use EXPECT_THROW_WITH_MESSAGE for testing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this test case in the latest commit.

…sage.

Also add a test that verifies an exception is raised if both prefix_rewrite
and regex_rewrite are specified in a route.

Signed-off-by: James Hennessy <jph@us.ibm.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 26, 2020

/retest

The coverage test failed, but it looks like the process was killed by a signal. Trying again.

@repokitteh-read-only
Copy link
Copy Markdown

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #10050 (comment) was created by @jphx.

see: more, trace.

@lizan lizan merged commit 10d40f5 into envoyproxy:master Feb 26, 2020
@jphx
Copy link
Copy Markdown
Contributor Author

jphx commented Feb 26, 2020

Thank you everybody!

@jphx jphx deleted the regex-rewrite branch February 26, 2020 23:18
htuch pushed a commit that referenced this pull request Mar 13, 2020
I'm seeing a failure in OSS-Fuzz build, as well as fuzz CI tool #10289 due to fuzzing build macro FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION.

Basically, PR #10050 introduced regex rewrites that are tested in config_impl_test, which is executed to create the corpus for route_fuzz_test. Fuzz builds are built with the FUZZING_BUILD_MODE... macro, and the RE2 library that does the regex rewrites in the test only does single replacements in fuzzing mode rather than the global replacement in the tests, so the test fails, and the fuzz build fails.

This change defined FUZZING_BUILD_MODE... only for fuzzing build rules, and explicitly undefines them for the config_impl_test.

RE OSS-Fuzz builds, this is odd. Bazel CLI flags take precedence, so I'm not sure what to do when we build in that environment. I can't seem to #undef in the tests, but I need to come up with some solution.I don't like manually removing the CLI flag in our build script, but maybe? https://github.com/google/oss-fuzz/blob/66e0e379395d3a3ee741b08f4d306e1ed71c4003/infra/base-images/base-clang/Dockerfile

Testing: bazel build //test/common/router:route_fuzz_test_with_libfuzzer --config=asan-fuzzer

Signed-off-by: Asra Ali <asraa@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants