Skip to content

Add replaceAll and replaceFirst#19070

Merged
nik9000 merged 1 commit intoelastic:masterfrom
nik9000:painless_regex_augment
Jun 28, 2016
Merged

Add replaceAll and replaceFirst#19070
nik9000 merged 1 commit intoelastic:masterfrom
nik9000:painless_regex_augment

Conversation

@nik9000
Copy link
Copy Markdown
Member

@nik9000 nik9000 commented Jun 24, 2016

These are useful methods in groovy that give you control over
the replacements used:

'the quick brown fox'.replaceAll(/[aeiou]/,
        m -> m.group().toUpperCase(Locale.ROOT))

They mimick two of the augments groovy has for regexes.

@uschindler
Copy link
Copy Markdown
Contributor

@rmuir
Copy link
Copy Markdown
Contributor

rmuir commented Jun 24, 2016

I don't know enough about java's regex to know if Uwe's suggestion is strictly equivalent. But I like the PR.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 24, 2016

Why not use the official loop for replacements in Matcher?

I figured those were less than desirable because they were StringBuffer instead of StringBuilder.

@uschindler
Copy link
Copy Markdown
Contributor

uschindler commented Jun 24, 2016

I figured those were less than desirable because they were StringBuffer instead of StringBuilder.

That's not an issue anymore. StringBuilder and StringBuffer are as fast in Java 6+. The introduction of StringBuilder was only needed in older Java versions. This is the reason why the APIs of Matcher were not changed afterwards (issues in OpenJDK closed).

So please don't reimplement the code and be safe and use the Matcher methods! If you look at String.replaceAll(String, String) (the regex one) and Matcher.replaceAll() source code you see the same loop: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/regex/Matcher.java#Matcher.replaceAll%28java.lang.String%29

In addition, the Java APIs allow to use \1 and \0 for groups, and can be "disabled" by quoteReplacement

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 26, 2016

That's not an issue anymore

Looks like it. Uncontended synchronization elimination works as expected. Another old habit gone.

In addition, the Java APIs allow to use \1 and \0 for groups, and can be "disabled" by quoteReplacement

I'll look into it.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 26, 2016

I'll look into it.

And done. It seems silly that they didn't make quoteReplacement a boolean parameter but I guess this way covers slightly more cases.

@uschindler
Copy link
Copy Markdown
Contributor

Thanks.

Shouldn't we maybe allow \1 or $1 to allow references to groups (means remove quoteReplacement)? To me this is a major use-case or search/replace.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 26, 2016

Not for the closure form. You can get them from the matcher. You can
certainly use replaceAll on the matcher if you want them.
On Jun 26, 2016 11:47 AM, "Uwe Schindler" notifications@github.com wrote:

Thanks.

Shouldn't we maybe allow \1 or $1 to allow references to groups (means
remove quoteReplacement)? To me this is a major use-case or
search/replace.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#19070 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AANLoq0sGr2e7u5zmZKBX3hQukhvdfI7ks5qPp8agaJpZM4I9_j3
.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 27, 2016

My reasoning is that \1 makes sense in Matcher.replaceAll because you are sending it a string - but this version you send it a Function so if you want a capturing group you can just call m.group on it and have the capturing group. \1 feels like it'd just get in the way and be surprising. I could see someone having the opposite opinion though. For what it is worth, the method this is imitating in groovy doesn't support \1 either though the javadocs don't go into why.

@clintongormley
Copy link
Copy Markdown
Contributor

@nik9000 i want to be able to do things like:

/\b(?:\d{12}(\d{4}))\b/.replaceAll("...$1")

I have no idea what this would look like with your syntax?

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 27, 2016

I have no idea what this would look like with your syntax?

That is already there with us whitelisting Matcher#replaceAll

/\b(?:\d{12}(\d{4}))\b/.matcher("string").replaceAll("stuff$1")

This proposed syntax if for when you need to manipulate the captures more than just moving them around.

@rmuir
Copy link
Copy Markdown
Contributor

rmuir commented Jun 27, 2016

For what it is worth, the method this is imitating in groovy doesn't support \1 either though the javadocs don't go into why.

I think it would be confusing to diverge from that behavior. For a lot of reasons consistency with whatever it is doing will be easiest on everyone if we can have it.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 27, 2016

@clintongormley I can add another example to the docs calling out Matcher#replaceAll's existence and $1 support and to describe this feature as for when you need "yet more control".

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 27, 2016

I can add another example to the docs calling out Matcher#replaceAll's existence and $1 support and to describe this feature as for when you need "yet more control".

I just added that to this PR.

@rmuir
Copy link
Copy Markdown
Contributor

rmuir commented Jun 28, 2016

What is the status here? this change looks good to me.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Jun 28, 2016

What is the status here? this change looks good to me.

I'm not sure.... I think at this point I'll merge it and we can rework it later if it needs it?

These are useful methods in groovy that give you control over
the replacements used:
```
'the quick brown fox'.replaceAll(/[aeiou]/,
		m -> m.group().toUpperCase(Locale.ROOT))
```
@nik9000 nik9000 force-pushed the painless_regex_augment branch from a0ed47d to 67bfecc Compare June 28, 2016 20:44
@nik9000 nik9000 merged commit 67bfecc into elastic:master Jun 28, 2016
@clintongormley clintongormley changed the title Painless: add replaceAll and replaceFirst Add replaceAll and replaceFirst Jun 29, 2016
@clintongormley clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement v5.0.0-alpha5

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants