Skip to content

8343110: Add getChars(int, int, char[], int) to CharSequence and CharBuffer#21730

Closed
mkarg wants to merge 17 commits intoopenjdk:masterfrom
mkarg:getchars
Closed

8343110: Add getChars(int, int, char[], int) to CharSequence and CharBuffer#21730
mkarg wants to merge 17 commits intoopenjdk:masterfrom
mkarg:getchars

Conversation

@mkarg
Copy link
Contributor

@mkarg mkarg commented Oct 26, 2024

This Pull Request proposes an implementation for JDK-8343110: Adding the new method public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) to the CharSequence interface, providing a bulk-read facility including a default implementation iterating over charAt(int).

In addition, this Pull Request proposes to replace the implementation of Reader.of(CharSequence).read(char[] cbuf, int off, int len) to invoke CharSequence.getChars(next, next + n, cbuf, off) instead of utilizing pattern matching for switch. Also, this PR proposes to implement CharBuffer.getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) as an alias for CharBuffer.get(srcBegin, dst, dstBegin, srcEnd - srcBegin).

To ensure quality...

  • ...the method signature and JavaDocs are adapted from AbstractStringBuilder.getChars(...).
  • ...this PR relies upon the existing tests for Reader.of(CharSequence), as these provide sufficient coverage of all changes introduced by this PR.

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8343111 to be approved
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issues

  • JDK-8343110: Add getChars(int, int, char[], int) to CharSequence and CharBuffer (Enhancement - P4)
  • JDK-8343111: Add getChars(int, int, char[], int) to CharSequence and CharBuffer (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21730/head:pull/21730
$ git checkout pull/21730

Update a local copy of the PR:
$ git checkout pull/21730
$ git pull https://git.openjdk.org/jdk.git pull/21730/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21730

View PR using the GUI difftool:
$ git pr show -t 21730

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21730.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 26, 2024

👋 Welcome back mkarg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 26, 2024

@mkarg This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8343110: Add getChars(int, int, char[], int) to CharSequence and CharBuffer

Reviewed-by: liach, jpai, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 35 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@liach, @jaikiran, @RogerRiggs) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Oct 26, 2024

@mkarg The following labels will be automatically applied to this pull request:

  • core-libs
  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added nio nio-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Oct 26, 2024
@mkarg mkarg mentioned this pull request Oct 26, 2024
4 tasks
@mkarg
Copy link
Contributor Author

mkarg commented Oct 26, 2024

/csr

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Oct 26, 2024
@openjdk
Copy link

openjdk bot commented Oct 26, 2024

@mkarg has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@mkarg please create a CSR request for issue JDK-8343110 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@mkarg
Copy link
Contributor Author

mkarg commented Oct 26, 2024

FYI @liach @bokken @eirbjo @robtimus @parttimenerd

@liach
Copy link
Member

liach commented Oct 26, 2024

Sorry for belated mail response, but I think we should design the API to not take source start/end. I think JIT can escape analysis the new String in practice.

@mkarg
Copy link
Contributor Author

mkarg commented Oct 26, 2024

Sorry for belated mail response, but I think we should design the API to not take source start/end. I think JIT can escape analysis the new String in practice.

Chen, thank you for chiming in! There is nothing to be sorry, I was just posting drafts so far! 🙂

Regarding your actual question, I do understand your idea and while originally I had the same in mind (it really is appealing!), I came up with a draft using the original String.getChars() signature instead, due to the following drawbacks:

  • There might exist (possibly lotsof) CharSequence.getChars(int, int, char[], int) implementations already, as this problem (and the idea how to solve it) is anything but new. At least such implementations are String, StringBuilder and StringBuffer. If we come up with a different signature, then none of these already existing performance boosters will get used by Reader.of(CharSequence) automatically - at least until they come up with alias methods. Effectively this leads to (possibly lots) of alias methods. At least it leads to alias methods in String, StringBuilder, StringBuffer and CharBuffer. In contrast, when keeping the signature copied from String.getChars, chances are good that (possibly lots of) implementations will instantly be supported by Reader.of(CharSequence) without alias methods. At least, String, StringBuilder and StringBuffer will be.
  • Since decades people are now very used to StringBuilder.getChars(int, int, char[], int), so (possibly a lot of) people might simply expect us to come up with that lengthy signature. These people might be rather confused (if not to say frustrated) when we now force them to write an intermediate subSequence(int, int) for something that was "such simple" before.
  • Custom implementations of CharSequence.subSequence could come up with the (performance-wise "bad") idea of creating copies instead of views. At least it seems like AbstractStringBuilder is doing that, so chances are "good" that custom libs will do that, too. For example, because they need it for safety. Or possibly, because they have a technical reason that enforces a copy. That would (possibly massively, depending on the actual class) spoil the idea of performance-boosting this PR is actually all about.

@liach
Copy link
Member

liach commented Oct 26, 2024

Hi Markus, I recommend to continue the discussion on the mailing list instead of on GitHub. Note that GitHub pull requests are only forwarded to the mailing list when it's ready for review, and this proposal is way too early. (And the CSR is too early, too: we should agree on an API surface, such as exception contracts, first)

Sorry that no one has responded to you yet, but many engineers are busy with other areas, such as pushing the JEPs (as you see, we have a huge number of candidate/submitted JEPs right now, and JDK 24 RDP1 is just 6 weeks away). They are monitoring your core-libs thread, stay assured!

@mkarg
Copy link
Contributor Author

mkarg commented Oct 26, 2024

@liach Agreed with all you say, but actually: I did not expect any response - in particular on a weekend!

I do not know why apparently people started to review the draft documents. I just filed it for no other purpose than taking note of its existence. 🤔 I thought it is pretty clear to everybody that draft means "do not review". At least this is how I used draft documents in the past 30 years. 🙂

So it seems there was a misunderstanding here regarding my drafts. My intention ever was and still is to first finish the discussion about alternatives (A)...(E) on the core-lib mailing list first. My PR and CSR drafts are intentionally in draft state, which means, they are not open for review yet (there is no rfr label) and solely serve as my public storage for ideas-in-progress in lieu of out-of-band media. The detail discussion you started regarding details of the PR was a bit too early IMHO (and surprised me, and actually your duplicate question in the PR confused me), as it is not even decided that we actually go with proposal A, so why discussing details of exactly that already (my conclusion was, you like to prepare it for the case it becomes the finalist of the mailing list discussion)? Maybe I misunderstood your intent, and you actually wanted to propose alternative (F), then sorry for me not seeing that in the first place.

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 21, 2024

@mkarg This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@mkarg
Copy link
Contributor Author

mkarg commented Dec 22, 2024

Kindly asking anybody's contribution to the ongoing discussion on the core-devs mailing list. Thanks you.

@openjdk
Copy link

openjdk bot commented Feb 9, 2025

@mkarg this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout getchars
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Feb 9, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Feb 9, 2025
@mkarg mkarg marked this pull request as ready for review February 9, 2025 18:37
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 9, 2025
@mlbridge
Copy link

mlbridge bot commented Feb 9, 2025

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 9, 2025

@mkarg This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@mkarg
Copy link
Contributor Author

mkarg commented Mar 15, 2025

Regarding your actual question, I do understand your idea and while originally I had the same in mind (it really is appealing!), I came up with a draft using the original String.getChars() signature instead, due to the following drawbacks:

  • There might exist (possibly lotsof) CharSequence.getChars(int, int, char[], int) implementations already, as this problem (and the idea how to solve it) is anything but new. At least such implementations are String, StringBuilder and StringBuffer. If we come up with a different signature, then none of these already existing performance boosters will get used by Reader.of(CharSequence) automatically - at least until they come up with alias methods. Effectively this leads to (possibly lots) of alias methods. At least it leads to alias methods in String, StringBuilder, StringBuffer and CharBuffer. In contrast, when keeping the signature copied from String.getChars, chances are good that (possibly lots of) implementations will instantly be supported by Reader.of(CharSequence) without alias methods. At least, String, StringBuilder and StringBuffer will be.

  • Since decades people are now very used to StringBuilder.getChars(int, int, char[], int), so (possibly a lot of) people might simply expect us to come up with that lengthy signature. These people might be rather confused (if not to say frustrated) when we now force them to write an intermediate subSequence(int, int) for something that was "such simple" before.

  • Custom implementations of CharSequence.subSequence could come up with the (performance-wise "bad") idea of creating copies instead of views. At least it seems like AbstractStringBuilder is doing that, so chances are "good" that custom libs will do that, too. For example, because they need it for safety. Or possibly, because they have a technical reason that enforces a copy. That would (possibly massively, depending on the actual class) spoil the idea of performance-boosting this PR is actually all about.

Kindly asking for more comments! Something I am missing? Some good reason to not adopt my proposal?

@openjdk openjdk bot removed the csr Pull request needs approved CSR before integration label May 5, 2025
@jaikiran
Copy link
Member

jaikiran commented May 6, 2025

Hello Markus, it's been a while since this PR was merged with latest master branch in mainline. Could you update the PR to do that?

@mkarg
Copy link
Contributor Author

mkarg commented May 6, 2025

Hello Markus, it's been a while since this PR was merged with latest master branch in mainline. Could you update the PR to do that?

As rebasing is not wanted in OpenJDK, do you mean merging master branch into this branch?

@jaikiran
Copy link
Member

jaikiran commented May 6, 2025

do you mean merging master branch into this branch?

Yes, that's correct. That should then run the GitHub actions job against this PR against a more relevant state of this PR.

Since things have settled and the CSR approved, I'll also run this PR against our CI to verify nothing fails unexpectedly. I will anyway be doing the merge against master locally before submitting for tier testing, but it's always a good practice to keep the PR updated with latest master changes, when the master branch has seen too many commits since the PR was opened.

@mkarg
Copy link
Contributor Author

mkarg commented May 6, 2025

do you mean merging master branch into this branch?

Yes, that's correct. That should then run the GitHub actions job against this PR against a more relevant state of this PR.

Since things have settled and the CSR approved, I'll also run this PR against our CI to verify nothing fails unexpectedly. I will anyway be doing the merge against master locally before submitting for tier testing, but it's always a good practice to keep the PR updated with latest master changes, when the master branch has seen too many commits since the PR was opened.

Yes that is pretty clear. I'm using git since decades but always do rebase in all other project, so my question was only about how to perform the update. 🙂

@jaikiran
Copy link
Member

jaikiran commented May 6, 2025

so my question was only about how to perform the update.

Locally, if your remote repo that points to github.com/openjdk/jdk is named upstream and the local branch from which you have created this PR is named getchars, then you would do something like the following:

# fetch latest changes from master branch of github.com/openjdk/jdk
git fetch upstream master
# update your local workspace to your local getchars branch (if you aren't already on it)
git switch getchars
# merge the changes from upstream master branch into this local getchars branch
git merge -m "merge latest from master branch" upstream/master

If there are no merge resolution conflicts, then those command should be enough and your local getchars branch should now have the upstream master changes.

Once you build and test locally you can then push those changes from getchars branch to your forked remote repo's getchars branch (and thus automatically update this PR) using:

# while being on getchars branch locally and your forked remote is named mkarg-remote
git push mkarg-remote getchars

@mkarg
Copy link
Contributor Author

mkarg commented May 6, 2025

I know all that. As I said, I am using git since decades, so I know lots of alternative ways to update branches and even proposed the one you want, so I just needed to know which of those alternatives (not the exact commands) you want. Don't know why posted all that, as the question was already solved by you?! 🤔

@mkarg
Copy link
Contributor Author

mkarg commented May 6, 2025

Hello Markus, it's been a while since this PR was merged with latest master branch in mainline. Could you update the PR to do that?

Updated to current master. Builds fine locally. Pushed to Github. Github Actions currently are running.

@mkarg
Copy link
Contributor Author

mkarg commented May 7, 2025

Github Actions have successfully passed all checks. ✔️ 😄

Copy link
Member

@jaikiran jaikiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tier1 through tier3 testing of this PR passed without issues. The CSR has been approved and the changes in this PR look good to me.

Thank you for being patient and contributing this enhancement.

@mkarg
Copy link
Contributor Author

mkarg commented May 7, 2025

tier1 through tier3 testing of this PR passed without issues. The CSR has been approved and the changes in this PR look good to me.

Thank you, Jaikiran! Will this result convince everybody to actively mark this PR as officially reviewed, so I can finally integrate it? 😃

@mkarg
Copy link
Contributor Author

mkarg commented May 7, 2025

Thank you, Jaikiran! Will this result convince everybody to actively mark this PR as officially reviewed, so I can finally integrate it? 😃

@liach For example, you could approve your outdated approval so this PR finally has to current reviews. Thanks.

Chen Liang (@liach - Reviewer) 🔄 Re-review required (review applies to 884e7dc1)

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the persistence.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 7, 2025
@mkarg
Copy link
Contributor Author

mkarg commented May 7, 2025

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 7, 2025
@openjdk
Copy link

openjdk bot commented May 7, 2025

@mkarg
Your change (at version 31537b7) is now ready to be sponsored by a Committer.

Copy link
Member

@liach liach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest iteration looks the same as the one I last reviewed.

@jaikiran
Copy link
Member

jaikiran commented May 8, 2025

/sponsor

@openjdk
Copy link

openjdk bot commented May 8, 2025

Going to push as commit 7642556.
Since your change was applied there have been 42 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 8, 2025
@openjdk openjdk bot closed this May 8, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels May 8, 2025
@openjdk
Copy link

openjdk bot commented May 8, 2025

@jaikiran @mkarg Pushed as commit 7642556.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@mkarg mkarg deleted the getchars branch May 8, 2025 06:50
* characters are copied into the subarray of {@code dst} starting
* at index {@code dstBegin} and ending at index:
* <pre>{@code
* dstbegin + (srcEnd-srcBegin) - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be dstBegin?

* dstBegin + (srcEnd-srcBegin) - 1

Copy link
Member

@jaikiran jaikiran May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Andrey, what you note is right. This and the other change you have proposed to this text seems reasonable. Do you want to create a JBS issue and raise a PR proposing this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also simply include it in the PR for https://bugs.openjdk.org/browse/JDK-8356679 to reduce organizational overhead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Markus, it's OK to do this text change as part of JDK-8356679. Like Roger noted in the corresponding core-libs-dev mailing list, it would be good to do the 8356679 changes in smaller pieces and include this change in one of those.

* <li>{@code srcEnd} is greater than
* {@code this.length()}.
* <li>{@code dstBegin+srcEnd-srcBegin} is greater than
* {@code dst.length}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have . at the end of each case. Now it's inconsistent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated nio nio-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

9 participants