Optimize the implementation of rcutils_char_array_strncpy.#367
Merged
clalancette merged 1 commit intorollingfrom Jul 15, 2022
Merged
Optimize the implementation of rcutils_char_array_strncpy.#367clalancette merged 1 commit intorollingfrom
clalancette merged 1 commit intorollingfrom
Conversation
It turns out that this function is used a lot in the rcutils logging code, so it is on a hot-path. Thus it is worth optimizing. The optimization itself consists of recognizing that the current implementation of rcutils_char_array_strncpy is calling strlen twice; once explicitly at the beginning of the function, and then again implicitly while calling strncat. However, it turns out that we don't need to recalculate the length either time; we already know that information based on the buffer_length. Thus we use the information we already have and speed up the calculation, particularly for long strings. Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Contributor
Author
Contributor
Author
Contributor
Author
|
The RHEL failure is a known flake. With everything else being green, I'm going to go ahead and merge this. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
It turns out that this function is used a lot in the rcutils
logging code, so it is on a hot-path. Thus it is worth optimizing.
The optimization itself consists of recognizing that the current
implementation of rcutils_char_array_strncpy is calling strlen
twice; once explicitly at the beginning of the function, and then
again implicitly while calling strncat. However, it turns out
that we don't need to recalculate the length either time; we
already know that information based on the buffer_length. Thus
we use the information we already have and speed up the calculation,
particularly for long strings.
Signed-off-by: Chris Lalancette clalancette@openrobotics.org
In my local testing, I found (using
perf) that this patch significantly cuts down on the number of strlen invocations. For very long strings, this can be up to 40% faster (for shorter strings, the speed increase is more modest, around 10%).I'll note that this also fixes a subtle bug in the interaction of
rcutils_char_array_memcpyandrcutils_char_array_strncat. Previously, if you ended up callingrcutils_char_array_memcpyand did not copy a '\0' in, then a following call torcutils_char_array_strncatwould run off into arbitrary memory looking for a '\0' during the strlen call. While this sequence of calls would be a strange usage, it would cause UB before. Now it would work properly.