Skip to content

Optimize the implementation of rcutils_char_array_strncpy.#367

Merged
clalancette merged 1 commit intorollingfrom
clalancette/optimize-char-array-strncpy
Jul 15, 2022
Merged

Optimize the implementation of rcutils_char_array_strncpy.#367
clalancette merged 1 commit intorollingfrom
clalancette/optimize-char-array-strncpy

Conversation

@clalancette
Copy link
Copy Markdown
Contributor

It turns out that this function is used a lot in the rcutils
logging code, so it is on a hot-path. Thus it is worth optimizing.

The optimization itself consists of recognizing that the current
implementation of rcutils_char_array_strncpy is calling strlen
twice; once explicitly at the beginning of the function, and then
again implicitly while calling strncat. However, it turns out
that we don't need to recalculate the length either time; we
already know that information based on the buffer_length. Thus
we use the information we already have and speed up the calculation,
particularly for long strings.

Signed-off-by: Chris Lalancette clalancette@openrobotics.org

In my local testing, I found (using perf) that this patch significantly cuts down on the number of strlen invocations. For very long strings, this can be up to 40% faster (for shorter strings, the speed increase is more modest, around 10%).

I'll note that this also fixes a subtle bug in the interaction of rcutils_char_array_memcpy and rcutils_char_array_strncat. Previously, if you ended up calling rcutils_char_array_memcpy and did not copy a '\0' in, then a following call to rcutils_char_array_strncat would run off into arbitrary memory looking for a '\0' during the strlen call. While this sequence of calls would be a strange usage, it would cause UB before. Now it would work properly.

It turns out that this function is used a lot in the rcutils
logging code, so it is on a hot-path.  Thus it is worth optimizing.

The optimization itself consists of recognizing that the current
implementation of rcutils_char_array_strncpy is calling strlen
twice; once explicitly at the beginning of the function, and then
again implicitly while calling strncat.  However, it turns out
that we don't need to recalculate the length either time; we
already know that information based on the buffer_length.  Thus
we use the information we already have and speed up the calculation,
particularly for long strings.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette
Copy link
Copy Markdown
Contributor Author

Full CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Copy link
Copy Markdown
Member

@audrow audrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@clalancette
Copy link
Copy Markdown
Contributor Author

clalancette commented Jul 13, 2022

Just because this is touching such a low-level piece of code, here is CI on RHEL and Windows Debug as well:

  • RHEL Build Status
  • Windows Debug Build Status

@clalancette
Copy link
Copy Markdown
Contributor Author

The RHEL failure is a known flake. With everything else being green, I'm going to go ahead and merge this.

@clalancette clalancette merged commit 8564fa1 into rolling Jul 15, 2022
@delete-merged-branch delete-merged-branch bot deleted the clalancette/optimize-char-array-strncpy branch July 15, 2022 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants