add rational why ament_index pkgs don't have explicit performance tests by dirk-thomas · Pull Request #65 · ament/ament_index

dirk-thomas · 2020-08-06T21:10:40Z

This is a first draft of a rationale for why performance tests aren't necessary for ament_index_* packages.

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

chapulina · 2020-08-13T15:59:02Z

I believe with these justifications, this package can be considered level 1 😄 If that's indeed the case, do you mind updating the QD accordingly?

dirk-thomas · 2020-08-13T16:58:56Z

> do you mind updating the QD accordingly?

I would rather get this PR approved and merged first and then do a follow up PR to bump the level (which needs a separate round of review if all criteria are satisfied).

hidmic

While this high-level description is accurate, I don't see how it explains performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it.

Perhaps if we were to say that ament_index file structure will not change, in addition to IO access being the main bottleneck, we can argue that changing the query logic itself won't impact performance.

ament_index_cpp/QUALITY_DECLARATION.md

hidmic · 2020-08-18T18:33:39Z

ament_index_cpp/QUALITY_DECLARATION.md

 ### Performance [4.iv]

-`ament_index_cpp` does not conduct performance tests.
+An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.


@dirk-thomas meta: one could argue that we should have tests that back up that claim.

How would you create a test to actually do that? What kind of thresholds would you use?

I'd probably benchmark queries with increasing values of n and check whether a linear regression reasonably (within one or two standard deviations) explains the measurements. We would need a big index to see through OS noise though, I'll give you that.

And actually, now that I think about it, perhaps that's the only kind of performance test that would make sense here.

Measuring enough different N trying to figure out if the complexity is linear as promised would be possible. (Even though this sounds extremely costly to me.)
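For illustration only, the linearity check discussed in this thread could be sketched as follows. This is a minimal sketch, not something proposed in the PR: the function names `linear_fit` and `looks_linear`, the `samples` layout (n mapped to repeated timings), and the default `k=2.0` are all assumptions. It fits a least-squares line to the mean timing per n and accepts the result if every mean lies within `k` sample standard deviations of the line.

```python
import statistics


def linear_fit(ns, times):
    """Ordinary least-squares fit of times ~ slope * n + intercept."""
    mean_n = statistics.mean(ns)
    mean_t = statistics.mean(times)
    sxx = sum((n - mean_n) ** 2 for n in ns)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(ns, times))
    slope = sxy / sxx
    intercept = mean_t - slope * mean_n
    return slope, intercept


def looks_linear(samples, k=2.0):
    """Check whether a straight line explains the measurements.

    `samples` maps n (number of prefix paths, hypothetical here) to a list
    of at least two repeated timings. The line is fitted to the per-n means;
    each mean must lie within k standard deviations of the fitted value.
    """
    ns = sorted(samples)
    means = [statistics.mean(samples[n]) for n in ns]
    stdevs = [statistics.stdev(samples[n]) for n in ns]
    slope, intercept = linear_fit(ns, means)
    return all(
        abs(mean - (slope * n + intercept)) <= k * stdev
        for n, mean, stdev in zip(ns, means, stdevs)
    )
```

The timings themselves would have to come from real queries against a sufficiently large index (for example measured with `time.perf_counter`); as noted above, collecting them is the costly part, and the threshold choice remains a judgment call.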

hidmic · 2020-08-18T18:44:07Z

ament_index_cpp/QUALITY_DECLARATION.md

+An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.
+The time complexity to query information is either scaling lineraly with the number of resource types or with the number of resources per type (depending on which dimension is requested).
+If the content of a specific resource is retrieved the time complexity is linear to the size of the content as is the memory usage in that case since the content is returned to the caller.
+The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead.
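As a rough illustration of the complexity claims in this hunk, here is a minimal Python sketch of a resource index lookup. It is not the `ament_index_cpp` implementation: the function names `search_paths`, `find_resource`, and `list_resources` are hypothetical, and the `AMENT_PREFIX_PATH` variable name and the `share/ament_index/resource_index/<type>/<name>` layout are assumptions taken from the resource index design document rather than from this PR.

```python
import os
from pathlib import Path

# Assumed layout: <prefix>/share/ament_index/resource_index/<type>/<name>
INDEX_SUBDIR = Path("share") / "ament_index" / "resource_index"


def search_paths(env=None):
    """Return the prefix paths listed in AMENT_PREFIX_PATH (assumed name)."""
    env = os.environ if env is None else env
    value = env.get("AMENT_PREFIX_PATH", "")
    return [path for path in value.split(os.pathsep) if path]


def find_resource(resource_type, resource_name, prefixes):
    """Return (content, prefix) of the first match, or None.

    One filesystem check per prefix, so O(n) in the number of prefixes;
    reading the content is linear in its size.
    """
    for prefix in prefixes:
        marker = Path(prefix) / INDEX_SUBDIR / resource_type / resource_name
        if marker.is_file():
            return marker.read_text(), prefix
    return None


def list_resources(resource_type, prefixes):
    """Map each resource name of a type to the first prefix providing it.

    Linear in the number of prefixes plus the number of resources per type.
    """
    found = {}
    for prefix in prefixes:
        type_dir = Path(prefix) / INDEX_SUBDIR / resource_type
        if type_dir.is_dir():
            for entry in sorted(type_dir.iterdir()):
                if entry.is_file():
                    found.setdefault(entry.name, prefix)
    return found
```

Every step in the sketch is a directory listing, an existence check, or a file read, which is the sense in which the logic on top of the filesystem calls adds little overhead of its own.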


@dirk-thomas meta: while I agree the logic overhead is likely negligible compared to that of IO, the performance of ament_index queries has more to do with the way the index is structured than with the code itself.

That is why I tried to describe the complexity of the different kinds of queries available.

Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient (which still satisfies the requirements) that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md).

> Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient

Sure, my point being that to say "The performance of the implementation is defined by the performance of the underlying filesystem functions" is not entirely correct.

Can you elaborate why?

Re-reading this thread, I get the impression that we're using the term performance for different things (algorithmic complexity vs. runtime cost). Since you do mention time-complexity, perhaps rephrasing as:

Suggested change

-The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead.
+The runtime cost of the implementation is dominated by the runtime cost of the underlying filesystem API, and the implemented logic doesn't add any significant overhead.

would constrain interpretation. WDYT?

Sounds good to me - applied in both locations in d0dbd70.


dirk-thomas

> While this high-level description is accurate, I don't see how it explains performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it.

I can't speak for the intention of the REP since I wasn't closely involved in its creation. This is basically something the team has to decide: is it worth / necessary / useful to cover this API with performance tests (and if the answer is yes: what exactly should the performance test actually check for - at least to me that is not obvious).

dirk-thomas · 2020-08-19T04:15:38Z

ament_index_cpp/QUALITY_DECLARATION.md

 ### Performance [4.iv]

-`ament_index_cpp` does not conduct performance tests.
+An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.


How would you create a test to actually do that? What kind of thresholds would you use?

dirk-thomas · 2020-08-19T04:15:40Z

ament_index_cpp/QUALITY_DECLARATION.md

+An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths.
+The time complexity to query information is either scaling lineraly with the number of resource types or with the number of resources per type (depending on which dimension is requested).
+If the content of a specific resource is retrieved the time complexity is linear to the size of the content as is the memory usage in that case since the content is returned to the caller.
+The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead.


That is why I tried to describe the complexity of the different kinds of queries available.

Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient (which still satisfies the requirements) that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md).

hidmic · 2020-08-19T15:40:30Z

> This is basically something the team has to decide: is it worth / necessary / useful to cover this API with performance tests.

Agreed. I cannot say whether it's worth it or not (though it probably isn't compared to the rest of the system), but I honestly can't come up with any other way to justify dropping performance regression testing than to say we won't be changing the main algorithm any time soon.


hidmic · 2020-08-19T19:18:05Z

Another thought about whether to test this for performance or not. The problem is how REP-2004 frames performance testing. Perhaps we shouldn't be trying to justify testing or not for performance regressions in every package, but putting together a set of concrete use cases and benchmarking critical paths.

wjwwood · 2020-08-19T20:23:53Z

> The problem is how REP-2004 frames performance testing.

What's the problem? It just says you need a policy. Your policy can be that you will not do performance testing or block releases based on their results. You just need to justify it.

Do you have a specific part of the REP in mind that is problematic?

hidmic · 2020-08-19T22:10:04Z

Quote:

> However, if performance is a reasonable concern for use in a production system, there must be performance tests and they should be used in conjunction with a regression policy which aims to prevent unintended performance degradation.

On what basis can we justify that this package should not test for performance degradation? This patch describes ament_index queries' algorithmic complexity and runtime cost, plus a note on how these APIs are likely to be used, but I don't see how "no need for performance tests" logically follows.

FWIW I'm playing devil's advocate here on purpose, trying to make the same inquiries that a third party interested in this Quality Declaration would make, especially because IIUC this patch will be adapted to many other packages. Do not block on my concerns if you think they are nothing to worry about or that the patch is sufficient as it stands.

wjwwood · 2020-08-20T15:09:35Z

I think the core argument is that the answer to "if performance is a reasonable concern for use in a production system" is "no" for this package due to the way it is implemented and the places it is used.

hidmic · 2020-08-20T15:14:03Z

Ok, fair enough. Let's hinge on this not being likely to be part of a critical path at runtime.

wjwwood

I think this justification is ok, the two main takeaways I get are this:

ament_index_cpp does not need performance tests because:
- the algorithms implemented here are not at high risk for performance degradation
- (more importantly) ament_index_cpp is not used in performance critical parts of the ecosystem right now

If in the future new algorithms are introduced, the first point could be reviewed, and if in the future we start using this in performance critical parts of the stack then we can reconsider it.

dirk-thomas · 2020-08-20T16:18:37Z

> (more importantly) ament_index_cpp is not used in performance critical parts of the ecosystem right now

I hope the current statement:

> From a usage point of view it is also expected that the resource index is commonly only queried during startup and not at runtime of a production system.

covers the important reason well enough. Thanks for the feedback and discussion!

add rational why ament_index pkgs don't have explicit performance tests

da538d3

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

dirk-thomas added the documentation label Aug 6, 2020

dirk-thomas self-assigned this Aug 6, 2020

dirk-thomas requested a review from hidmic August 18, 2020 18:21

hidmic reviewed Aug 18, 2020

View reviewed changes

dirk-thomas removed their assignment Aug 18, 2020

fix spelling

567dc52

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

dirk-thomas commented Aug 19, 2020

View reviewed changes

dirk-thomas added the more-information-needed Further information is required label Aug 19, 2020

feedback

d0dbd70

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

hidmic approved these changes Aug 20, 2020

View reviewed changes

wjwwood approved these changes Aug 20, 2020

View reviewed changes

dirk-thomas merged commit fce2f68 into master Aug 20, 2020

dirk-thomas deleted the dirk-thomas/performance-test-rational branch August 20, 2020 16:19

dirk-thomas removed the more-information-needed Further information is required label Aug 20, 2020

