add rational why ament_index pkgs don't have explicit performance tests#65
Conversation
Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
|
I believe with these justifications, this package can be considered level 1 😄 If that's indeed the case, do you mind updating the QD accordingly? |
I would rather get this PR approved and merged first and then do a follow up PR to bump the level (which needs a separate round of review if all criteria are satisfied). |
There was a problem hiding this comment.
While this high-level description is accurate, I don't see how it explains performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it.
Perhaps if we were to say that ament_index file structure will not change, in addition to IO access being the main bottleneck, we can argue that changing the query logic itself won't impact performance.
| ### Performance [4.iv] | ||
|
|
||
| `ament_index_cpp` does not conduct performance tests. | ||
| An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths. |
There was a problem hiding this comment.
@dirk-thomas meta: one could argue that we should have tests that backup that claim.
There was a problem hiding this comment.
How would you create a test to actually do that? What kind of thresholds would you use?
There was a problem hiding this comment.
I'd probably benchmark queries with increasing values of n and check if the result of a linear regression reasonably (within one or two standard deviations) explains the measurements. We would need a big index to see through OS noise though, I give you that.
And actually, now that I think about it, perhaps that's the only kind of performance test that would make sense here.
There was a problem hiding this comment.
Measuring enough different N trying to figure out if the complexity is linear as promised would be possible. (Even though this sounds extremely costly to me.)
| An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths. | ||
| The time complexity to query information is either scaling lineraly with the number of resource types or with the number of resources per type (depending on which dimension is requested). | ||
| If the content of a specific resource is retrieved the time complexity is linear to the size of the content as is the memory usage in that case since the content is returned to the caller. | ||
| The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead. |
There was a problem hiding this comment.
@dirk-thomas meta: while I agree the logic overhead is likely negligible compared to that of IO, the performance of ament_index queries has more to do with the way the index is structured than with the code itself.
There was a problem hiding this comment.
That is why I tried to describe the complexity of the different kind of queries available.
Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient (which still satisfies the requirements) that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md).
There was a problem hiding this comment.
Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient
Sure, my point being that to say The performance of the implementation is defined by the performance of the underlying filesystem functions is not entirely correct.
There was a problem hiding this comment.
Can you elaborate why?
There was a problem hiding this comment.
Re-reading this thread, I get the impression that we're using the term performance for different things (algorithmic complexity vs. runtime cost). Since you do mention time-complexity, perhaps rephrasing as:
| The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead. | |
| The runtime cost of the implementation is dominated by the runtime cost of the underlying filesystem API, and the implemented logic doesn't add any significant overhead. |
would constrain interpretation. WDYT?
There was a problem hiding this comment.
Sounds good to me - applied in both locations in d0dbd70.
Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>
dirk-thomas
left a comment
There was a problem hiding this comment.
While this high-level description is accurate, I don't see how it explains performance testing is not necessary. REP-2004 does not specify when testing for performance is a requirement, but it does depict performance testing as a form of regression testing, which IMHO leaves little room for arguing against it.
I can't speak for the intention of the REP since I wasn't closely involved in its creation. This is basically something the team has to decide: is it worth / necessary / useful to cover this API with performance tests (and if the answer is yes: what exactly should the performance test actually check for - at least to me that is not obvious).
| ### Performance [4.iv] | ||
|
|
||
| `ament_index_cpp` does not conduct performance tests. | ||
| An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths. |
There was a problem hiding this comment.
How would you create a test to actually do that? What kind of thresholds would you use?
| An environment variable defines the prefix paths of such resource indices and the API has a time complexity of `O(n)` where `n` is the number of prefix paths. | ||
| The time complexity to query information is either scaling lineraly with the number of resource types or with the number of resources per type (depending on which dimension is requested). | ||
| If the content of a specific resource is retrieved the time complexity is linear to the size of the content as is the memory usage in that case since the content is returned to the caller. | ||
| The performance of the implementation is defined by the performance of the underlying filesystem functions and the implemented logic doesn't add any significant overhead. |
There was a problem hiding this comment.
That is why I tried to describe the complexity of the different kind of queries available.
Arguably if there would be a more efficient way the resource index could be structured to make queries more efficient (which still satisfies the requirements) that might be something better discussed on the design document (https://github.com/ament/ament_cmake/blob/master/ament_cmake_core/doc/resource_index.md).
Agreed. I cannot say if it's worth or not (though probably isn't compared to the rest of the system), but I honestly can't come up with any other way to justify dropping performance regression testing than to say we won't be changing the main algorithm any time soon. |
|
Another thought about whether to test this for performance or not. The problem is how REP-2004 frames performance testing. Perhaps we shouldn't be trying to justify testing or not for performance regressions in every package, but putting together a set of concrete use cases and benchmarking critical paths. |
What's the problem? It just says you need a policy. Your policy can be that you will not do performance testing or block releases based on their results. You just need to justify it. Do you have a specific part of the REP in mind that is problematic? |
|
Quote:
On what basis can we justify that this package should not test for performance degradation? This patch describes FWIW I'm playing devil's advocate here on purpose, trying to make the same inquires a third-party that's interested in this Quality Declaration would make. Specially because IIUC this patch will be adapted to many other packages. Do not block on my concerns if you think that's not to worry about or that it is sufficient as it stands. |
|
I think the core argument is that the answer to "if performance is a reasonable concern for use in a production system" is "no" for this package due to the way it is implemented and the places it is used. |
|
Ok, fair enough. Let's hinge on this is not likely to be part of a critical path in runtime. |
wjwwood
left a comment
There was a problem hiding this comment.
I think this justification is ok, the two main takeaways I get are this:
ament_index_cppdoes not need performance tests because:- the algorithms implemented here are not at high risk for performance degradation
- (more importantly)
ament_index_cppis not used in performance critical parts of the ecosystem right now
If in the future new algorithms are introduced, the first point could be reviewed, and if in the future we start using this in performance critical parts of the stack then we can reconsider it.
I hope the current statement: covers the important reason well enough. Thanks for the feedback and discussion! |
This is a first draft for a rational why performance tests aren't necessary for
ament_index_*packages.