feat(medcat): CU-869cgny1k Add pipe speed options #369
Merged
alhendrickson
approved these changes
Mar 17, 2026
alhendrickson
left a comment
Looks great, thanks!
This PR adds a few options for inspecting the speed of the pipe.
The idea is to add them in a way that doesn't interfere with regular inference, i.e. something with zero overhead during normal operation that still runs close enough to regular use cases for the timings to be representative.
The added methods are:
- medcat.pipeline.speed_utils.pipeline_per_doc_timer
- medcat.pipeline.speed_utils.pipeline_timer_averaging_docs
- medcat.pipeline.speed_utils.profile_pipeline_component
The idea is that you first use one of the first two to figure out which component is taking the most time.
You then profile that component to figure out why it's slower than expected (if that is the case).
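As a rough illustration of that two-step workflow, here is a generic sketch using only the standard library. It is not the `speed_utils` API: `tokenize`, `link`, `time_components` and `profile_component` are hypothetical names, and the real methods may have different signatures and behaviour.

```python
import cProfile
import io
import pstats
import time
from collections import defaultdict


def tokenize(text):
    # Hypothetical stand-in for a cheap pipeline component.
    return text.split()


def link(tokens):
    # Hypothetical stand-in for a more expensive pipeline component.
    return [t for t in tokens if sum(ord(c) for c in t) % 2 == 0]


def time_components(components, docs):
    """Step 1: average wall-clock seconds per document for each component."""
    totals = defaultdict(float)
    for doc in docs:
        data = doc
        for name, func in components:
            start = time.perf_counter()
            data = func(data)
            totals[name] += time.perf_counter() - start
    return {name: total / len(docs) for name, total in totals.items()}


def profile_component(components, target, doc):
    """Step 2: run the pipeline up to `target`, then profile only that component."""
    data = doc
    for name, func in components:
        if name == target:
            profiler = cProfile.Profile()
            profiler.runcall(func, data)
            stream = io.StringIO()
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats("cumulative").print_stats(10)
            return stream.getvalue()
        data = func(data)  # earlier components run unprofiled
    raise KeyError(target)


components = [("tokenize", tokenize), ("link", link)]
docs = ["patient presents with chest pain"] * 100

averages = time_components(components, docs)
slowest = max(averages, key=averages.get)
report = profile_component(components, slowest, docs[0])
```

Because the timing and profiling wrap the components from the outside, nothing changes in the components themselves, which is the same "no overhead in normal operation" property the PR is aiming for.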
EDIT:
I've also added an option to pass your own timer into pipeline_per_doc_timer for more customisability. The timer can be anything that follows the same base interface, so this approach could effectively be used to replace the other 2 methods as well.
Example code / usage
Example output