In metrics, only show cumulative number of page faults#7482
Merged
Conversation
timvisee
commented
Oct 30, 2025
5d28877 to
7e160bb
Compare
7e160bb to
413b91a
Compare
timvisee
commented
Oct 31, 2025
f77cc9a to
faadcc3
Compare
timvisee
commented
Oct 31, 2025
b1b7ce1 to
faa7132
Compare
JojiiOfficial
approved these changes
Nov 5, 2025
Merged
xzfc
approved these changes
Nov 10, 2025
src/common/metrics.rs
Outdated
Comment on lines
+1004
to
+1006
| .tasks()? | ||
| .filter_map(|task| { | ||
| let children = task.and_then(|task| task.children()); |
Member
There was a problem hiding this comment.
TIL we can get a list of children PIDs directly via /proc/*/task/*/children.
The classic approach was to read /proc/*/stat for every process and rebuild the process tree by looking at parent PIDs.
The /children approach requires kernel with CONFIG_PROC_CHILDREN. I think it should be enabled practically everywhere on servers, but not sure.
timvisee
added a commit
that referenced
this pull request
Nov 14, 2025
* In metrics, aggregate page faults * Include page fault of active children (recursively) * type alias Pid: i32 * Code improvements * Ignore already terminated processes * Clippy * Also count faults for joined descendant threads in children * Import std::cmp::min * Rework child fault recursion, remove second hash set * Add simple test for iterating child processes * Fix typo Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * Use limit values directly * We don't include the parent PID * Don't allocate hashmap when listing process children * Extend process child PIDs test, fork recursively --------- Co-authored-by: jojii <jojii@gmx.net> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
In metrics, only expose a cumulative number of minor and major page faults. Remove the metrics for page faults in joined children:
In my opinion, we should only expose the above two. I see no point in reporting separate values for children. In fact, it is currently incorrectly implemented, more on that later.
In practice we're only interested in the cumulative number of page faults that happen during the life time of the Qdrant process, in the process itself and in all its children. If the number grows faster than we expect, the cluster may need attention.
The minor/major metric needs to be a sum of:
Actually, Qdrant currently does not even spawn any child processes at all. It means that the child metrics will always be 0 at this time.(Might be wrong of this, but the fact remains that we currently don't report it correctly)But, to make our reporting more future proof I still recommend to implement it properly. For point two we'd have to iterate over all child processes recursively and fetch its page fault stats (left a TODO). Point three is a different number and we've already implemented this.
All Submissions:
devbranch. Did you create your branch fromdev?