DOC: clarify missing-value handling in pandas and NumPy reductions #65441
The failing check appears to be a CI environment/setup issue rather than a docs failure. The job failed in “Post Set up Conda” with:
ENOENT: no such file or directory, lstat '/home/runner/work/_temp/setup-micromamba/micromamba-shell'
Local validation passed:
rhshadrach left a comment
This looks good, but to close out the issue we'd also need to update the API docs for DataFrame.std to read something like:
Notes
-----
To have the same behaviour as ``numpy.std``, use ``ddof=0`` (instead of the
default ``ddof=1``) and ``skipna=False``.
Note I'm also fixing the single backticks here.
In addition, I think we should add this note to all std methods (there are many; you can search the codebase for "def std(").
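For reference, the behaviour described in the proposed note can be checked with a short sketch like the one below. The sample values and variable names are illustrative assumptions, not taken from the PR; only `Series.std`, `numpy.std`, and the `ddof`/`skipna` parameters come from the discussion.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption for this sketch, not from the PR).
with_nan = pd.Series([1.0, 2.0, np.nan, 4.0])
clean = pd.Series([1.0, 2.0, 4.0])

# pandas defaults: skipna=True and ddof=1, so the NaN is ignored and the
# sample standard deviation of the remaining values is returned.
pandas_default = with_nan.std()

# NumPy-like settings: population standard deviation, NaN propagates.
pandas_numpy_like = with_nan.std(ddof=0, skipna=False)
numpy_result = np.std(with_nan.to_numpy())

# Without missing values, ddof=0 alone is enough to match numpy.std.
matches_on_clean = np.isclose(clean.std(ddof=0), np.std(clean.to_numpy()))

print(pandas_default, pandas_numpy_like, numpy_result, matches_on_clean)
```

With these inputs, both the NumPy-like pandas call and `numpy.std` should give `nan`, while the pandas default gives a finite number.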
<api.series.stats>` and :ref:`here <api.dataframe.stats>`) all
account for missing data.

This default differs from many NumPy reduction functions. For example,
I don't think it's clear here what "This default" refers to. Maybe change to "The default behavior"?
42f5034 to 4fe7196
@rhshadrach Thanks for the review. I addressed the requested changes by:
Validation:
Thanks @praneethhere
Closes #56939.
This PR adds a short note to the missing-data user guide explaining that pandas reductions skip missing values by default, while NumPy reductions such as numpy.std return nan when the input contains nan values.
The change is documentation-only and does not modify pandas behavior.
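As a rough illustration of the difference the note describes, the sketch below compares a pandas reduction with its NumPy counterparts on a small frame. The DataFrame contents and variable names are assumptions made for this example; `numpy.nanmean` is shown only as NumPy's own NaN-ignoring variant and is not part of the PR text.

```python
import numpy as np
import pandas as pd

# Illustrative frame (an assumption for this sketch); column "a" has one NaN.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# pandas reductions skip missing values by default (skipna=True).
pandas_mean = df.mean()

# Passing skipna=False makes pandas propagate NaN, as plain NumPy does.
pandas_strict_mean = df.mean(skipna=False)

# Plain NumPy reductions return nan when the input contains nan.
numpy_mean_a = np.mean(df["a"].to_numpy())

# NumPy's nan-aware variant ignores the missing value instead.
numpy_nanmean_a = np.nanmean(df["a"].to_numpy())

print(pandas_mean["a"], pandas_strict_mean["a"], numpy_mean_a, numpy_nanmean_a)
```

Here the default pandas mean of column "a" agrees with `numpy.nanmean`, while `numpy.mean` and the `skipna=False` call both yield `nan`.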
Validation: