DOC Adds feature_names_in_ to docstrings#20787
DOC Adds feature_names_in_ to docstrings#20787lorentzenchr merged 17 commits intoscikit-learn:mainfrom
Conversation
There was a problem hiding this comment.
I have locally extended check_dataframe_column_names_consistency to also check the docstring of the estimator when feature_names_in_ and found the following classes that miss it:
dev ❯ pytest sklearn/tests/test_common.py -v -k test_pandas_column_name_consistency | grep FAILED
sklearn/tests/test_common.py::test_pandas_column_name_consistency[CategoricalNB()] FAILED [ 7%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[EmpiricalCovariance()] FAILED [ 14%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[MaxAbsScaler()] FAILED [ 49%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[Perceptron()] FAILED [ 71%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[QuantileRegressor()] FAILED [ 74%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[SGDOneClassSVM()] FAILED [ 83%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[SelectFpr()] FAILED [ 87%]
sklearn/tests/test_common.py::test_pandas_column_name_consistency[SplineTransformer()] FAILED [ 96%]
dev ❯ git diff
diff --git a/sklearn/utils/estimator_checks.py b/sklearn/utils/estimator_checks.py
index 8a51d8768..a7f135213 100644
--- a/sklearn/utils/estimator_checks.py
+++ b/sklearn/utils/estimator_checks.py
@@ -3702,6 +3702,12 @@ def check_dataframe_column_names_consistency(name, estimator_orig):
)
assert_array_equal(estimator.feature_names_in_, names)
+ if "feature_names_in_" not in estimator.__doc__:
+ raise ValueError(
+ "Estimator does not document its feature_names_in_ "
+ "attribute"
+ )
+
check_methods = []
for method in (
"predict",I am not sure if we should commit this extension to check the docstring as part of check_dataframe_column_names_consistency.
I agree it would be good to have a docstring check. I updated this PR with a check that only does the docstring test for sklearn modules: # Only check sklearn estimators for feature_names_in_ in docstring
if estimator_orig.__module__.startswith("sklearn.") and (
"feature_names_in_" not in estimator.__doc__
):
raise ValueError("Estimator does not document its feature_names_in_ attribute") |
There was a problem hiding this comment.
LGTM! Thanks very much @thomasjpfan.
I just pushed 2 commits to allow test estimators without docstrings and tell codecov that it's expected that we don't raise the exception when testing scikit-learn.
|
@lorentzenchr I think it would be great to have this merged to benefit from the extended estimator check when working on the remaining modules in other PRs. |
|
Since I merged some related PRs in |
|
\o/ |
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787)
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787) uncompromisingly reformat plot_oneclass_vs_svdd with black
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787) uncompromisingly reformat plot_oneclass_vs_svdd with black
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787) uncompromisingly reformat plot_oneclass_vs_svdd with black
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787) uncompromisingly reformat plot_oneclass_vs_svdd with black
…_new add backticks (scikit-learn#20914), deprecate **params in fit (scikit-learn#20843), add feature_names_in_ (scikit-learn#20787) uncompromisingly reformat plot_oneclass_vs_svdd with black
Reference Issues/PRs
Follow up to #18010
What does this implement/fix? Explain your changes.
This PR adds
feature_names_in_to the docstrings of estimators that already have them. (I place the new entry undern_features_in_.Any other comments?