[MRG] Added variable_name as a parameter for check_array#8178
[MRG] Added variable_name as a parameter for check_array#8178dalmia wants to merge 10 commits intoscikit-learn:masterfrom
Conversation
jnothman
left a comment
There was a problem hiding this comment.
Where variables are the result of computation, check_array is used to recast into a particular dtype or order, and/or to ensure finiteness. Perhaps having a variable name in these cases is a distraction in the code, but I'm okay with it.
Note that the alternative is to remove X from error messages (or make the fact that it's merely a placeholder clearer), and expect the user to read the traceback carefully. I personally think this is the right way to go, personally.
Arguably the same issue applies to column_or_1d.
|
|
||
| """ | ||
| covariance = check_array(covariance) | ||
| covariance = check_array(covariance, variable_name='covariance') |
There was a problem hiding this comment.
This value is not provided by the user, and it's a bit of a mystery to me that check_array is here at all
There was a problem hiding this comment.
Removing the variable_name argument from here for now.
sklearn/decomposition/nmf.py
Outdated
|
|
||
| def _check_init(A, shape, whom): | ||
| A = check_array(A) | ||
| A = check_array(A, variable_name='A') |
There was a problem hiding this comment.
the variable name is actually included in whom (see calls to the function, below). Replace whom with variable_name.
| _, _, X_offset, _, X_scale = _preprocess_data(X, y, fit_intercept, | ||
| normalize, | ||
| return_mean=True) | ||
| normalize, |
There was a problem hiding this comment.
We usually avoid fixing incidental things. It makes reviewing PRs harder.
| if check_input: | ||
| precompute = check_array(precompute, dtype=X.dtype.type, | ||
| order='C') | ||
| order='C', variable_name='precompute') |
There was a problem hiding this comment.
precompute is not user-provided, but this is okay.
sklearn/mixture/gaussian_mixture.py
Outdated
| """ | ||
| weights = check_array(weights, dtype=[np.float64, np.float32], | ||
| ensure_2d=False) | ||
| ensure_2d=False, variable_name='weights') |
There was a problem hiding this comment.
The user sees this variable as "weights_init"
sklearn/mixture/gaussian_mixture.py
Outdated
| """ | ||
| means = check_array(means, dtype=[np.float64, np.float32], ensure_2d=False) | ||
| means = check_array(means, dtype=[np.float64, np.float32], ensure_2d=False, | ||
| variable_name='means') |
sklearn/mixture/gaussian_mixture.py
Outdated
| ensure_2d=False, | ||
| allow_nd=covariance_type == 'full') | ||
| allow_nd=covariance_type == 'full', | ||
| variable_name='precisions') |
|
I agree on making the error message independent of the And maybe for |
|
Should we go ahead with this? @jnothman |
|
ping @jnothman |
|
|
||
| dictionary = check_array(dictionary.T, order='F', dtype=np.float64, | ||
| copy=False) | ||
| copy=False, variable_name='dictionary') |
There was a problem hiding this comment.
Remove variable_name here as it's not seen by the user.
sklearn/metrics/classification.py
Outdated
| # Check if dimensions are consistent. | ||
| transformed_labels = check_array(transformed_labels) | ||
| transformed_labels = check_array(transformed_labels, | ||
| variable_name='transformed_labels') |
There was a problem hiding this comment.
The user can't see this.
sklearn/utils/multiclass.py
Outdated
| if (label_type == "multilabel-indicator" and | ||
| len(set(check_array(y, ['csr', 'csc', 'coo']).shape[1] | ||
| len(set(check_array(y, ['csr', 'csc', 'coo'], | ||
| variable_name='y').shape[1] |
There was a problem hiding this comment.
The user doesn't see it as y.
a36c57b to
57bf914
Compare
|
I think that we shouldn't leave the responsibility of figuring out what the error is, to the user and that trying to come up with a general description of the error might not be the right thing to do. On the other hand, |
Codecov Report
@@ Coverage Diff @@
## master #8178 +/- ##
=======================================
Coverage 94.75% 94.75%
=======================================
Files 342 342
Lines 60813 60813
=======================================
Hits 57621 57621
Misses 3192 3192
Continue to review full report at Codecov.
|
|
Thanks for all the work, though I think the proposal in #8178 (comment) that we avoid mentioning the variable by name in error messages might be the better way to go. |
|
Approach of #8178 (comment) adopted in #9068 |
What does this implement/fix? Explain your changes.
This adds
variable_nameas a parameter forcheck_arrayto be used as a placeholder for error messages. Related files usingcheck_arrayon instances passed as a parameter have also been updated.Any other comments?
Discussed in #7996