-
-
Notifications
You must be signed in to change notification settings - Fork 26.6k
MNT Replaced np.ndarray with memview where applicable in linear_model/_cd_fast.pyx
#23147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
np.ndarray with memview where applicable in linear_model/_cd_fast.pyxnp.ndarray with memview where applicable in linear_model/_cd_fast.pyx
jeremiedbb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to add the wrapper when in the calls of enet_coordinate_descent and enet_coordinate_descent_multitask since we currently have the same error when passing a dense read-only array. The test then can be extended to cover the dense case
| def make_read_only(obj, attr): | ||
| curr_attr = getattr(obj, attr) | ||
| RO_attr = create_memmap_backed_data(curr_attr) | ||
| setattr(obj, attr, RO_attr) | ||
|
|
||
| make_read_only(X, "data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you only make "data" read-only, I think the helper is no longer necessary and you can write it as flat code
For the dense case, don't we modify |
right, my bad. We do support read-only, but only if |
thomasjpfan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am overall +0.5 on this PR. I see ReadonlyArrayWrapper as a quick hack, so that we can use memoryviews in sklearn/_loss/_loss.pyx.tp. I prefer not to extend to use ReadonlyArrayWrapper too much in the library.
As for this PR, I think the tests are worth adding.
jjerphan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @Micky774.
I am also not a fan of ReadonlyArrayWrappers, but if this adds support for read-only datasets, this LTGTM.
To answer your question:
Would it be worth disabling boundscheck for any of these functions? Afaik there's not much of a risk of an out-of-bounds access in these functions.
#21512 centralised all the cython compiler directives. By default, all the checks are deactivated:
scikit-learn/sklearn/_build_utils/__init__.py
Lines 69 to 88 in 5f75acd
| # Additional checks for Cython | |
| cython_enable_debug_directives = ( | |
| os.environ.get("SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES", "0") != "0" | |
| ) | |
| config.ext_modules = cythonize( | |
| config.ext_modules, | |
| nthreads=n_jobs, | |
| compile_time_env={ | |
| "SKLEARN_OPENMP_PARALLELISM_ENABLED": sklearn._OPENMP_SUPPORTED | |
| }, | |
| compiler_directives={ | |
| "language_level": 3, | |
| "boundscheck": cython_enable_debug_directives, | |
| "wraparound": False, | |
| "initializedcheck": False, | |
| "nonecheck": False, | |
| "cdivision": True, | |
| }, | |
| ) |
…del/_cd_fast.pyx` (scikit-learn#23147) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…inear_model/_cd_fast.pyx` (scikit-learn#23147)" This reverts commit 3bb4bad.
…inear_model/_cd_fast.pyx` (scikit-learn#23147)" This reverts commit 3bb4bad.
Reference Issues/PRs
Addresses #10624
What does this implement/fix? Explain your changes.
Replaces
np.ndarraytyping with corresponding memviews insparse_enet_coordinate_descentandenet_coordinate_descent_multi_task.Any other comments?
Would it be worth disabling
boundscheckfor any of these functions? Afaik there's not much of a risk of an out-of-bounds access in these functions.