MAINT: call model.fit directly if ndarrays passed to partial_fit#225
MAINT: call model.fit directly if ndarrays passed to partial_fit#225stsievert wants to merge 4 commits intodask:masterfrom
Conversation
|
I'm using |
TomAugspurger
left a comment
There was a problem hiding this comment.
I'll need to better digest these changes, and their motivation.
An alternative is to pass compute=False for ndarrays, extract the dask graph of the Delayed, and merge it with the caller's other stuff before doing compute.
| x = da.from_array(x, chunks=x.shape) | ||
| if isinstance(y, np.ndarray): | ||
| y = da.from_array(y, chunks=y.shape) | ||
| if isinstance(x, np.ndarray) and isinstance(y, np.ndarray): |
There was a problem hiding this comment.
It's possible that just X is passed, and y is None.
| if isinstance(y, np.ndarray): | ||
| y = da.from_array(y, chunks=y.shape) | ||
| if isinstance(x, np.ndarray) and isinstance(y, np.ndarray): | ||
| if hasattr(model, 'estimator'): |
There was a problem hiding this comment.
This seems fragile. Why not go through the model?
|
What tests do we want to pass? I see two cases, and these are reflected in the types passed to
The current implementation implements option 1 in #224. I'd rather be explicit about types and go with option 2. Then I think we should give a better error message if ndarrays are passed, and point to-be-implemented Thoughts? |
|
2 sounds better to me as well. |
|
#234 moves this to Incremental.partial_fit, which is what I was trying to do. |
If
partial_fitis called with NumPy arrays, why don't we callpartial_fitdirectly with those arrays instead of making a new dask array?I think this is the bug behind #221:
_partial_fitwould needlessly launch dask tasks.