We recently added fetch_openml(..., as_frame=True). We should allow the user to specify as_frame='auto', which would return X as an array if all attributes are numeric, and as a DataFrame if any are strings or nominal/categorical.
This should eventually be the default value. (Categoricals might not be handled appropriately by scikit-learn estimators, so this may be risky. Given the "experimental" nature of fetch_openml, the core devs will also have to decide whether it's acceptable to break backwards compatibility with the current as_frame=False default.)
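
To make the proposed rule concrete, here is a minimal sketch of how as_frame='auto' might be resolved. The helper name resolve_as_frame, the NUMERIC_TYPES set, and the idea of passing a list of per-attribute type labels are all assumptions for illustration. None of this is existing scikit-learn API, and the real implementation may look quite different.

```python
# Hypothetical helper, NOT part of scikit-learn: illustrates the proposed rule.
# Assumption: each attribute carries a type label such as "numeric",
# "nominal", or "string" (as in ARFF-style metadata).
NUMERIC_TYPES = {"numeric", "real", "integer"}


def resolve_as_frame(as_frame, attribute_types):
    """Return True if X should be a DataFrame, False if it should be an array."""
    if as_frame != "auto":
        # Explicit True/False from the user is respected unchanged.
        return bool(as_frame)
    # 'auto': use a DataFrame as soon as any attribute is non-numeric.
    return any(t.lower() not in NUMERIC_TYPES for t in attribute_types)


if __name__ == "__main__":
    print(resolve_as_frame("auto", ["numeric", "numeric"]))  # False -> array
    print(resolve_as_frame("auto", ["numeric", "nominal"]))  # True -> DataFrame
```

With this rule, all-numeric datasets would keep returning arrays, so only datasets with string or nominal attributes would see a change in return type if 'auto' became the default.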