Conversation
Oh cool. I'm curious to see what comes here. My sense (correct me if I'm wrong) is that we probably don't want to annotate every test in order to support GPUs. I'm thinking that this is just a temporary thing as you're playing around. If not then let's chat.
Right, I'd rather not annotate everything, and I'm definitely still exploring here. It isn't difficult to parameterize the tests with a fixture like this:

```python
import pytest

try:
    import cudf
except ImportError:
    cudf = None

@pytest.fixture(
    params=[
        "pandas",
        pytest.param("cudf", marks=pytest.mark.skipif(cudf is None, reason="cudf not found.")),
    ]
)
def backend(request):
    yield request.param
```

I'm thinking this would allow us to skip (maybe …)
Update:
```python
# Import backend DataFrame library to test
BACKEND = config.get("dataframe.backend", "pandas")
lib = importlib.import_module(BACKEND)
```
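For context, here is a minimal, self-contained sketch of that switch, using an environment variable in place of dask's `config.get` (the `get_backend` helper name is hypothetical, introduced only for this illustration):

```python
import importlib
import os

def get_backend(name=None):
    """Return the DataFrame library to test against.

    Falls back to the DASK_DATAFRAME__BACKEND environment variable,
    then to "pandas". (The snippet above reads
    config.get("dataframe.backend", "pandas") instead.)
    """
    name = name or os.environ.get("DASK_DATAFRAME__BACKEND", "pandas")
    return importlib.import_module(name)
```

Tests can then build frames via `get_backend().DataFrame(...)`, and the whole module flips to cudf by exporting `DASK_DATAFRAME__BACKEND=cudf`.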
@phofl - Any opinions on using this approach as a "switch" to test the cudf backend?
Certainly interested in your thoughts as well @mrocklin
I think this is ok, although it will take me some time to get used to it😂
traveling today, will take a closer look at the whole pr when I am back
```python
if self._required_attribute:
    dep = next(iter(self.dependencies()))._meta
    if not hasattr(dep, self._required_attribute):
        # Raise a ValueError instead of AttributeError to
        # avoid infinite recursion
        raise ValueError(f"{dep} has no attribute {self._required_attribute}")
```

```python
@property
def _required_attribute(self) -> str:
    # Specify if the first `dependency` must support
    # a specific attribute for valid behavior.
    return None
```
@phofl - I ran into quite a few cases where the cudf backend was hanging (rather than failing quickly) when a `cudf.DataFrame`/`Series` did not support the same attribute as a `pd.DataFrame`/`Series` (e.g. `nbytes`, `align`, etc.).
What do you think about doing something like this so that we can detect more quickly that a dependency is missing a necessary attribute?
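For illustration, here is a standalone sketch of that fail-fast pattern; the `Expr` and `NBytes` classes below are hypothetical stand-ins, not dask-expr's actual classes:

```python
class Expr:
    # Subclasses may name an attribute that the first dependency's
    # metadata must support for the expression to be valid.
    _required_attribute = None

    def __init__(self, meta):
        self._meta = meta

    def dependencies(self):
        # Hypothetical: a real expression graph would return child
        # expressions here; this sketch just returns itself.
        return [self]

    def validate(self):
        # Raise ValueError rather than AttributeError so that a missing
        # attribute fails fast, instead of feeding __getattr__ fallbacks
        # that can recurse (or hang) further downstream.
        if self._required_attribute:
            dep = next(iter(self.dependencies()))._meta
            if not hasattr(dep, self._required_attribute):
                raise ValueError(f"{dep} has no attribute {self._required_attribute}")


class NBytes(Expr):
    # This expression only works when the backend object exposes `nbytes`.
    _required_attribute = "nbytes"
```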
thx!
woohoo
At the moment, I'm thinking that I should just configure a `DASK_DATAFRAME__BACKEND` environment variable (with a default of `"pandas"`) to set the backend library in `test_io.py` and `test_collection.py`. This approach can be expanded incrementally throughout the test suite without changing much code. I added skip/xfail statements for cudf in a few places, but we don't need to do this for all tests that fail with cudf (it is valuable to be able to "run" tests with a cudf backend either way).