[SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true #32690

Closed
HyukjinKwon wants to merge 2 commits into apache:master from HyukjinKwon:SPARK-35510

@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true, which was disabled when we upgraded to Python 3.9 in CI in #32657.

This seems to be caused by a behaviour change in the latest NumPy; see also https://github.com/numpy/numpy/pull/16273#discussion_r641264085.

pandas inherits this behaviour, but it doesn't make sense when numeric_only is set to True in pandas. I will track and follow the status of the issue between pandas and NumPy.

For the time being, I propose excluding only the boolean case from the percentile/quartile test case.
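As a rough illustration of that workaround (not the actual test code in this PR), the sketch below keeps boolean columns for ordinary statistics but skips them for a percentile-style one. The column data, the helper names, and the `include_bool` flag are all hypothetical, and the standard-library `statistics` module stands in for pandas and pandas-on-Spark.

```python
from statistics import mean, median

# Hypothetical stand-in for the test data: column name -> values.
COLUMNS = {
    "i": [1, 2, 3, 4],
    "f": [0.5, 1.5, 2.5, 3.5],
    "b": [True, False, True, True],
    "s": ["a", "b", "c", "d"],
}


def is_bool_column(values):
    return all(isinstance(v, bool) for v in values)


def is_numeric_column(values):
    # bool is a subclass of int in Python, so booleans count as numeric here.
    return all(isinstance(v, (int, float)) for v in values)


def numeric_only_stat(columns, func, include_bool=True):
    """Apply func to each numeric column, optionally skipping booleans."""
    result = {}
    for name, values in columns.items():
        if not is_numeric_column(values):
            continue  # non-numeric columns are always discarded
        if not include_bool and is_bool_column(values):
            continue  # assumed temporary exclusion for percentile-style stats
        result[name] = func(values)
    return result


# Ordinary statistics still cover the boolean column.
mean_result = numeric_only_stat(COLUMNS, mean)
# The percentile-style statistic (median, i.e. the 50th percentile) skips it.
median_result = numeric_only_stat(COLUMNS, median, include_bool=False)
```

The point of the flag is that coverage for every other statistic stays intact, and only the percentile-style check drops the boolean column until the upstream behaviour settles.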

Why are the changes needed?

To keep the test coverage.

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

I only tested this roughly locally, but it should pass in CI.

@HyukjinKwon
Member Author

cc @xinrong-databricks and @itholic too fyi

@SparkQA

SparkQA commented May 28, 2021

Test build #139046 has finished for PR 32690 at commit 22780dd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 28, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43566/

Contributor

@itholic itholic left a comment

LGTM if tests pass

@SparkQA

SparkQA commented May 28, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43566/

Member

@viirya viirya left a comment

Looks good. Maybe create a JIRA and add to the code comment?

@SparkQA

SparkQA commented May 28, 2021

Test build #139053 has finished for PR 32690 at commit ad9b111.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Merged to master.

@SparkQA

SparkQA commented May 28, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43574/

@SparkQA

SparkQA commented May 28, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43574/

@HyukjinKwon HyukjinKwon deleted the SPARK-35510 branch January 4, 2022 00:53

4 participants