DOC: Add Karl Pearson's reference to chi-square test by nightvision04 · Pull Request #13971 · scipy/scipy

nightvision04 · 2021-05-02T05:53:03Z

Pearson's original paper was important and I think it's an improvement to list it among chi-square test references.

Reference issue

What does this implement/fix?

Added Pearson's landmark paper to the chi-square references.

tirthasheshpatel · 2021-05-02T06:19:22Z

> I often work with chi-square tests and found it surprising that Pearson's paper wasn't found among the references. Is this something we can add?

The references section contains papers/articles which the author has referred to, to either write the code or the documentation. In this case, it seems like the author didn't refer to the paper you have mentioned to do either of those. If you think some part of the documentation or code can be improved using your reference, it would be nice to add that too. For example, you could add some documentation to explain the chi-squared test using your reference and cite the paper there.

nightvision04 · 2021-05-02T07:20:10Z

> The references section contains papers/articles which the author has referred to, to either write the code or the documentation. In this case, it seems like the author didn't refer to the paper you have mentioned to do either of those. If you think some part of the documentation or code can be improved using your reference, it would be nice to add that too. For example, you could add some documentation to explain the chi-squared test using your reference and cite the paper there.

@tirthasheshpatel Great suggestion. I've included an explanation of when low frequencies warrant using Fisher's Exact test as a higher-power alternative. I've also added the minimum count of 13 described in the original chi-square paper.

tirthasheshpatel · 2021-05-02T07:44:06Z

Also, you have opened this PR from your master branch, which is not good practice. In the future, please propose changes from a feature branch. Thanks!

Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com>

nightvision04 · 2021-05-02T20:53:46Z

Looks like the build agent for Windows 64-bit is running exceptionally slowly, causing one of the checks to fail. I do not have dev.azure.com credentials to re-run failed builds.

How would you like me to proceed?

tirthasheshpatel

LGTM now. Thanks, @nightvision04! (As this is a documentation change, I don't think the awaiting workflows are necessary before merging)

tupui

LGTM as well. I checked the documentation build so no need for extra CI. I am merging then. Thanks @nightvision04 for contributing! And thanks @tirthasheshpatel for the review.

* master: (164 commits)
  DOC: Add Karl Pearson's reference to chi-square test (scipy#13971)
  BLD: fix build warnings for causal/anticausal pointers in ndimage
  MAINT: stats: Fix unused imports and a few other issues related to imports.
  DOC: fix typo
  MAINT: Remove duplicate calculations in sokalmichener
  BUG: spatial: fix weight handling of `distance.sokalmichener`.
  DOC: update Readme (scipy#13910)
  MAINT: QMCEngine d input validation (scipy#13940)
  MAINT: forward port 1.6.3 relnotes
  REL: add PEP 621 (project metadata in pyproject.toml) support
  EHN: signal: make `get_window` supports `general_cosine` and `general_hamming` window functions. (scipy#13934)
  ENH/DOC: pydata sphinx theme polishing (scipy#13814)
  DOC/MAINT: Add copyright notice to qmc.primes_from_2_to (scipy#13927)
  BUG: DOC: signal: fix need argument config and add missing doc link for `signal.get_window`
  DOC: fix subsets docstring (scipy#13926)
  BUG: signal: fix get_window argument handling and add tests. (scipy#13879)
  ENH: stats: add 'alternative' parameter to ansari (scipy#13650)
  BUG: Reactivate conda environment in init
  MAINT: use dict built-in rather than OrderedDict
  Revert "CI: Add nightly release of NumPy in linux workflows (scipy#13876)" (scipy#13909)
  ...

josef-pkt · 2021-05-03T19:49:26Z

"If one or more frequencies are less than 5, Fisher's Exact Test can be used with greater statistical power."

I don't think that's true. Fisher's exact test in general is very conservative and doesn't have large power.
I think Pearson's chi-square and similar tests would overreject, but I'm not completely sure about chi-square. (The Wald test in 2 by 2 tables strongly overrejects in small samples.)

Besides, Fisher's exact test is for 2 by 2 tables, while Pearson's chi-square test is the same for an arbitrary number of components.
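
To make the difference in scope concrete, here is a minimal sketch that runs `scipy.stats.chisquare` on a one-way table with four categories and `scipy.stats.fisher_exact` on a 2 by 2 table. The counts are invented purely for illustration and do not come from this PR.

```python
import numpy as np
from scipy.stats import chisquare, fisher_exact

# Hypothetical one-way table: observed counts in four categories.
observed = np.array([18, 22, 25, 15])
# With no f_exp given, chisquare tests against equal expected frequencies.
chi2_stat, chi2_p = chisquare(observed)

# Hypothetical 2 by 2 contingency table with small counts,
# the case Fisher's exact test was designed for.
table = np.array([[3, 7],
                  [8, 2]])
odds_ratio, fisher_p = fisher_exact(table, alternative="two-sided")

print(f"chisquare:    statistic={chi2_stat:.3f}, p={chi2_p:.3f}")
print(f"fisher_exact: odds ratio={odds_ratio:.3f}, p={fisher_p:.3f}")
```

The two calls answer different questions (goodness of fit for k categories versus association in a 2 by 2 table), so one is not a drop-in replacement for the other.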

tirthasheshpatel · 2021-05-04T07:25:23Z

> I don't think that's true. Fisher's exact test in general is very conservative and doesn't have large power.
> I think Pearson's chi-square and similar tests would overreject, but I'm not completely sure about chi-square. (The Wald test in 2 by 2 tables strongly overrejects in small samples.)

I didn't verify the "with greater statistical power" part. Sorry! I think you are right here. It's better not to comment about it unless we are absolutely sure.

> Besides, Fisher's exact test is for 2 by 2 tables, while Pearson's chi-square test is the same for an arbitrary number of components.

I agree, but would you agree with reformulating the docs to say that the Fisher Exact test is more suited for small sample sizes? (I think that is more widely accepted and is also stated on the Wikipedia page.)

I can do a partial revert of this in a new PR if you agree. Otherwise, I can do a full revert of that statement. Thanks very much for verifying this! Feel free to share other thoughts that you may have.

josef-pkt · 2021-05-04T15:52:14Z

Fisher's exact test maintains size, that is, the type I error rate is at most alpha (e.g. 0.05). However, because the sample space is discrete, it can be far below the significance level.

Some references recommend tests that maintain size (rejection rate approximately equal to alpha) on average instead of always. That makes the test less conservative, but it overrejects in some cases.

I think the chi-square test becomes liberal, similarly to the Wald test, in small samples.
But I don't have a reference for the small-sample performance of Pearson's chi-square test. My readings were mostly on hypothesis tests for one-sample and two-sample proportions.
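
One way to check these size claims for a specific case is to enumerate every possible outcome instead of relying on a reference. The sketch below is only an illustration under stated assumptions: two independent binomial samples of equal size `n` with a common success probability `p` (both values chosen arbitrarily), and `chi2_contingency` without continuity correction standing in for Pearson's test on the 2 by 2 table. It is not the one-way `chisquare` function this PR documents, and `exact_size` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import binom, chi2_contingency, fisher_exact


def exact_size(n=10, p=0.3, alpha=0.05):
    """Exact rejection rate under H0 (equal proportions p) for two tests.

    Enumerates every outcome (x1, x2) of two independent Binomial(n, p)
    samples and sums the probability of the outcomes each test rejects.
    """
    pmf = binom.pmf(np.arange(n + 1), n, p)
    size_fisher = 0.0
    size_chi2 = 0.0
    for x1 in range(n + 1):
        for x2 in range(n + 1):
            prob = pmf[x1] * pmf[x2]
            table = [[x1, n - x1], [x2, n - x2]]
            # Fisher's exact test (two-sided by default).
            _, p_fisher = fisher_exact(table)
            if p_fisher <= alpha:
                size_fisher += prob
            # Pearson's statistic is undefined when a column total is zero;
            # treat those tables as non-rejections (an assumption made here).
            if 0 < x1 + x2 < 2 * n:
                p_chi2 = chi2_contingency(table, correction=False)[1]
                if p_chi2 <= alpha:
                    size_chi2 += prob
    return size_fisher, size_chi2


size_fisher, size_chi2 = exact_size()
print(f"Fisher exact size:       {size_fisher:.4f}")
print(f"Pearson chi-square size: {size_chi2:.4f}")
```

Varying `n` and `p` shows how far below alpha Fisher's test sits and whether the uncorrected chi-square test exceeds alpha for a given configuration.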

> I agree but would you agree with reformulating the docs to say that the Fisher Exact test is more suited for small sample sizes.

It's not clear whether a very conservative test is very useful either, although still better than very liberal.

Maybe make the statement a bit broader and say that "exact tests, such as Fisher's exact test" are recommended in small samples because they do not overreject.

Barnard's test is an unconditional exact test, and there are a few alternatives that are approximately exact and those could be preferred to Fisher's exact test because they are less conservative in small samples.
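
For readers who want to compare the two exact tests on the same data, here is a minimal sketch. It assumes SciPy 1.7 or later, where `scipy.stats.barnard_exact` is available, and uses an invented 2 by 2 table. Per the SciPy documentation for `barnard_exact`, each column is treated as one binomial sample.

```python
import numpy as np
from scipy.stats import barnard_exact, fisher_exact

# Hypothetical small-sample table; each column is one binomial sample.
table = np.array([[2, 7],
                  [8, 3]])

# Conditional exact test (conditions on both margins).
_, fisher_p = fisher_exact(table, alternative="two-sided")

# Unconditional exact test (maximises over the nuisance parameter).
barnard_res = barnard_exact(table, alternative="two-sided")

print(f"Fisher exact p-value:  {fisher_p:.4f}")
print(f"Barnard exact p-value: {barnard_res.pvalue:.4f}")
```

The sketch only prints both p-values side by side; whether Barnard's test is less conservative for a particular design is something to check case by case.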

nightvision04 · 2021-05-04T18:14:48Z

> Maybe make the statement a bit broader and say that "exact tests, such as Fisher's exact test" are recommended in small samples because they do not overreject.

Thank you for this correction. I could have worded it softer and included this motivation.

tirthasheshpatel · 2021-05-05T04:08:06Z

> Maybe make the statement a bit broader and say that "exact tests, such as Fisher's exact test" are recommended in small samples because they do not overreject.
>
> Barnard's test is an unconditional exact test, and there are a few alternatives that are approximately exact and those could be preferred to Fisher's exact test because they are less conservative in small samples.

Makes sense. Thanks! I will rephrase this in a new PR unless @nightvision04 wants to take that up.

> I could have worded it softer and included this motivation.

If you want, you could submit another PR rephrasing the statement as @josef-pkt said. Would you be willing to do that?

nightvision04 · 2021-05-05T16:49:36Z

Sure can, @tirthasheshpatel. I'll have it up in the next few days.

nightvision04 added 2 commits May 1, 2021 23:46

Added Karl Pearson's reference to chi-square test

a839cd1

Pearson's original paper was important and I think it's an improvement to list it among chi-square test references.

Updated reference

1daff97

nightvision04 changed the title ~~Added Karl Pearson's reference to chi-square test~~ ENH: Added Karl Pearson's reference to chi-square test May 2, 2021

nightvision04 changed the title ~~ENH: Added Karl Pearson's reference to chi-square test~~ ENH: Add Karl Pearson's reference to chi-square test May 2, 2021

nightvision04 changed the title ~~ENH: Add Karl Pearson's reference to chi-square test~~ DOC: Add Karl Pearson's reference to chi-square test May 2, 2021

Add test alternative if frequencies are low

d8a0b94

Replace en-dash with em-dash in reference (utf-8)

22d459e

tirthasheshpatel reviewed May 2, 2021

Comment thread scipy/stats/stats.py Outdated

tirthasheshpatel reviewed May 2, 2021

Comment thread scipy/stats/stats.py Outdated

tylerjereddy added the Documentation label (Issues related to the SciPy documentation. Also check https://github.com/scipy/scipy.org) May 2, 2021

nightvision04 and others added 2 commits May 2, 2021 12:45

Update stats.py to recommend n >13

0139387

Update scipy/stats/stats.py

fdfd0c1

Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com>

Adjusted line return.

d0f0694

tirthasheshpatel approved these changes May 3, 2021

tirthasheshpatel added the scipy.stats label May 3, 2021

ilayn added this to the 1.7.0 milestone May 3, 2021

tupui approved these changes May 3, 2021

tupui merged commit 6e5c9ee into scipy:master May 3, 2021

nightvision04 mentioned this pull request May 9, 2021

DOC: Broaden Exact Test Reference #14012

Merged
