fix: handling of missing values when dropping rows with outliers #101

Merged
lars-reimann merged 5 commits into main from 7-drop_rows_with_outliers-currently-not-working
Mar 27, 2023

@lars-reimann lars-reimann commented Mar 27, 2023

Closes #7.

Summary of Changes

Previously, calling `drop_rows_with_outliers` on a `Table` that had at least one missing value in a numerical column caused the resulting table to be completely empty. This PR introduces two changes:

  1. Missing values are never considered outliers.
  2. Missing values are ignored when computing the standard deviation.
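
The two rules above can be sketched in plain Python. This is only an illustration, not the code from this PR: the helper names (`is_missing`, `outlier_mask`), the list-of-rows representation, the z-score criterion, the `threshold` default of 3, and the use of the population standard deviation are all assumptions made for the example.

```python
import math
import statistics


def is_missing(value):
    """Treat None and float NaN as missing (assumed definition)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def outlier_mask(column, threshold=3.0):
    """Return one bool per value; True marks an outlier by z-score."""
    # Rule 2: compute mean and standard deviation from present values only.
    present = [v for v in column if not is_missing(v)]
    if len(present) < 2:
        return [False] * len(column)
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    if std == 0:
        return [False] * len(column)
    # Rule 1: a missing value is never flagged as an outlier.
    return [
        (not is_missing(v)) and abs(v - mean) / std > threshold
        for v in column
    ]


def drop_rows_with_outliers(rows, threshold=3.0):
    """Drop every row that has an outlier in any column.

    `rows` is a list of equal-length lists; all columns are assumed numerical.
    """
    if not rows:
        return []
    columns = list(zip(*rows))
    masks = [outlier_mask(col, threshold) for col in columns]
    return [row for i, row in enumerate(rows) if not any(m[i] for m in masks)]


# Twenty ordinary values, one extreme value, and one missing value.
rows = [[10.0] for _ in range(20)] + [[1000.0], [None]]
cleaned = drop_rows_with_outliers(rows)
```

With this input, only the row holding the extreme value is dropped; the row with the missing value is kept and the table is not emptied.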

@lars-reimann lars-reimann requested a review from a team as a code owner March 27, 2023 16:39
@lars-reimann lars-reimann linked an issue Mar 27, 2023 that may be closed by this pull request

lars-reimann commented Mar 27, 2023

🦙 MegaLinter status: ✅ SUCCESS

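| Descriptor | Linter | Files | Fixed | Errors | Elapsed time |
|---|---|---|---|---|---|
| ✅ PYTHON | black | 2 | 0 | 0 | 0.9s |
| ✅ PYTHON | flake8 | 2 | | 0 | 0.58s |
| ✅ PYTHON | isort | 2 | 0 | 0 | 0.28s |
| ✅ PYTHON | mypy | 2 | | 0 | 2.51s |
| ✅ PYTHON | pylint | 2 | | 0 | 3.65s |
| ✅ REPOSITORY | git_diff | yes | | no | 0.03s |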

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff

MegaLinter is graciously provided by OX Security

codecov bot commented Mar 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@a0c56ad). Learn more about missing BASE report.
Report is 507 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #101   +/-   ##
=======================================
  Coverage        ?   92.04%           
=======================================
  Files           ?       36           
  Lines           ?     1219           
  Branches        ?        0           
=======================================
  Hits            ?     1122           
  Misses          ?       97           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lars-reimann lars-reimann merged commit 0a5e853 into main Mar 27, 2023
@lars-reimann lars-reimann deleted the 7-drop_rows_with_outliers-currently-not-working branch March 27, 2023 16:44
lars-reimann pushed a commit that referenced this pull request Mar 27, 2023
## [0.6.0](v0.5.0...v0.6.0) (2023-03-27)

### Features

* allow calling `correlation_heatmap` with non-numerical columns ([#92](#92)) ([b960214](b960214)), closes [#89](#89)
* function to drop columns with non-numerical values from `Table` ([#96](#96)) ([8f14d65](8f14d65)), closes [#13](#13)
* function to drop columns/rows with missing values ([#97](#97)) ([05d771c](05d771c)), closes [#10](#10)
* remove `list_columns_with_XY` methods from `Table` ([#100](#100)) ([a0c56ad](a0c56ad)), closes [#94](#94)
* rename `keep_columns` to `keep_only_columns` ([#99](#99)) ([de42169](de42169))
* rename `remove_outliers` to `drop_rows_with_outliers` ([#95](#95)) ([7bad2e3](7bad2e3)), closes [#93](#93)
* return new model when calling `fit` ([#91](#91)) ([165c97c](165c97c)), closes [#69](#69)

### Bug Fixes

* handling of missing values when dropping rows with outliers ([#101](#101)) ([0a5e853](0a5e853)), closes [#7](#7)
@lars-reimann
Member Author

🎉 This PR is included in version 0.6.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the `released` label (Included in a release) Mar 27, 2023
Linked issue: `drop_rows_with_outliers()` currently not working