[MRG] Add plotting module with heatmaps for confusion matrix and grid search results#9173
[MRG] Add plotting module with heatmaps for confusion matrix and grid search results#9173thismlguy wants to merge 102 commits intoscikit-learn:mainfrom
Conversation
|
can you fix the flake8 issues? |
doc/modules/classes.rst
Outdated
| :no-inherited-members: | ||
|
|
||
| This module is experimental. Use at your own risk. | ||
| Use of this module requires the matplotlib library. |
There was a problem hiding this comment.
Modified to: "Use of this module requires the matplotlib library,
version 1.5 or later (preferably 2.0)."
| :toctree: generated/ | ||
| :template: function.rst | ||
|
|
||
| plot_heatmap |
There was a problem hiding this comment.
added. this was WIP in previous PR.
There was a problem hiding this comment.
I wouldn't count on users to read this message. My two cents is that if you really want users to notice that the plotting module is experimental, you'd have to put it in a sub module "experimental" or "future", and only move it to the main namespace when the API is stable.
There was a problem hiding this comment.
It's a bit unclear to me what it would mean for the API to be stable, and I really don't like forcing people to change their code later. I would probably just remove the warning here and then do standard deprecation cycles.
There was a problem hiding this comment.
Doing deprecation cycles is forcing people to change their code eventually, with the additional risk that they won't know that this code is experimental :)
There was a problem hiding this comment.
Deprecation cycles only sometimes require users to change their code - if they are actually using the feature you're deprecating. That's not very common for most deprecations in scikit-learn. And that is only if there is actually a change.
And I'm not sure what's experimental about this code. The experiment is more having plotting inside scikit-learn. Since it's plotting and therefor user facing, I'd rather have a warning on every call then putting it in a different module.
I guess the thing we are trying to communicate is "don't build long-term projects relying on the presence of plotting in scikit-learn because we might remove it again".
| generate the heatmap. | ||
| """ | ||
|
|
||
| import matplotlib.pyplot as plt |
There was a problem hiding this comment.
not sure if we want actual tests of the functionality (I'm leaning no), but I think we want at least smoke-tests.
There was a problem hiding this comment.
added smoke-tests which just run the code without asserts.
There was a problem hiding this comment.
hm looks like the config we use for coverage doesn't have matplotlib. We should change that...
sklearn/plot/_confusion_matrix.py
Outdated
| if normalize: | ||
| values = values.astype('float') / values.sum(axis=1)[:, np.newaxis] | ||
|
|
||
| print(title) |
sklearn/plot/_gridsearch_results.py
Outdated
| from sklearn.plot import plot_heatmap | ||
|
|
||
|
|
||
| def plot_gridsearch_results(cv_results, param_grid, metric='mean_test_score', |
There was a problem hiding this comment.
This only works for 2d grid-searches, right? I think we should also support 1d, and error for anything else.
| except ImportError: | ||
| raise SkipTest("Not testing plot_heatmap, matplotlib not installed.") | ||
|
|
||
| import matplotlib.pyplot as plt |
There was a problem hiding this comment.
hm it looks like matplotlib is not installed for the service that computes coverage? hm...
There was a problem hiding this comment.
yea this seems to be a problem because my tests cover much more than what Codecov is showing. i wasn't able to check coverage locally as I'm using a mac and running matplotlib from terminal is giving errors which I'm not able to resolve yet.
There was a problem hiding this comment.
nope these are tricky to resolve. it requires me to re-install python using a different method and then i'll have to setup my scikit-learn development environment again. since sklearn doesn't test a lot on matplotlib, i'm just using a jupyter-notebook for this PR.
There was a problem hiding this comment.
We should discuss this in person.
Codecov Report
@@ Coverage Diff @@
## master #9173 +/- ##
==========================================
- Coverage 96.3% 96.13% -0.18%
==========================================
Files 332 337 +5
Lines 60549 60754 +205
==========================================
+ Hits 58314 58403 +89
- Misses 2235 2351 +116
Continue to review full report at Codecov.
|
|
I think we should also give the user an option to plot everything as a single line graph. This will allow them to get a good plot results in case number of unique parameters are 3 or more but the total number of cases are within say 20. Adding x-axis labels will be a challenge. But we can give each combo an ID and then print a table below showing the value of each parameter for each ID on the plot. Thoughts? |
|
Also, could you help me understand the circleCI error (details here): Exception occurred: ./build_tools/circle/build_doc.sh returned exit code 2 Action failed: ./build_tools/circle/build_doc.sh |
|
@aarshayj can you merge master? This is an issue with old sphinx, I think, which should be fixed in master. |
…cikit-learn into plot_confusion_matrix
|
We now have a display for confusion matrix. |
Continues PR 8082
Changes made: