DOC use Ames housing for transformed_target example#16741

Merged
glemaitre merged 21 commits into scikit-learn:master from lucyleeow:doc_trans_target
May 14, 2020

Conversation

@lucyleeow
Member

Towards #16155

Use Ames housing data for plot_transformed_target.py.

Old plots:
image
image

New plots:
image

Hopefully the n_quantiles value I used is reasonable. The Ames data has 1460 samples.

@lucyleeow
Member Author

doc-min-dependencies is failing because the pad parameter of matplotlib.axes.Axes.set_title was introduced in matplotlib 2.2.0, whereas the min-dep env uses matplotlib 2.1.1.

An alternative is just to use

ax1.text(s='Ridge regression \n with target transformation', x=-5e4, y=8e5, fontsize=12, multialignment='center')

instead. Though I note that the matplotlib-recommended way to add a title to a subplot is matplotlib.axes.Axes.set_title.
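
A third option would be to pass pad only when the installed matplotlib supports it. The sketch below is an illustration, not code from this PR: `title_kwargs` is a hypothetical helper, and the 2.2 threshold is taken from the comment above rather than checked against the matplotlib changelog. It also assumes the version string starts with a numeric `major.minor`.

```python
def title_kwargs(matplotlib_version, pad=18):
    """Return extra keyword arguments for Axes.set_title.

    Hypothetical helper: `pad` is included only when the version string
    reports at least 2.2, the release named in the comment above.
    """
    major, minor = (int(part) for part in matplotlib_version.split(".")[:2])
    return {"pad": pad} if (major, minor) >= (2, 2) else {}


# Intended use inside the example script (needs matplotlib, so left as a comment):
#   import matplotlib
#   ax0.set_title('Ridge regression \n without target transformation',
#                 **title_kwargs(matplotlib.__version__))
```

On the min-dep environment the title would simply be drawn with the default padding.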

@ogrisel
Member

ogrisel commented Mar 22, 2020

This looks good. Personally I think it's more common to have y_pred on the x axis and y_true on the y axis for the scatter plot.

Could you please add a residual plot?

  • y_pred - y_true on the y axis
  • y_pred on the x axis.

I expect the residual plot without the TargetTransform to be "reverse-smile"/banana-shaped, which is a bad sign. With the target quantile transform, the banana should go away, which means that the new model has a better fit.
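
To make the expected shapes concrete, here is a minimal pure-Python sketch on made-up data. It is not the example script from this PR: a log transform stands in for the quantile transform, a one-feature least-squares line stands in for Ridge, and `fit_line` and `residual_points` are hypothetical helper names.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def residual_points(y_true, y_pred):
    """(y_pred, y_pred - y_true) pairs: x and y coordinates of the residual plot."""
    return [(p, p - t) for t, p in zip(y_true, y_pred)]


# Toy data (assumption): the target grows exponentially with the feature.
xs = [i / 10 for i in range(31)]  # 0.0, 0.1, ..., 3.0
y_true = [math.exp(x) for x in xs]

# Model 1: fit the raw target directly.
a, b = fit_line(xs, y_true)
raw_pred = [a * x + b for x in xs]
raw_resid = [r for _, r in residual_points(y_true, raw_pred)]

# Model 2: fit log(target), then map the predictions back with exp.
a_log, b_log = fit_line(xs, [math.log(y) for y in y_true])
tt_pred = [math.exp(a_log * x + b_log) for x in xs]
tt_resid = [r for _, r in residual_points(y_true, tt_pred)]

# Without the transform the residuals are negative at both ends of the
# y_pred range and positive in the middle (the banana); with it they vanish.
print(raw_resid[0] < 0, raw_resid[15] > 0, raw_resid[-1] < 0)
print(max(abs(r) for r in tt_resid) < 1e-9)
```

On real data such as Ames the residuals would not vanish after the transform; the point is only that the systematic curvature should disappear.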

However, one should observe heteroscedastic noise on the residual plots (larger absolute residuals for larger y_pred), which means that the modeling assumptions of the least-squares loss are not met. This hints that a better model would expect the variance of the residuals to increase with the expected mean value (y_pred). This could probably be better modeled via a Tweedie loss with p in the range [1, 2].
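
One rough way to check for this pattern numerically is to compare the mean absolute residual for small and large predictions. The sketch below uses made-up numbers, and `abs_residual_by_half` is a hypothetical helper, not something from scikit-learn or this PR. It assumes the predictions are not all identical, otherwise the lower half is empty.

```python
from statistics import mean, median


def abs_residual_by_half(y_true, y_pred):
    """Mean |y_pred - y_true| below and at-or-above the median prediction."""
    cut = median(y_pred)
    low = [abs(p - t) for t, p in zip(y_true, y_pred) if p < cut]
    high = [abs(p - t) for t, p in zip(y_true, y_pred) if p >= cut]
    return mean(low), mean(high)


# Made-up values where the error grows with the prediction.
y_pred = [10, 20, 30, 40, 50, 60]
y_true = [11, 19, 32, 36, 55, 54]

low, high = abs_residual_by_half(y_true, y_pred)
print(high > low)  # a noticeably larger `high` hints at heteroscedastic noise
```

A two-bin comparison like this is only a coarse diagnostic; the residual plot itself remains the more informative check.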

@ogrisel
Member

ogrisel commented Mar 22, 2020

Actually, my second point on heteroscedastic noise is not that obvious with the Ames dataset. Maybe leave that analysis out. I would still love to see the residual plots :)

ax0.set_xlabel('True Target')
ax0.set_title('Ridge regression \n without target transformation')
ax0.text(1, 9, r'$R^2$=%.2f, MAE=%.2f' % (
ax0.set_title('Ridge regression \n without target transformation', pad=18)
Member

The pad keyword argument is causing the doc build to fail with older, yet supported versions of matplotlib.

Member Author

Yes, see my comment: #16741 (comment)

I don't understand why people can't seem to see my comments on PRs - this is the second time this has happened! Do you think I changed some setting accidentally?

@lucyleeow
Member Author

@ogrisel does this look okay?

image

@lucyleeow
Member Author

whoops, wrong x axis!

image

@ogrisel

@lucyleeow
Member Author

ping @ogrisel

@cmarmo
Contributor

cmarmo commented May 5, 2020

Hi @lucyleeow rendering has some issues:

@lucyleeow
Member Author

Thanks @cmarmo, I think I've fixed the plot problems!

@glemaitre
Member

The banana went away, that's cool :)

@glemaitre glemaitre merged commit 78a213b into scikit-learn:master May 14, 2020
@glemaitre
Member

Thanks @lucyleeow

gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
@lucyleeow lucyleeow deleted the doc_trans_target branch October 21, 2023 03:26

4 participants