[MRG] Updates to SVM User Guide#16769

Merged
adrinjalali merged 23 commits into scikit-learn:master from NicolasHug:svm_UG
Apr 6, 2020

Conversation

@NicolasHug
Member

@NicolasHug NicolasHug commented Mar 25, 2020

  • add references throughout the text
  • put examples in their respective sections
  • update docstring examples to use a StandardScaler
  • add detail about shrinking param. I don't know how to detail tol and max_iter though (seems really libsvm-specific).
  • add short descriptions to figures
  • add details in math section and reference the hinge and eps-insensitive losses
  • a few fixes / clarifications here and there
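
One of the bullet points above mentions updating docstring examples to use a StandardScaler. A minimal sketch of that pattern (toy data and parameter values are illustrative, not the actual docstring contents):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Tiny illustrative dataset with two classes.
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1, 1, 2, 2])

# Scale features before fitting the SVM, as the updated examples do.
clf = make_pipeline(StandardScaler(), SVC(gamma="auto"))
clf.fit(X, y)
print(clf.predict([[-0.8, -1.0]]))  # a point near the first class
```

Wrapping the scaler and the estimator in a single pipeline keeps the scaling parameters learned on the training data and reapplies them at predict time.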

>>> from sklearn import svm
>>> rbf_svc = svm.SVC(kernel='rbf')
>>> rbf_svc.kernel
'rbf'

Parameters of the RBF Kernel
Member Author


I just moved that up

@NicolasHug NicolasHug changed the title from [WIP] Updates to SVM User Guide to [MRG] Updates to SVM User Guide on Mar 26, 2020
@NicolasHug
Member Author

pinging the suggested reviewers @thomasjpfan @glemaitre @qinhanmin2014

Member

@adrinjalali adrinjalali left a comment


Thanks @NicolasHug , love having better user guides :)

will try to review the rest tomorrow :)

Comment on lines -131 to -132
multi-class strategy, thus training n_class models. If there are only
two classes, only one model is trained::
Member


any particular reason to remove the sentence here?

Member Author


The whole section is about "multiclass classification"

Also the example below is an illustration of multiclass, not binary as the sentence suggests, so it's misleading

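A small sketch of the multiclass behaviour this part of the guide describes (toy data, illustrative only): SVC uses a one-vs-one scheme, so n_class * (n_class - 1) / 2 binary models are trained under the hood.

```python
from sklearn.svm import SVC

# Four tiny samples, each its own class, just to count pairwise classifiers.
X = [[0], [1], [2], [3]]
y = [0, 1, 2, 3]

clf = SVC(decision_function_shape="ovo")
clf.fit(X, y)

# With 4 classes, one-vs-one trains 4 * 3 / 2 = 6 binary classifiers,
# so the raw decision function has 6 columns.
print(clf.decision_function([[0]]).shape)
```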
Comment on lines -361 to -362
:class:`LinearSVC` by the `liblinear`_ implementation is much more
efficient than its `libsvm`_-based :class:`SVC` counterpart and can
Member


don't the liblinear and libsvm actually point to the right places?

generalization error of the classifier.

generalization error of the classifier. The figure below shows the decision
function for a separable problem, with three samples within the margin
Member


Suggested change
function for a separable problem, with three samples within the margin
function for a linearly separable problem, with three samples on the margin

maybe?

Member Author


indeed these are on the margin but in general, i.e. when classes aren't separable, the SVs are "within" the margin boundary

Comment on lines +638 to +641
is the kernel. The terms :math:`\alpha_i` are called the dual coefficients.
This dual representation highlights the fact that training vectors are
implicitly mapped into a higher (maybe infinite)
dimensional space by the function :math:`\phi`.
Member


It's really tricky to explain why the dual highlights this fact. Hmm

Member Author


"tricky"

I see what you did there.

(all jokes aside yeah, we can't really explain that unless we explain the kernel trick. I'll add a link to https://en.wikipedia.org/wiki/Kernel_method which is reasonably good)
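
The kernel trick being alluded to can at least be checked numerically. A minimal sketch, using the degree-2 polynomial kernel k(x, y) = (x . y)**2 and its explicit feature map phi(x) = (x1**2, sqrt(2)*x1*x2, x2**2) for 2-D inputs:

```python
import numpy as np

def phi(v):
    # Explicit feature map for the degree-2 polynomial kernel on 2-D inputs.
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

k_implicit = np.dot(x, y) ** 2        # kernel evaluated in input space
k_explicit = np.dot(phi(x), phi(y))   # inner product after the explicit mapping

print(k_implicit, k_explicit)  # the two values agree
```

The kernel evaluates the inner product in the mapped space without ever materializing phi, which is what makes infinite-dimensional maps (e.g. the RBF kernel's) tractable.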

Comment on lines +615 to +659
While SVM models derived from `libsvm`_ and `liblinear`_ use ``C`` as
While SVM models derived from `libsvm` and `liblinear` use ``C`` as
Member


same here with links.

Member Author

@NicolasHug NicolasHug left a comment


Thanks for the fast review!


@NicolasHug NicolasHug added this to the 0.23 milestone Mar 31, 2020
Member

@thomasjpfan thomasjpfan left a comment


Changing the notation from rho to b and including scaling in the examples are great.

Minor comments, otherwise LGTM.

(maybe infinite) dimensional space by the function :math:`\phi`.

The decision function is:
The prediction is:
Member


But the prediction is the sign of this, isn't it?

Member Author


That's for regression (I did fix it for classification)
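
A quick sketch of the classification case being discussed (toy data, not from the guide): for a binary SVC, predict() agrees with the sign of decision_function(), mapped onto the two class labels.

```python
import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable 1-D dataset.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])
clf = SVC(kernel="linear").fit(X, y)

X_new = [[-1.5], [1.5]]
scores = clf.decision_function(X_new)  # signed distances to the hyperplane
preds = clf.predict(X_new)

# A positive score maps to clf.classes_[1], a negative one to clf.classes_[0].
print(scores, preds)
```

For regression (SVR) there is no sign step: the decision function value itself is the prediction, which is the distinction being made above.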

@adrinjalali adrinjalali merged commit e3e9137 into scikit-learn:master Apr 6, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
* WIP

* WIP

* Updates to the math section

* so apparently crammer-singer isn't consistent (http://www.cs.columbia.edu/~rocco/papers/icml13.html)

* reorganized example links and described pictures

* Added references

* more refs

* Added scaler to docstring examples

* doc about dual coefs

* WIP

* note about shrinking and gram matrix

* ellipsis

* ellipsis again

* maybe fixed doc issue

* details for regression

* training error -> margin error

* Apply suggestions from code review

Co-Authored-By: Adrin Jalali <adrin.jalali@gmail.com>

* put back links + addressed rest of comments

* small detail about margin boundaries and SVs

* Addressed comments from Thomas and Adrin

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>

3 participants