[MRG] Adding LinearKernel to GaussianProcesses #8373
dalmia wants to merge 5 commits into scikit-learn:main
Conversation
Codecov Report

    @@            Coverage Diff             @@
    ##           master    #8373      +/-   ##
    ==========================================
    + Coverage   94.75%   94.75%   +<.01%
    ==========================================
      Files         342      342
      Lines       60801    60835      +34
    ==========================================
    + Hits        57609    57642      +33
    - Misses       3192     3193       +1

Continue to review full report at Codecov.
I am not totally convinced that we should add the functionality as a new kernel (
I agree. For the most part, this seems an extension to the DotProduct kernel.
@jmetzen Could you please explain why the gradient of
This is a good question. It looks like a bug. But the unit test in test_kernels.py, which compares the implemented gradient with its numerical approximation, does not fail. When I change the gradient to
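For context, the numerical check in test_kernels.py compares the analytic gradient against finite differences. The helper below is a simplified, hypothetical stand-in for sklearn's private _approx_fprime, shown only to illustrate the idea:

```python
import numpy as np

def approx_fprime(theta, f, eps=1e-8):
    """Forward finite differences; a simplified sketch of the kind of
    helper the kernel tests use (not sklearn's actual implementation)."""
    f0 = np.asarray(f(theta))
    # One gradient slice per component of theta.
    grad = np.zeros(f0.shape + (theta.size,))
    for i in range(theta.size):
        t = theta.copy()
        t[i] += eps
        grad[..., i] = (np.asarray(f(t)) - f0) / eps
    return grad

# Sanity check against a function with a known gradient:
theta = np.array([0.0, np.log(2.0)])
g = approx_fprime(theta, np.exp)
```

If the implemented gradient disagrees with this approximation beyond the tolerance, the test fails; agreement is what made the suspected bug above puzzling.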
Yes, I observed the same thing when I was working on the
Ah, I remember. It is because the gradient is computed with respect to the log-transformed hyperparameters (theta), not the raw values, so the chain rule introduces the extra factor. As an addendum: why is theta the log-transformed value? From the docstring of the theta property, the log-transform makes the search space more amenable to gradient-based hyperparameter optimization, since hyperparameters like length scales naturally live on a log scale. We should create a separate issue in which we can make the documentation and the comments more explicit about this.
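To make the log-transform concrete, a minimal sketch (assuming scikit-learn is installed; RBF is used purely as an example kernel):

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# theta stores the log-transformed hyperparameter values, so gradients
# returned via kernel(X, eval_gradient=True) are taken with respect to
# log(length_scale), not length_scale itself.
k = RBF(length_scale=2.0)
assert np.allclose(k.theta, np.log([2.0]))
```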
Thank you so much for the clear explanation. Finally, this makes sense now and I can proceed with adding the
Just reporting, there is a bug in the tests. It assumes that:

    len(kernel.hyperparameters) == kernel.theta.size

For the LinearKernel:

    >>> len(kernel.hyperparameters)
    3
    >>> kernel.theta.size
    4
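The same mismatch can be reproduced with an existing kernel: as noted later in this thread, the anisotropic RBF also violates the assumption. A small sketch (assuming scikit-learn is installed):

```python
from sklearn.gaussian_process.kernels import RBF

# An anisotropic RBF exposes a single Hyperparameter entry
# ("length_scale" with n_elements=2), while theta flattens to one
# (log-transformed) value per dimension.
k = RBF(length_scale=[1.0, 2.0])
assert len(k.hyperparameters) == 1
assert k.theta.size == 2
```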
Codecov Report

    @@            Coverage Diff             @@
    ##           master    #8373      +/-   ##
    ==========================================
    - Coverage   96.19%   94.75%   -1.45%
    ==========================================
      Files         348      342       -6
      Lines       64645    60895    -3750
    ==========================================
    - Hits        62187    57700    -4487
    - Misses       2458     3195     +737

Continue to review full report at Codecov.
        _approx_fprime(kernel.theta, eval_kernel_for_theta, 1e-10)

    -   assert_almost_equal(K_gradient, K_gradient_approx, 4)
    +   assert_almost_equal(K_gradient, K_gradient_approx, 3)
Reduced the amount of precision. Is there an issue with that?
        for i, hyperparameter in enumerate(kernel.hyperparameters):
            assert_equal(theta[i],
                         np.log(getattr(kernel, hyperparameter.name)))
        i = 0
The earlier implementation of the tests assumed that:

    len(kernel.hyperparameters) == kernel.theta.size
Thanks for fixing this problem. I was surprised that it was not an issue so far, because the anisotropic RBF/Matern kernels also violate the condition. But this is because the test skips the non-basic kernels at the beginning. Could you check if you can now also remove lines 74-76:

    if isinstance(kernel, KernelOperator) \
            or isinstance(kernel, Exponentiation):  # skip non-basic kernels
        continue
Yes, that's what prompted me to check how RBF got through, and I then found that the non-basic kernels are being skipped. Nevertheless, removing the lines causes a failure:

    AssertionError: Items in the first set but not the second:
    'k1__constant_value'
    'k2__length_scale'

    """Fail immediately, with the given message."""
    >>  raise self.failureException("Items in the first set but not the second:\n'k1__constant_value'\n'k2__length_scale'")

I'll look into the issue and try to resolve it so that we can run the tests without ignoring any.
        assert_not_equal(id(kernel), id(kernel_cloned))

        # Check that all constructor parameters are equal.
        assert_equal(kernel.get_params(), kernel_cloned.get_params())
assert_equal cannot compare the two dictionaries when the value of c is an array.
The lines

    if isinstance(kernel, KernelOperator) \
            or isinstance(kernel, Exponentiation):  # skip non-basic kernels
        continue

can't simply be removed. This is because, for example, KernelOperator prefixes the hyperparameter names of its sub-kernels:

    for hyperparameter in self.k1.hyperparameters:
        r.append(Hyperparameter("k1__" + hyperparameter.name,
                                hyperparameter.value_type,
                                hyperparameter.bounds,
                                hyperparameter.n_elements))

whereas theta just appends the original values:

    return np.append(self.k1.theta, self.k2.theta)
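The naming/theta mismatch described above can be observed directly on a composed kernel; a small sketch (assuming scikit-learn is installed, with illustrative hyperparameter values):

```python
import numpy as np
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

# A Product is a KernelOperator: the sub-kernels' hyperparameter names are
# re-exposed with "k1__"/"k2__" prefixes, while theta is simply the
# concatenation of the sub-kernels' log-transformed thetas.
k = ConstantKernel(2.0) * RBF(3.0)
names = [h.name for h in k.hyperparameters]
assert names == ['k1__constant_value', 'k2__length_scale']
assert np.allclose(k.theta, np.log([2.0, 3.0]))
```

This is why a test that looks up `getattr(kernel, hyperparameter.name)` on the composed kernel fails: there is no attribute literally named `k1__constant_value`.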
        return length_scale


    def _check_offset(X, c):
This seems to be identical to _check_length_scale? Essentially it checks whether the kernel is anisotropic, based on length_scale, or c in your case.
It would make sense to unify the two functions; there is no essential difference besides the string in the second ValueError.
        self.c_bounds = c_bounds

        @property
        def non_uniform_offset(self):
Would it be better to name it anisotropic, as in RBF and Matern?
I agree that we should maybe add a hyperparameter
        sigma_0 and c. The parameters of the Linear kernel are about specifying the
        origin. The kernel is given by:

            k(x_i, x_j) = sigma_b ^ 2 + sigma_v ^ 2 * ((x_i - c) \cdot (x_j - c))
In any case, there is no need for sigma_b and sigma_v as hyperparameters. These can be achieved by doing something similar to ConstantKernel() + ConstantKernel() * LinearKernel().
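Since LinearKernel from this PR never landed in scikit-learn, the composition can be sketched with the existing DotProduct (sigma_0=0 makes it a homogeneous linear kernel); the hyperparameter values are illustrative, and the offset c is not modeled here:

```python
import numpy as np
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct

# sigma_b**2 + sigma_v**2 * (x_i . x_j) built by kernel algebra instead of
# extra hyperparameters, with sigma_b**2 = sigma_v**2 = 1 in this example.
k = ConstantKernel(1.0) + ConstantKernel(1.0) * DotProduct(sigma_0=0.0)
X = np.array([[1.0, 2.0]])
Y = np.array([[3.0, 4.0]])
K = k(X, Y)  # 1 + 1 * (1*3 + 2*4) = 12
assert np.allclose(K, [[12.0]])
```

The two ConstantKernel factors are then optimized like any other hyperparameters of the composed kernel, which is the point of the suggestion.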
So the gradient is with respect to
The docstring was also misleading, sorry for that...
This pull request introduces 1 new alert - view it on lgtm.com.
Comment posted by lgtm.com
@dalmia @jmargeta @MechCoder do we want / need this or not? |
Hi @agramfort, I do not have sufficient expertise with GPs for a relevant opinion. But from looking at the issue and the PR, it indeed seems that extending the DotProduct kernel is probably enough. ...and you probably meant to mention the other Jan (@jmetzen) anyway :)
I am closing this PR in favor of #8373, because that PR builds on top of this one and has more progress. Thank you for working on this PR. |
Reference Issue
Fixes #8359

What does this implement/fix? Explain your changes.
This adds LinearKernel to GaussianProcesses as per the link mentioned in the issue thread.

Any other comments?
sigma_v and sigma_b ... diag(X) ... c