Add constant liar strategy to GPSampler (#6430)
```diff
@@ -103,6 +103,8 @@ def __init__(
         self._is_categorical = is_categorical
         self._X_train = X_train
         self._y_train = y_train
+        self._X_all = X_train
+        self._y_all = y_train
         self._squared_X_diff = (X_train.unsqueeze(-2) - X_train.unsqueeze(-3)).square_()
         if self._is_categorical.any():
             self._squared_X_diff[..., self._is_categorical] = (
```
```diff
@@ -146,6 +148,40 @@ def _cache_matrix(self) -> None:
         self.noise_var = self.noise_var.detach()
         self.noise_var.grad = None
 
+    def append_running_data(self, X_running: torch.Tensor, y_running: torch.Tensor) -> None:
```
**Collaborator:**

I think we can compute this more efficiently by reusing `self._cov_Y_Y_chol`. We can calculate it as follows: let $L$ be the cached Cholesky factor of the old covariance $K$, $K_{rt}$ the running-vs-train cross covariance, and $K_{rr}$ the running-vs-running covariance. Then

$$
\begin{pmatrix} K & K_{rt}^\top \\ K_{rt} & K_{rr} \end{pmatrix}
=
\begin{pmatrix} L & 0 \\ B & L_r \end{pmatrix}
\begin{pmatrix} L^\top & B^\top \\ 0 & L_r^\top \end{pmatrix}
\quad\text{with}\quad
B = K_{rt} L^{-\top},\quad
L_r = \operatorname{chol}\!\left(K_{rr} - B B^\top\right),
$$

so only the two new blocks need to be computed (the proof follows by multiplying out the block product).

**Author:**

Thank you so much for the precise explanation! I made the update accordingly.

*not522 marked this conversation as resolved.*
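A minimal standalone check of the block update described in this thread, using only NumPy and SciPy (the names `K_old`, `K_cross`, `K_new`, and `L_full` are illustrative, not from the PR):

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
n, m = 5, 3  # finished trials, running trials
A = rng.standard_normal((n + m, n + m))
K = A @ A.T + (n + m) * np.eye(n + m)  # well-conditioned PSD "kernel" matrix

K_old = K[:n, :n]      # covariance over finished trials
K_cross = K[n:, :n]    # running-vs-finished cross covariance
K_new = K[n:, n:]      # covariance over running trials

L_old = np.linalg.cholesky(K_old)  # the factor that is already cached
L_full = np.zeros_like(K)
L_full[:n, :n] = L_old
# B = K_cross @ inv(L_old).T, computed via a triangular solve instead of an inverse.
L_full[n:, :n] = scipy.linalg.solve_triangular(L_old, K_cross.T, lower=True).T
# Cholesky of the Schur complement fills the bottom-right block.
L_full[n:, n:] = np.linalg.cholesky(K_new - L_full[n:, :n] @ L_full[n:, :n].T)

assert np.allclose(L_full, np.linalg.cholesky(K))  # matches a full refactorization
```

These are exactly the three blocks that `append_running_data` fills in below.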
```diff
+        assert self._cov_Y_Y_chol is not None and self._cov_Y_Y_inv_Y is not None, (
+            "Call _cache_matrix before append_running_data"
+        )
+        n_train = self._X_train.shape[0]
+        n_running = X_running.shape[0]
+        n_total = n_train + n_running
+
+        cov_Y_Y_chol = np.zeros((n_total, n_total), dtype=np.float64)
+        cov_Y_Y_chol[:n_train, :n_train] = self._cov_Y_Y_chol.numpy()
+        with torch.no_grad():
+            kernel_running_train = self.kernel(X_running).detach().cpu().numpy()
+            kernel_running_running = self.kernel(X_running, X_running).detach().cpu().numpy()
+            kernel_running_running[np.diag_indices(n_running)] += self.noise_var.item()
+
+        cov_Y_Y_chol[n_train:, :n_train] = scipy.linalg.solve_triangular(
+            self._cov_Y_Y_chol.cpu().numpy(), kernel_running_train.T, lower=True
+        ).T
+        cov_Y_Y_chol[n_train:, n_train:] = np.linalg.cholesky(
+            kernel_running_running
+            - cov_Y_Y_chol[n_train:, :n_train] @ cov_Y_Y_chol[n_train:, :n_train].T
+        )
+        self._y_all = torch.cat([self._y_train, y_running], dim=0)
+        cov_Y_Y_inv_Y = scipy.linalg.solve_triangular(
```
**Collaborator:**

I think we can also compute this a little more efficiently here by reusing the result of the previous forward substitution.

**Author:**

Thank you for the helpful suggestion! I agree that the forward substitution, $Lz = y$, can reuse the previous result so that only the rows for the new data need to be computed. What I was less certain about was whether the backward substitution could also be reduced to the same complexity. As you mentioned, I think this would be a good optimization to consider in a future improvement.

**Collaborator:**

Thank you for the reply. Indeed, yes. We need to recompute every entry in the backward substitution, so it cannot be truncated in the same way.
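A sketch of the asymmetry discussed in this thread, under the assumption that it refers to the two triangular solves above (variable names are illustrative): the forward substitution $Lz = y$ can keep the old solution and compute only the $m$ new entries, while the backward substitution $L^\top \alpha = z$ changes every entry.

```python
import numpy as np
import scipy.linalg

rng = np.random.default_rng(1)
n, m = 5, 3
A = rng.standard_normal((n + m, n + m))
L = np.linalg.cholesky(A @ A.T + (n + m) * np.eye(n + m))
y = rng.standard_normal(n + m)

# Forward substitution: the first n entries are unaffected by the new data,
# so only the m new entries need an O(nm + m^2) update.
z_old = scipy.linalg.solve_triangular(L[:n, :n], y[:n], lower=True)
z_new = scipy.linalg.solve_triangular(L[n:, n:], y[n:] - L[n:, :n] @ z_old, lower=True)
z_full = scipy.linalg.solve_triangular(L, y, lower=True)
assert np.allclose(np.concatenate([z_old, z_new]), z_full)

# Backward substitution: appending data perturbs every entry of the solution,
# so the old result cannot simply be extended with new rows.
a_old = scipy.linalg.solve_triangular(L[:n, :n].T, z_old, lower=False)
a_full = scipy.linalg.solve_triangular(L.T, z_full, lower=False)
assert not np.allclose(a_old, a_full[:n])
```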
```diff
+            cov_Y_Y_chol.T,
+            scipy.linalg.solve_triangular(cov_Y_Y_chol, self._y_all.cpu().numpy(), lower=True),
+            lower=False,
+        )
+
+        # NOTE(nabenabe): Here we use NumPy to guarantee the reproducibility from the past.
+        self._cov_Y_Y_chol = torch.from_numpy(cov_Y_Y_chol)
+        self._cov_Y_Y_inv_Y = torch.from_numpy(cov_Y_Y_inv_Y)
+        self._X_all = torch.cat([self._X_train, X_running], dim=0)
+
     def kernel(
         self, X1: torch.Tensor | None = None, X2: torch.Tensor | None = None
     ) -> torch.Tensor:
```
```diff
@@ -193,7 +229,7 @@ def posterior(self, x: torch.Tensor, joint: bool = False) -> tuple[torch.Tensor,
         )
         is_single_point = x.ndim == 1
         x_ = x if not is_single_point else x.unsqueeze(0)
-        mean = torch.linalg.vecdot(cov_fx_fX := self.kernel(x_), self._cov_Y_Y_inv_Y)
+        mean = torch.linalg.vecdot(cov_fx_fX := self.kernel(x_, self._X_all), self._cov_Y_Y_inv_Y)
         # K @ inv(C) = V --> K = V @ C --> K = V @ L @ L.T
         V = torch.linalg.solve_triangular(
             self._cov_Y_Y_chol,
```
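For intuition about why the posterior is conditioned on running trials at all, here is a toy sketch in plain NumPy (not Optuna's internal API; `rbf` and `posterior_var` are made-up helpers for illustration): appending a fabricated observation at a pending point collapses the posterior variance there, which is what steers parallel workers away from duplicating each other's suggestions.

```python
import numpy as np

def rbf(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Squared-exponential kernel with unit length scale.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

def posterior_var(x: np.ndarray, X: np.ndarray, noise: float = 1e-3) -> np.ndarray:
    # GP posterior variance at x given (noisy) observations at X.
    K = rbf(X, X) + noise * np.eye(len(X))
    k = rbf(x, X)
    return rbf(x, x).diagonal() - np.einsum("ij,ji->i", k, np.linalg.solve(K, k.T))

X_train = np.array([0.0, 1.0, 2.0])
x_pending = np.array([1.5])

var_before = posterior_var(x_pending, X_train)
# Constant liar: pretend the pending trial has already returned some value.
var_after = posterior_var(x_pending, np.append(X_train, x_pending))

assert var_after < var_before  # the lie suppresses re-sampling near x_pending
```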
**Collaborator:**

Could you add an inline comment about this design choice? Namely, we could use the mean (posterior mean), median, min, or any other operator, right? Plus, as you may already know, we often integrate out the value because we can calculate the posterior distribution for each running trial.
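For concreteness, a tiny illustration of the operator choices mentioned in this comment; the values and names are made up for illustration:

```python
import numpy as np

finished_values = np.array([0.8, 0.3, 0.5, 0.9])  # objective values of finished trials
lies = {
    "mean": float(finished_values.mean()),
    "median": float(np.median(finished_values)),
    "min": float(finished_values.min()),  # optimistic choice for minimization
}
```

Integrating out the lied value, as the comment suggests, avoids committing to any single operator, since the posterior distribution at each running point is already available.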
**Author:**

Thank you for the comment. I have added a few inline comments. I also left some notes on issue #6392 regarding the future implementation plan.