Fix occurrence of ghost states with SVDs by fstein93 · Pull Request #3911 · cp2k/cp2k

fstein93 · 2025-01-28T17:59:02Z

Fixes #3823 . If the Cholesky decomposition is turned off for numerical reasons, CP2K approximates the inverse square roots of the overlap matrix using SVDs. This keeps the null-space during the solution of the eigenvalue problem such that additional eigenvectors with eigenvalue zero are found which are occupied if the Fermi-level $\mu\ge -kT$.
My PR removes the null-space entirely by working in the linear-independent subspace spanned by the eigenvectors outside of the null-space of the overlap matrix. I applied this scheme also to the level-shifting scheme and eigenvalue calculation for MP2. The reference values of the regtests were updated accordingly and are in better agreement with comparable regtests. I reintroduced the EPS_SVD keyword in the MP2 case.
@JWilhelm I tried Tikhonov regularization but found that it does not solve the problem in our case.

hfp · 2025-01-29T14:35:58Z

I am not yet fully sure about the cause, but with this PR, I am getting:

 RI-RPA section
 --------------

  Number of removed MOs:                                                      68
  Number of available MOs:                                                  1342

 *******************************************************************************
 *   ___                                                                       *
 *  /   \                                                                      *
 * [ABORT]                                                                     *
 *  \___/                         Diagonal has wrong size                      *
 *    |                                                                        *
 *  O/|                                                                        *
 * /| |                                                                        *
 * / \                                                 dbcsr_operations.F:2922 *
 *******************************************************************************


 ===== Routine Calling Stack =====

            5 dbcsr_get_diag
            4 compute_vec_Sigma_x_minus_vxc_gw
            3 mp2_main
            2 qs_energies
            1 CP2K

My test case is benchmarks/QS_low_scaling_GW/GW.inp with 16 ranks on a single node/system (PSMP).

fstein93 · 2025-01-29T14:44:05Z

I am currently checking it. It is possible. Apparently, DBCSR silently creates matrices of the wrong (?) size.

fstein93 · 2025-01-29T15:09:37Z

I think it is in here. The code implicitly assumes that the number of MOs is the same as the number of AOs. Before this change, this was actually true, whereas I changed it to the size of the linear independent subspace.
@JWilhelm Is it correct/reasonable to resize mo_coeff_b and matrix_tmp to nao x nmo and matrix_tmp_2 to nmo x nmo?

hfp · 2025-01-29T15:22:09Z

Another question is, if there is any test we can add which is showing the issue.

hfp · 2025-01-29T15:23:57Z

( Btw, good work! )

JWilhelm · 2025-01-29T17:07:19Z

I think it is in here. The code implicitly assumes that the number of MOs is the same as the number of AOs. Before this change, this was actually true, whereas I changed it to the size of the linear independent subspace. @JWilhelm Is it correct/reasonable to resize mo_coeff_b and matrix_tmp to nao x nmo and matrix_tmp_2 to nmo x nmo?

Yes, the code here assumes n_AO = n_MO. Agreed, it should work to resize mo_coeff_b and matrix_tmp as you suggested.

fstein93 · 2025-01-29T17:31:06Z

Apparently, it is easier to just get all diagonal elements and copy the first n_MO elements to the actual array. I checked the GW-corrected energies for a regtest-like setup and found the same values as before. But the sigma_C contribution changes. Is this behavior expected? How does the GW-code deal with ghost states? If you want to run some tests, checkout my branch.

hfp · 2025-01-30T08:05:15Z

I merged your branch and I can confirm it works for the mentioned case!

( As a side-note, either the fix or the SVD change itself seems to have caused a slight regression in performance [benchmarks/QS_low_scaling_GW/GW.inp]. Though, this is about generality/correctness rather than performance. )

fstein93 · 2025-01-30T10:12:58Z

I have opened the PR. I suppose that the differences in the correlation contributions to the self-energy are related to the falsely considered ghost states in the calculation.

hfp · 2025-01-30T19:11:03Z

To wrap up #3911 (comment), I tested current master against master with reverted commits 44d3f296a06ed44b6f3b263e9364c69f9d58d067, a24614b15a54e7cbb2bc67b5bd6f9c47c73e96e1, and cac185c219e31f8d4df52f216abe8cf7406e3b13 (in this order). I ran benchmarks/QS_low_scaling_GW/GW.inp found current master to be somewhat slower (~4%) with DBM consuming 40% more (device-)memory according to the recently introduced DBM memory statistics (see below).

 -------------------------------------------------------------------------------
 -                                                                             -
 -                                DBM STATISTICS                               -
 -                                                                             -
 -------------------------------------------------------------------------------
 [...]
 -------------------------------------------------------------------------------
 Memory consumption                            Number of allocations  Size [MiB]
 Device                                                          108        1627   <-- 98 and 1166 before
 Host                                                             34        1085
 -------------------------------------------------------------------------------

Slower run is according to tools/diff_cp2k.py:

compute_self_energy_cubic_gw
dbt_reshape
dbt_tas_reserve_blocks_index
dbm_reserve_blocks

Perhaps these function just point-out somewhat more work rather than having gotten slower.

fstein93 · 2025-01-31T08:27:59Z

That's interesting because I only shrinked the MO matrices a bit (maybe 5-10 %). In compute_self_energy_cubic_gw, dbcsr_get_diag is called with a larger array for the diagonal elements which will probably not affect the timings of DBM/DBT.

hfp · 2025-01-31T08:36:56Z

Yeah, it's hard to conclude something actionable (besides of some voodoo). I may take a look as we go forward.

Unrelated to this issue:
I noticed the get-diag routines in our linalg bits (not just dbcsr) are all about collecting diagonal elements into an array like allocated just for the sake of getting the elements. In this case, if the routine had an option to only take N elements you could take them into the desired destination right away. Anyway, I hope also DBM takes over more going forward.

Fixes cp2k#3823 . If the Cholesky decomposition is turned off for numerical reasons, CP2K approximates the inverse square roots of the overlap matrix using SVDs. This keeps the null-space during the solution of the eigenvalue problem such that additional eigenvectors with eigenvalue zero are found which are occupied if the Fermi-level μ ≥ − k T . My PR removes the null-space entirely by working in the linear-independent subspace spanned by the eigenvectors outside of the null-space of the overlap matrix. I applied this scheme also to the level-shifting scheme and eigenvalue calculation for MP2. The reference values of the regtests were updated accordingly and are in better agreement with comparable regtests. I reintroduced the EPS_SVD keyword in the MP2 case.

fstein93 added 3 commits January 28, 2025 15:48

Add alternative SVD scheme to reduce numerical noise

6e86dc4

Implement level shifting with SVD

fc0591d

Fix memory leaks

a555231

fstein93 merged commit cac185c into cp2k:master Jan 29, 2025
37 checks passed

fstein93 mentioned this pull request Jan 30, 2025

Fix GW with Cholesky off #3922

Merged

MGraml mentioned this pull request Feb 28, 2025

GW/BSE: Fix indices in case of removed null-spaces for SVDs #4035

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix occurrence of ghost states with SVDs#3911

Fix occurrence of ghost states with SVDs#3911
fstein93 merged 3 commits intocp2k:masterfrom
fstein93:Reduce_svd

fstein93 commented Jan 28, 2025

Uh oh!

Uh oh!

hfp commented Jan 29, 2025 •

edited

Loading

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

hfp commented Jan 29, 2025

Uh oh!

hfp commented Jan 29, 2025

Uh oh!

JWilhelm commented Jan 29, 2025

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

hfp commented Jan 30, 2025

Uh oh!

fstein93 commented Jan 30, 2025

Uh oh!

hfp commented Jan 30, 2025 •

edited

Loading

Uh oh!

fstein93 commented Jan 31, 2025

Uh oh!

hfp commented Jan 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

fstein93 commented Jan 28, 2025

Uh oh!

Uh oh!

hfp commented Jan 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

hfp commented Jan 29, 2025

Uh oh!

hfp commented Jan 29, 2025

Uh oh!

JWilhelm commented Jan 29, 2025

Uh oh!

fstein93 commented Jan 29, 2025

Uh oh!

hfp commented Jan 30, 2025

Uh oh!

fstein93 commented Jan 30, 2025

Uh oh!

hfp commented Jan 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fstein93 commented Jan 31, 2025

Uh oh!

hfp commented Jan 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hfp commented Jan 29, 2025 •

edited

Loading

hfp commented Jan 30, 2025 •

edited

Loading