Bug in LedoitWolf Shrinkage #6195

@GaelVaroquaux

Description

The estimate of the shrinkage in the Ledoit-Wolf estimator is pretty broken:

import numpy as np
from sklearn import covariance
np.random.seed(42)
signals = np.random.random(size=(75, 4))
print(covariance.ledoit_wolf(signals))

This outputs:

(array([[ 0.08626827,  0.        , -0.        , -0.        ],
       [ 0.        ,  0.08626827,  0.        ,  0.        ],
       [-0.        ,  0.        ,  0.08626827, -0.        ],
       [-0.        ,  0.        , -0.        ,  0.08626827]]), 1.0)

In other words, the estimator has deduced that there should be a shrinkage of 1: it is returning something proportional to the identity.

A shrinkage of 1 means the estimate has collapsed entirely onto m_n * I_n, where m_n is defined in lemma 3.2 of "A well-conditioned estimator for large-dimensional covariance matrices" by Olivier Ledoit and Michael Wolf: "m_n = <S_n, I_n>", where "<., .>" is the (normalized) canonical matrix inner product, I_n the identity, and S_n the data scatter matrix. In other words, m_n is the average variance of the data, and full shrinkage discards every off-diagonal entry of the covariance. For generic random signals like these, the optimal shrinkage cannot be exactly 1, so this result is false. Not that I believed it at all.
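A quick numerical check of m_n on the same data (variable names here follow the lemma, not scikit-learn's internals) confirms that the diagonal value in the output above is exactly m_n, i.e. the estimator has returned m_n * I_n:

```python
import numpy as np

np.random.seed(42)
X = np.random.random(size=(75, 4))

X = X - X.mean(axis=0)            # center the signals
S = X.T.dot(X) / X.shape[0]       # empirical scatter matrix S_n
m_n = np.trace(S) / S.shape[0]    # <S_n, I_n> with <A, B> = trace(A B^T) / p

print(m_n)  # ≈ 0.0863, matching the diagonal above and nowhere near 1
```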

I know where the bug is (n_splits == 0). I just need to find a robust test so that these things don't happen again.

This is quite bad: we have shipped a broken Ledoit-Wolf for a few releases :(. Ledoit-Wolf is the most useful covariance estimator.
