The numpy.cov() function is used to calculate the covariance matrix of one or more numerical variables. Covariance shows how two variables change together. A positive value means both variables increase together, a negative value means one increases while the other decreases, and zero means there is no linear relationship.
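To make the definition concrete, the sketch below computes the sample covariance of two variables by hand and compares it with the value returned by np.cov(). The arrays x and y are illustrative values chosen for this sketch, not data from a real source.

```python
import numpy as np

# Two small illustrative variables (values chosen only for this sketch)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Sample covariance: sum of the products of deviations from the mean,
# divided by N - 1 (the default normalization used by np.cov)
n = len(x)
manual_cov = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# np.cov(x, y) returns a 2x2 matrix; the off-diagonal entry is cov(x, y)
numpy_cov = np.cov(x, y)[0, 1]

print(manual_cov)  # 1.0
print(numpy_cov)   # 1.0
```

Both values agree because np.cov() applies the same deviation-product formula to every pair of variables.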
This example shows how to calculate the covariance matrix of a single 2D array in which each row represents a variable.
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
print(np.cov(x))
Output
[[1. 1.]
 [1. 1.]]
Explanation:
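- np.cov(x) returns a 2x2 matrix; the off-diagonal entries are the covariance between the two rows
- Each row increases by 1 from one observation to the next, so the two variables move together and the covariance is positive
- Diagonal values are the variances of the individual variables (how much each one varies by itself)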
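As a quick check of the point about the diagonal, the following sketch compares the diagonal of the covariance matrix with the per-row sample variance. It assumes ddof=1 in np.var() so that both computations divide by N - 1.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])
cov = np.cov(x)

# Diagonal entries of the covariance matrix
diag = np.diag(cov)

# Sample variance of each row; ddof=1 matches np.cov's default N - 1 normalization
row_var = np.var(x, axis=1, ddof=1)

print(diag)     # [1. 1.]
print(row_var)  # [1. 1.]
```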
Syntax:
numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None)
Parameters:
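- m: Input data, a 1D or 2D array containing variables and observations
- y (optional): An additional set of variables and observations, in the same form as m
- rowvar: If True (default), each row is a variable and each column is an observation; if False, each column is a variable
- bias: If False (default), normalizes by N-1, where N is the number of observations; if True, normalizes by N
- ddof: If not None, overrides the value implied by bias; the normalization becomes N - ddof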
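The effect of bias and ddof on the normalization can be seen in a small sketch. It reuses the 2x3 array from the first example, so N = 3 observations per variable; the variable names are chosen only for illustration.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])  # N = 3 observations per variable

default_cov = np.cov(x)            # divides by N - 1
biased_cov = np.cov(x, bias=True)  # divides by N
ddof0_cov = np.cov(x, ddof=0)      # ddof=0 also divides by N, overriding bias

print(default_cov)  # every entry is 1.0
print(biased_cov)   # every entry is 2/3 (about 0.66666667)
print(np.allclose(biased_cov, ddof0_cov))  # True
```

With few observations the choice of normalization changes the result noticeably; with many observations the difference between N and N - 1 becomes small.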
Examples
Example 1: This example calculates the covariance matrix of a 2D array where each row is a variable and columns are observations.
import numpy as np
a = np.array([[0, 3, 4], [1, 2, 4], [3, 4, 5]])
print(np.cov(a))
Output
[[4.33333333 2.83333333 2.        ]
 [2.83333333 2.33333333 1.5       ]
 [2.         1.5        1.        ]]
Explanation:
- np.cov(a) computes covariance between rows
- Positive values indicate the variables increase together
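- The size of a covariance depends on the scale of the variables, so a larger value does not by itself mean a stronger relationship; a correlation coefficient is needed to compare strength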
- Variance values on the diagonal show how spread out each variable is
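Covariance values depend on the scale of each variable, so it can help to normalize them before comparing pairs. The sketch below, offered as an illustration rather than part of the original example, divides each covariance by the product of the two standard deviations and checks the result against np.corrcoef().

```python
import numpy as np

a = np.array([[0, 3, 4], [1, 2, 4], [3, 4, 5]])

cov = np.cov(a)

# Normalize each covariance by the product of the two standard deviations
std = np.sqrt(np.diag(cov))
manual_corr = cov / np.outer(std, std)

# np.corrcoef performs the same normalization
corr = np.corrcoef(a)
print(np.allclose(manual_corr, corr))  # True

# Rows 0 and 2 have the larger covariance (2.0 vs 1.5), but rows 1 and 2
# have the larger correlation (roughly 0.98 vs 0.96)
print(cov[0, 2] > cov[1, 2])    # True
print(corr[0, 2] < corr[1, 2])  # True
```

In this data, the pair with the larger covariance is not the pair with the stronger linear relationship, because the first row has a much larger variance than the second.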
Example 2: This example finds covariance between two separate lists by stacking them as rows.
import numpy as np
x = [1.2, 2.1, 3.3, 4.5]
y = [2.5, 2.9, 3.7, 3.9]
data = np.stack((x, y))
print(np.cov(data))
Output
[[2.0625     0.925     ]
 [0.925      0.43666667]]
Explanation:
- np.stack() combines both lists into a 2D array with one row per variable
- np.cov(data) calculates covariance between x and y
- As x increases, y also tends to increase, which indicates a positive linear relationship between x and y
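Stacking is not the only option: the y parameter from the syntax above accepts the second variable directly. A minimal sketch, using the same x and y lists, suggests the two calls give the same matrix.

```python
import numpy as np

x = [1.2, 2.1, 3.3, 4.5]
y = [2.5, 2.9, 3.7, 3.9]

# Stack the lists as rows, then compute the covariance matrix
stacked = np.cov(np.stack((x, y)))

# Pass the second variable through the y parameter instead
direct = np.cov(x, y)

print(np.allclose(stacked, direct))  # True
```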
Example 3: This example computes the same covariance matrix when the variables are arranged column-wise, using rowvar=False.
import numpy as np
x = [1.2, 2.1, 3.3, 4.5]
y = [2.5, 2.9, 3.7, 3.9]
data = np.column_stack((x, y))
print(np.cov(data, rowvar=False))
Output
[[2.0625     0.925     ]
 [0.925      0.43666667]]
Explanation: rowvar=False tells NumPy that each column is a variable and each row is an observation, so the result matches Example 2.
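One way to see what rowvar=False does is to compare it with transposing the data, and to look at what happens when the flag is left out. The sketch below reuses the column-stacked data from this example.

```python
import numpy as np

x = [1.2, 2.1, 3.3, 4.5]
y = [2.5, 2.9, 3.7, 3.9]
data = np.column_stack((x, y))  # shape (4, 2): 4 observations, 2 variables

by_columns = np.cov(data, rowvar=False)  # columns treated as variables
by_rows = np.cov(data.T)                 # transpose so variables are rows

print(np.allclose(by_columns, by_rows))  # True

# Without rowvar=False, each of the 4 rows is treated as a variable,
# which produces a 4x4 matrix instead of the intended 2x2 one
print(np.cov(data).shape)  # (4, 4)
```

Checking the shape of the result is a simple way to catch a missing rowvar=False when the data is stored column-wise.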