Problems with data scaling in sklearn.cross_decomposition.PLSRegression #6002

@ghost

Description

I am trying to fit some spectral data using PLS, and am having difficulties with the module.

Essentially, when I use the default value of scale=True, I get a prediction, but all my predictions are scaled, and I cannot figure out how to convert them back to my original data space. The same is true for the example code below: as you can see, the scaled prediction is clearly off. I assumed that I should be able to revert to the "unscaled" data using pls_scaled.y_mean_ and pls_scaled.y_std_, but that doesn't seem to work for me. Any suggestions would be highly appreciated.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

Y = np.array([[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]])
X = np.array([[0., 0., 1.], [1., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
pls_not_scaled = PLSRegression(n_components=2, scale=False)
pls_scaled = PLSRegression(n_components=2, scale=True)
pls_not_scaled.fit(X, Y)
Out[]: PLSRegression(copy=True, max_iter=500, n_components=2, scale=False, tol=1e-06)
pls_scaled.fit(X,Y)
Out[]: PLSRegression(copy=True, max_iter=500, n_components=2, scale=True, tol=1e-06)
Y_not_scaled_pred = pls_not_scaled.predict(X)
Y_scaled_pred = pls_scaled.predict(X)

print(Y)
[[  0.1  -0.2]
 [  0.9   1.1]
 [  6.2   5.9]
 [ 11.9  12.3]]

print(Y_not_scaled_pred)
[[  0.11029323  -0.09323388]
 [  0.86825958   0.77077371]
 [  6.24049125   6.31999396]
 [ 11.88095594  12.1024662 ]]

print(Y_scaled_pred)
[[ 1.52568016  1.47577156]
 [ 2.43367318  2.3584298 ]
 [ 6.25638942  6.26638408]
 [ 8.88425724  8.99941456]]

For those with a more visual bent:

import matplotlib.pyplot as plt

plt.grid(True)
plt.scatter(Y[:, 0], Y_not_scaled_pred[:, 0], color='blue')
plt.scatter(Y[:, 0], Y_scaled_pred[:, 0], color='red')
plt.xlabel('Measured')
plt.ylabel('Predicted')
plt.legend(['not_scaled', 'scaled'], loc='lower right')

figure_1
