Improvement on Permutation importance example in release highlights #17313
Closed
Description
Describe the issue linked to the documentation
When I looked at the example given here, I was confused about why the features are not sorted by importance.
Suggest a potential alternative/fix
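import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(random_state=0, n_features=5,
                           n_informative=3)
rf = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=10, random_state=0,
                                n_jobs=-1)
feature_names = np.array([f'x_{i}' for i in range(X.shape[1])])

fig, ax = plt.subplots()
# Sort ascending by mean importance so the most important feature is drawn at the top
sorted_idx = result.importances_mean.argsort()
ax.boxplot(result.importances[sorted_idx].T,
           vert=False, labels=feature_names[sorted_idx])
ax.set_title("Permutation Importance of each feature")
ax.set_ylabel("Features")
fig.tight_layout()
plt.show()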
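To make the sorting step concrete, here is a minimal sketch of what `argsort` does to the importances and the labels. The array values are made up for illustration and are not output from `permutation_importance`; only the indexing pattern matches the suggestion above.

```python
import numpy as np

# Hypothetical importances: 3 features x 4 repeats (made-up numbers).
importances = np.array([
    [0.02, 0.01, 0.03, 0.02],  # x_0
    [0.30, 0.28, 0.32, 0.30],  # x_1
    [0.10, 0.12, 0.09, 0.09],  # x_2
])
feature_names = np.array(["x_0", "x_1", "x_2"])

# argsort returns the indices that order the mean importances ascending.
sorted_idx = importances.mean(axis=1).argsort()

# Index rows and labels with the same array so they stay aligned.
sorted_importances = importances[sorted_idx]
sorted_names = feature_names[sorted_idx]

print(sorted_idx)    # indices in ascending order of mean importance
print(sorted_names)  # labels reordered the same way
```

Because a horizontal box plot draws the first box at the bottom, an ascending sort places the most important feature at the top of the figure.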
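Also, for clarity, maybe we can set n_redundant=0. With the default n_redundant=2, the two remaining features are linear combinations of the informative ones, so all five carry signal. Setting it to 0 leaves two pure-noise features and emphasises that permutation_importance identifies the 3 informative features precisely.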
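import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# n_redundant=0 leaves 3 informative features and 2 noise features
X, y = make_classification(random_state=0, n_features=5,
                           n_informative=3, n_redundant=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=10, random_state=0,
                                n_jobs=-1)
feature_names = np.array([f'x_{i}' for i in range(X.shape[1])])

fig, ax = plt.subplots()
# Sort ascending by mean importance so the most important feature is drawn at the top
sorted_idx = result.importances_mean.argsort()
ax.boxplot(result.importances[sorted_idx].T,
           vert=False, labels=feature_names[sorted_idx])
ax.set_title("Permutation Importance of each feature")
ax.set_ylabel("Features")
fig.tight_layout()
plt.show()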
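One way to check which features the ranking should favour is sketched below. It is not part of the proposed example, and it assumes `shuffle=False`, which this issue does not use. With that setting `make_classification` places the informative features in the first `n_informative` columns, so the noise columns are known in advance. The names `ranking` and `informative_cols` are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Assumption: shuffle=False keeps the informative features in columns 0..2
# and the noise features in columns 3..4.
X, y = make_classification(random_state=0, n_features=5,
                           n_informative=3, n_redundant=0, shuffle=False)
informative_cols = {0, 1, 2}

rf = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=10, random_state=0)

# Rank features from most to least important by mean importance.
ranking = result.importances_mean.argsort()[::-1]
for idx in ranking:
    kind = "informative" if int(idx) in informative_cols else "noise"
    print(f"x_{idx} ({kind}): "
          f"{result.importances_mean[idx]:.3f} "
          f"+/- {result.importances_std[idx]:.3f}")
```

If the noise features appear at the bottom of the printed ranking, that supports the point made above, though the exact values depend on the scikit-learn version and the random state.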

