In scikit-learn 0.14.1 it was possible to pass lists of strings as y_true and y_pred, together with a list of strings as the labels argument of classification_report, and it worked as expected: only the labels from that list were included in the report. This no longer works in scikit-learn master.
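A minimal sketch of the kind of call in question. The sample data is made up for illustration, and whether the call succeeds or raises depends on the installed scikit-learn version, so the outcome is captured rather than assumed:

```python
from sklearn.metrics import classification_report

# Illustrative data only (not taken from a real project).
y_true = ["cat", "dog", "bird", "cat", "dog"]
y_pred = ["cat", "dog", "cat", "cat", "bird"]

try:
    # String labels with string targets: the usage described above.
    outcome = classification_report(y_true, y_pred, labels=["cat", "dog"])
except Exception as exc:
    # Some versions may reject this call; record the error instead of crashing.
    outcome = "classification_report failed: %r" % (exc,)

print(outcome)
```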
It was never documented that this should work: the docs say that labels is an "Optional list of label indices to include in the report." So, according to the docs, the behavior was undefined when y consists of strings and a labels argument is passed: the caller has no correct indices to pass in that case.
It seems better either to raise an error when labels is passed and y has not been pre-transformed by a LabelEncoder, or to restore and document the 0.14.1 behavior. What do you think?
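For reference, here is a rough sketch of what pre-transforming y with a LabelEncoder could look like, so that labels really is a list of indices as the docs describe. The data and the `wanted` list are hypothetical, and this is only one possible way to do it, not necessarily the intended usage:

```python
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelEncoder

# Illustrative data only.
y_true = ["cat", "dog", "bird", "cat", "dog"]
y_pred = ["cat", "dog", "cat", "cat", "bird"]

# Fit on every label that occurs, so both arrays share one encoding.
encoder = LabelEncoder().fit(y_true + y_pred)

# Hypothetical subset of classes to include in the report.
wanted = ["cat", "dog"]
wanted_idx = encoder.transform(wanted)  # integer indices for those classes

report = classification_report(
    encoder.transform(y_true),
    encoder.transform(y_pred),
    labels=wanted_idx,    # indices, matching the documented meaning
    target_names=wanted,  # display names, in the same order as labels
)
print(report)
```

This keeps the string names in the report via target_names while passing integer indices as labels, at the cost of the caller having to manage the encoder.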