Classification Report - Precision and F-score are ill-defined

This is not an error, just a warning: not all of your labels are included in your y_pred, i.e. there are labels in your y_test that your classifier never predicts.

Here is a simple reproducible example:

from sklearn.metrics import precision_score, f1_score, classification_report

y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'

precision_score(y_true, y_pred, average='macro') 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
0.16666666666666666

precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333

precision_score(y_true, y_pred, average=None) 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])

The exact same warnings are produced for f1_score (not shown).

In practice, this only warns you that, in the classification_report, the respective values for labels with no predicted samples (here, label 2) will be set to 0:

print(classification_report(y_true, y_pred))


              precision    recall  f1-score   support

           0       0.50      1.00      0.67         2
           1       0.00      0.00      0.00         2
           2       0.00      0.00      0.00         2

   micro avg       0.33      0.33      0.33         6
   macro avg       0.17      0.33      0.22         6
weighted avg       0.17      0.33      0.22         6

[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
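As a side note, more recent scikit-learn versions (0.22+) let you control this behaviour explicitly through the zero_division argument; a minimal sketch:

# zero_division=0 keeps the current behaviour (undefined metrics set to 0) but suppresses the warning;
# zero_division=1 would set them to 1 instead
print(classification_report(y_true, y_pred, zero_division=0))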

"When I was not using np.array in the past it worked just fine"

Highly doubtful, since the example above uses plain Python lists, not NumPy arrays...


It means that some labels are present only in the training data and some labels are present only in the test data. Run the following code to inspect the distribution of labels in the train and test sets:

from collections import Counter
# a label that appears in one split but is missing from the other explains the warning
print(Counter(y_train))
print(Counter(y_test))

Use a stratified train_test_split to avoid the situation where some labels are present only in the test set.
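A minimal sketch, assuming X is your feature matrix and y your label vector (both names are placeholders for your own data):

from sklearn.model_selection import train_test_split

# stratify=y keeps the class proportions the same in both splits,
# so every label in y ends up in both y_train and y_test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)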

It might have worked in the past simply because of the random splitting of the dataset; hence, stratified splitting is always recommended.

The first situation, where the model simply never predicts certain labels, is more about model fine-tuning or the choice of model.