ROC-AUC score with overriding and cross-validation

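Computing the ROC-AUC (area under the curve) score requires predicted probabilities rather than hard class labels. cross_val_predict calls the classifier's predict method by default, which returns labels. To get a ROC-AUC score, one can simply subclass the classifier and override predict so that it behaves like predict_proba.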

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

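class LogisticRegressionWrapper(LogisticRegression):
    # Return class probabilities instead of labels, so that
    # cross_val_predict collects probabilities for every sample.
    def predict(self, X):
        return super(LogisticRegressionWrapper, self).predict_proba(X)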

X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, flip_y=0.5)

log_reg_clf = LogisticRegressionWrapper(C=0.1, class_weight=None, dual=False,
             fit_intercept=True)

# Column 1 holds the predicted probability of the positive class (label 1)
y_hat = cross_val_predict(log_reg_clf, X, y)[:, 1]

print("ROC-AUC score: {}".format(roc_auc_score(y, y_hat)))

Output (the exact value varies between runs, since no random seed is set):

ROC-AUC score: 0.724972396025
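
For comparison, here is a minimal sketch of the same computation without a subclass. It assumes a scikit-learn version in which cross_val_predict accepts the method argument (sklearn.model_selection, 0.18 or later). The random_state and cv=5 values are assumptions added for repeatability and are not part of the original example. The sketch also shows cross_val_score with scoring="roc_auc", which reports one AUC per fold instead of a single AUC over the pooled out-of-fold predictions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.metrics import roc_auc_score

# random_state is an assumption added here so that runs are repeatable
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2,
                           flip_y=0.5, random_state=0)

clf = LogisticRegression(C=0.1)

# Ask cross_val_predict for probabilities directly; no wrapper class needed
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")
y_hat = proba[:, 1]  # predicted probability of the positive class

# One AUC computed over all out-of-fold predictions pooled together
pooled_auc = roc_auc_score(y, y_hat)
print("ROC-AUC score (pooled): {}".format(pooled_auc))

# One AUC per fold, using the built-in "roc_auc" scorer
fold_aucs = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("ROC-AUC score per fold: {}".format(fold_aucs))
```

The pooled score and the mean of the per-fold scores are generally close but not identical, because the pooled version ranks predictions across folds while the per-fold version ranks them only within each fold.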