ROC-AUC score with overriding and cross-validation
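Computing the ROC-AUC (area under the ROC curve) score requires predicted probabilities rather than predicted class labels. By default, cross_val_predict calls the classifier's predict method, which returns labels. To obtain a ROC-AUC score anyway, one can simply subclass the classifier and override its predict method so that it behaves like predict_proba.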
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

class LogisticRegressionWrapper(LogisticRegression):
    # Override predict so that it returns class probabilities instead of labels
    def predict(self, X):
        return super(LogisticRegressionWrapper, self).predict_proba(X)

# Noisy binary classification problem (flip_y=0.5 randomly reassigns half of the labels)
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, flip_y=0.5)
log_reg_clf = LogisticRegressionWrapper(C=0.1, class_weight=None, dual=False,
                                        fit_intercept=True)
# Column 1 holds the out-of-fold probability of the positive class
y_hat = cross_val_predict(log_reg_clf, X, y)[:, 1]
print("ROC-AUC score: {}".format(roc_auc_score(y, y_hat)))
Output (the exact value varies from run to run, since no random seed is set):
ROC-AUC score: 0.724972396025
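As a possible alternative to subclassing, scikit-learn's cross_val_predict in sklearn.model_selection accepts a method argument, so the probabilities can be requested directly. The following is a minimal sketch under that assumption, not necessarily the approach the original example intends. The random_state value and the simplified LogisticRegression(C=0.1) constructor are choices made here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

# random_state=0 is an assumption added here so that the run is repeatable
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2,
                           flip_y=0.5, random_state=0)

# Plain classifier, no wrapper subclass needed
log_reg_clf = LogisticRegression(C=0.1)

# method="predict_proba" yields one column per class for every sample,
# each row predicted by a model that did not see that sample during fitting
y_proba = cross_val_predict(log_reg_clf, X, y, method="predict_proba")

# Column 1 is the probability of the positive class (label 1)
y_hat = y_proba[:, 1]
auc = roc_auc_score(y, y_hat)
print("ROC-AUC score: {}".format(auc))
```

This keeps predict untouched, so the same estimator can still be used wherever class labels are expected.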