8.1 Fundamentals

8.1.1 Bias-Variance Tradeoff

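Let \(y=f(x)+\varepsilon, \; E(\varepsilon)=0\), where \(f\) is the true model, \(\hat f\) is the model obtained from one particular training run, and \(E(\hat f)\) is the expected prediction of the trained model, averaged over training sets.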

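\[ \begin{aligned} E[(\hat f-y)^2] &= E[(\hat f - E(\hat f) + E(\hat f)-y)^2] \\ &= E[(\hat f - E(\hat f))^2] + E[(E(\hat f)-y)^2] + 2E[(\hat f - E(\hat f))(E(\hat f)-y)] \\ &= E[(\hat f - E(\hat f))^2] + E[(E(\hat f)-y)^2] \\ &= E[(\hat f - E(\hat f))^2] + E[(E(\hat f)-f-\varepsilon)^2] \\ &= E[(\hat f - E(\hat f))^2] + (E(\hat f) - f)^2 + E(\varepsilon^2) \end{aligned} \]

The cross term vanishes because \(E[\hat f - E(\hat f)] = 0\) and the noise \(\varepsilon\) in a new observation is independent of \(\hat f\). The last step uses \(E(\varepsilon)=0\), so \(E(\varepsilon^2)=Var(\varepsilon)\).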

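Hence the model's expected generalization error (here, the expected squared error) decomposes into variance + squared bias + noise.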
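The decomposition can be checked numerically. The following is a minimal sketch under assumed settings, not part of the original notes: the true value is a constant `mu`, the noise is Gaussian, and the estimator is a deliberately biased shrunken sample mean. The names `simulate` and `shrink` and all parameter values are illustrative.

```python
import random
import statistics


def simulate(mu=2.0, sigma=1.0, n=10, shrink=0.8, trials=20000, seed=0):
    """Monte Carlo check of E[(f_hat - y)^2] = variance + bias^2 + noise.

    Assumed toy setting: f(x) = mu for every x, noise ~ N(0, sigma^2),
    and f_hat = shrink * (mean of n training observations).
    """
    rng = random.Random(seed)
    preds, sq_errors = [], []
    for _ in range(trials):
        # One training run: draw a fresh training set and fit the estimator.
        train = [mu + rng.gauss(0.0, sigma) for _ in range(n)]
        f_hat = shrink * statistics.fmean(train)
        # A new observation whose noise is independent of the training set.
        y_new = mu + rng.gauss(0.0, sigma)
        preds.append(f_hat)
        sq_errors.append((f_hat - y_new) ** 2)

    mean_pred = statistics.fmean(preds)                        # estimate of E(f_hat)
    variance = statistics.fmean([(p - mean_pred) ** 2 for p in preds])
    bias_sq = (mean_pred - mu) ** 2
    noise = sigma ** 2                                         # E(eps^2), known here
    return statistics.fmean(sq_errors), variance, bias_sq, noise


mse, variance, bias_sq, noise = simulate()
print(f"expected squared error : {mse:.4f}")
print(f"variance + bias^2 + noise: {variance + bias_sq + noise:.4f}")
```

In this toy setting the theoretical values are variance = shrink² σ² / n = 0.064, squared bias = ((shrink − 1) μ)² = 0.16, and noise = σ² = 1, so the two printed numbers should both be close to 1.224 up to Monte Carlo error.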

8.1.2 Evaluation Metrics

Classification

Accuracy

\[ Accuracy = \frac{TP + TN}{TP + TN + FP +FN} \]

Precision: of the samples predicted positive, the fraction that are truly positive (are there false alarms?)

\[ Precision = \frac{TP}{TP+FP} \]

Recall: of the truly positive samples, the fraction that are predicted positive (are there misses?)

\[ Recall = \frac{TP}{TP+FN} \]

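\(F_1\) and \(F_\beta\)

\[ \begin{aligned} F_1 &= \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \\ F_\beta &= \frac{(1+\beta^2) \cdot Precision \cdot Recall}{\beta^2 \cdot Precision + Recall} \end{aligned} \]

When \(0<\beta<1\), precision has more influence; when \(\beta>1\), recall has more influence; \(\beta=1\) recovers \(F_1\).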
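The effect of \(\beta\) is easy to see on concrete counts. A minimal sketch, assuming made-up confusion-matrix counts; the function name `precision_recall_fbeta` is illustrative:

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Compute precision, recall and F_beta from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta ** 2 * precision + recall
    # Convention assumed here: F_beta is 0 when precision and recall are both 0.
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Hypothetical counts: precision = 8/10 = 0.8, recall = 8/16 = 0.5.
for beta in (0.5, 1.0, 2.0):
    p, r, f = precision_recall_fbeta(tp=8, fp=2, fn=8, beta=beta)
    print(f"beta={beta}: precision={p:.2f} recall={r:.2f} F_beta={f:.4f}")
```

Because precision exceeds recall in this example, \(F_{0.5}\) is pulled toward the higher precision and \(F_2\) toward the lower recall, so \(F_{0.5} > F_1 > F_2\).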

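ROC curve and AUC: false positive rate (FPR) on the horizontal axis, true positive rate (TPR) on the vertical axis; summarizes overall performance across all classification thresholds

\[ \begin{aligned} TPR &= \frac{TP}{TP+FN} \\ FPR &= \frac{FP}{FP+TN} \end{aligned} \]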
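One way to trace the ROC curve is to sort samples by score and sweep the threshold from high to low, recording (FPR, TPR) at each step. The sketch below assumes binary labels in {0, 1} with at least one sample of each class, and uses the trapezoidal rule for the area; the function names and the toy data are illustrative.

```python
def roc_curve_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step.

    Assumes labels are 0/1 and both classes are present.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(roc_curve_points(labels, scores)))
```

In this toy data 8 of the 9 positive-negative pairs are ranked correctly, and the AUC equals that fraction, 8/9.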

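PR curve and its AUC: recall on the horizontal axis, precision on the vertical axis; focuses on the quality of the positive predictions

Under class imbalance, the PR curve is more sensitive than the ROC curve and can expose problems the ROC curve hides: FPR divides FP by the large number of negatives and barely moves, whereas precision compares FP directly with TP.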
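A small arithmetic sketch of this effect, assuming a hypothetical classifier whose operating point (TPR = 0.9, FPR = 0.1) stays fixed while the number of negatives grows. The function name `precision_at` and the counts are illustrative.

```python
def precision_at(tpr, fpr, n_pos, n_neg):
    """Precision implied by a fixed ROC operating point and class counts."""
    tp = tpr * n_pos
    fp = fpr * n_neg
    return tp / (tp + fp)


# Same point on the ROC curve, different class balance.
for n_neg in (100, 1_000, 10_000):
    prec = precision_at(tpr=0.9, fpr=0.1, n_pos=100, n_neg=n_neg)
    print(f"negatives={n_neg:>6}: TPR=0.9 FPR=0.1 precision={prec:.4f}")
```

The ROC point is identical in all three cases, while precision falls from 0.9 with balanced classes to below 0.1 when negatives outnumber positives 100 to 1.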

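Cost curve: incorporates misclassification costs
Macro-average: given multiple confusion matrices, compute the metric on each matrix first, then average the metrics
Micro-average: given multiple confusion matrices, average the matrices element-wise (TP, FP, TN, FN) first, then compute the metric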
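The two averaging schemes can differ substantially when the confusion matrices have very different sizes. A minimal sketch for precision, assuming each confusion matrix is reduced to a hypothetical `(tp, fp)` pair:

```python
def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def macro_micro_precision(confusions):
    """confusions: list of (tp, fp) pairs, one per confusion matrix."""
    # Macro: metric per matrix first, then average the metrics.
    macro = sum(precision(tp, fp) for tp, fp in confusions) / len(confusions)
    # Micro: pool the counts first, then compute the metric once.
    # Summing and averaging the counts give the same ratio.
    total_tp = sum(tp for tp, _ in confusions)
    total_fp = sum(fp for _, fp in confusions)
    micro = precision(total_tp, total_fp)
    return macro, micro


# A large class with precision 0.9 and a small class with precision 0.1.
macro, micro = macro_micro_precision([(90, 10), (1, 9)])
print(f"macro={macro:.4f} micro={micro:.4f}")
```

Macro-averaging weights every matrix equally (0.5 here), while micro-averaging is dominated by the matrix with more samples (91/110, about 0.83).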

Regression

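Mean squared error (MSE): sensitive to outliers
Root mean squared error (RMSE): same units as the target variable
Mean absolute error (MAE): less sensitive to outliers than MSE
\(R^2\) and \(R^2_{adj}\)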
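The notes list these metrics without formulas. The sketch below uses the standard definitions, with \(R^2 = 1 - SS_{res}/SS_{tot}\) and \(R^2_{adj} = 1 - (1-R^2)(n-1)/(n-p-1)\), where \(p\) is the number of predictors. The function name, the argument `n_features`, and the toy data are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """MSE, RMSE, MAE, R^2 and adjusted R^2 using the standard definitions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)

    mse = ss_res / n
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes extra predictors; requires n > n_features + 1.
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": r2,
        "r2_adj": r2_adj,
    }


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
for name, value in regression_metrics(y_true, y_pred, n_features=1).items():
    print(f"{name}: {value:.4f}")
```

For this toy data the residual sum of squares is 0.1 and the total sum of squares is 10, giving MSE = 0.02, MAE = 0.12, and \(R^2\) = 0.99.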

Other

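\[ AIC = -2\ln L(\hat \theta)_{max} + 2k \]

\(L(\hat \theta)_{max}\) is the maximized likelihood and \(k\) is the number of parameters.

\[ BIC = -2\ln L(\hat \theta)_{max}+ k\ln(n) \]

\(n\) is the sample size.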
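A minimal sketch of both criteria, assuming the maximized log-likelihood \(\ln L(\hat\theta)_{max}\) has already been computed by some fitting procedure. The function names and the numbers are illustrative, not taken from a fitted model.

```python
import math


def aic(log_likelihood, k):
    """AIC = -2 ln L + 2k, with log_likelihood = ln L at the maximum."""
    return -2 * log_likelihood + 2 * k


def bic(log_likelihood, k, n):
    """BIC = -2 ln L + k ln(n), with n the sample size."""
    return -2 * log_likelihood + k * math.log(n)


# Hypothetical fit: ln L = -120, 3 parameters, 100 samples.
print(aic(-120.0, k=3))
print(bic(-120.0, k=3, n=100))
```

Lower values are better for both criteria. BIC penalizes each parameter by \(\ln(n)\) instead of 2, so it favors smaller models than AIC once \(\ln(n) > 2\), that is, for \(n \ge 8\).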