where θ is a constant. By varying the threshold θ it is possible to “tune” the false positive and true positive rates generated by C. The ROC curve is a two-dimensional curve: the coordinates (x, y) of its points represent the false positive rate and the true positive rate obtained by C for different values of the threshold θ. The ROC curve is a good way to visualize a classifier’s performance and helps in choosing a suitable operating point, i.e., a value of the decision threshold θ for which the classifier C attains the desired trade-off between FP and TP. However, when comparing different classification algorithms it is often desirable to obtain a single number, instead of a graph, as a measure of classification performance [17]. Therefore, the AUC is used as an estimate of classification performance: the higher the AUC, the better the classification performance of the classifier C. In particular, the AUC is an estimate of the probability P(µp(zp) > µp(zn)), where zp ∈ ωp is a generic positive pattern and zn ∈ ωn is a generic negative pattern. In other words, the AUC estimates the probability that the classifier scores positive patterns higher than negative ones [17].

Many other methods for estimating and comparing the performance of classifiers exist. We refer the reader to [51, 29, 44, 57] for a more complete discussion and for details on estimating classification accuracy and performing model selection.
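As a concrete illustration of the ROC/AUC definitions above, the following minimal Python sketch (our own example, not part of the original formulation) sweeps a decision threshold θ over classifier scores to trace ROC points, and estimates the AUC as the fraction of positive/negative pairs in which the positive pattern receives the higher score. The function names and the Gaussian toy scores are assumptions made purely for illustration.

import numpy as np

def roc_points(scores_pos, scores_neg, thresholds):
    """For each threshold theta, label a pattern positive when its score
    exceeds theta, and return the resulting (FP rate, TP rate) pairs."""
    points = []
    for theta in thresholds:
        tpr = np.mean(scores_pos > theta)   # true positive rate at this theta
        fpr = np.mean(scores_neg > theta)   # false positive rate at this theta
        points.append((fpr, tpr))
    return points

def auc_pairwise(scores_pos, scores_neg):
    """Estimate P(score of a positive > score of a negative) by comparing
    every positive/negative pair; ties count as one half."""
    pos = np.asarray(scores_pos)[:, None]
    neg = np.asarray(scores_neg)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

# Toy data (assumed): positives tend to score higher than negatives.
rng = np.random.default_rng(0)
scores_pos = rng.normal(1.0, 1.0, size=500)   # scores for patterns in omega_p
scores_neg = rng.normal(0.0, 1.0, size=500)   # scores for patterns in omega_n
print(auc_pairwise(scores_pos, scores_neg))   # close to the theoretical value of about 0.76

The pairwise estimate above is equivalent to the area under the empirical ROC curve obtained by sweeping θ over all observed scores; sweeping a finite grid of thresholds with roc_points only approximates that curve.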
