A novel method is described for obtaining superior classification performance over a variable range of classification costs. By analysing a set of existing classifiers using a receiver operating characteristic (ROC) curve, a set of new realisable classifiers may be obtained by a random combination of two of the existing classifiers. These classifiers lie on the convex hull that contains the original ROC points for the existing classifiers. This hull is the maximum realisable ROC (MRROC). A theorem for this method is derived and proved from an observation about ROC data, and experimental results verify that a superior classification system may be constructed using only the existing classifiers and the information in the original ROC data. This new system is shown to produce the MRROC, and as such provides a powerful technique for improving classification systems in problem domains in which classification costs may not be known a priori. Empirical results are presented for artificial data and for two real-world data sets: an image segmentation task and the diagnosis of abnormal thyroid conditions.
Keywords:
Receiver Operating Characteristic, Classification, Maximum Realisable Receiver Operating Characteristic, Medical Diagnosis, Image Segmentation
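
The random combination described in the abstract can be illustrated with a minimal sketch. It assumes one plausible reading of the method: for each input, the combined system uses classifier B with probability p and classifier A otherwise, so its expected ROC point lies on the line segment joining the ROC points of A and B. The function names, the (false positive rate, true positive rate) point convention, and the example ROC values are all hypothetical and are not taken from the paper.

```python
import random


def roc_upper_hull(points):
    """Upper convex hull of ROC points given as (fp_rate, tp_rate).

    The trivial classifiers (0, 0) and (1, 1) are added as hull end points.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for pt in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (pt[1] - y1) - (y2 - y1) * (pt[0] - x1)
            if cross >= 0:
                # hull[-1] lies on or below the chord hull[-2] -> pt: drop it
                hull.pop()
            else:
                break
        hull.append(pt)
    return hull


def mixing_probability(point_a, point_b, target_fp):
    """Probability p of using B so that the expected FP rate equals target_fp.

    Returns (p, expected_tp) for the randomised combination of A and B.
    """
    (fp_a, tp_a), (fp_b, tp_b) = point_a, point_b
    if fp_a == fp_b:
        raise ValueError("the two ROC points must have different FP rates")
    p = (target_fp - fp_a) / (fp_b - fp_a)
    if not 0.0 <= p <= 1.0:
        raise ValueError("target_fp must lie between the two FP rates")
    return p, tp_a + p * (tp_b - tp_a)


def random_combination(classify_a, classify_b, p, rng=random):
    """Classifier that defers to B with probability p and to A otherwise."""
    def combined(x):
        return classify_b(x) if rng.random() < p else classify_a(x)
    return combined


# Hypothetical ROC points for three existing classifiers.
roc_points = [(0.1, 0.6), (0.3, 0.7), (0.5, 0.9)]
hull = roc_upper_hull(roc_points)

# Operate at FP rate 0.3 by mixing the hull neighbours (0.1, 0.6) and (0.5, 0.9).
p, expected_tp = mixing_probability((0.1, 0.6), (0.5, 0.9), 0.3)
```

In this made-up example the classifier at (0.3, 0.7) falls below the hull, and mixing its two neighbours gives a higher expected true positive rate at the same false positive rate. That is the sense in which points on the hull are realisable using only the existing classifiers and their ROC data.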