Learning Lung Nodule Malignancy Likelihood from Radiologist Annotations or Diagnosis Data
Lung cancer is the world’s most lethal type of cancer, so early diagnosis is crucial for successful treatment. Computer-aided diagnosis can play an important role in detecting lung nodules and in establishing their malignancy likelihood. This paper contributes to the design of a learning approach based on computed tomography images. Our methodology involves measuring a set of features in the nodular image region and training classifiers, such as K-nearest neighbor or support vector machine (SVM), to compute the malignancy likelihood of lung nodules. For this purpose, we use the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) database because of its size and nodule variability, and because it is publicly available. For training, we used both radiologists’ labels and annotations and diagnosis data, such as biopsy, surgery and follow-up results. We obtained promising results: area under the receiver operating characteristic curve (AUC) values of 0.962 ± 0.005 and 0.905 ± 0.04 were achieved for the radiologists’ data and for the diagnosis data, respectively, using an SVM with an exponential kernel combined with a correlation-based feature selection method.
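As a rough illustration of the pipeline summarized above (features in, malignancy likelihood out, AUC as the figure of merit), the following minimal sketch uses a K-nearest neighbor classifier on synthetic data. Everything in it is an assumption made for illustration: the two toy features, the helper names `knn_likelihood`, `auc` and `make_toy_nodules`, and the choice of k are hypothetical, the data are random draws rather than LIDC-IDRI nodules, and the SVM and correlation-based feature selection steps are not reproduced here.

```python
import math
import random


def knn_likelihood(train_X, train_y, x, k=5):
    """Malignancy likelihood of x: fraction of malignant (label 1) cases
    among the k nearest training nodules in feature space."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda item: math.dist(item[0], x))[:k]
    return sum(label for _, label in nearest) / k


def auc(scores, labels):
    """AUC as the probability that a malignant case receives a higher
    score than a benign one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_toy_nodules(n, rng):
    """Synthetic two-feature 'nodules'. NOT LIDC-IDRI data: malignant
    cases are simply drawn around a shifted center for illustration."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else 0.0
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the toy run is repeatable
train_X, train_y = make_toy_nodules(200, rng)
test_X, test_y = make_toy_nodules(100, rng)

# Score each held-out nodule, then summarize ranking quality with the AUC.
scores = [knn_likelihood(train_X, train_y, x, k=7) for x in test_X]
print(f"AUC on toy data: {auc(scores, test_y):.3f}")
```

The pairwise definition of the AUC used here is quadratic in the number of cases, which is acceptable for a sketch; a rank-based computation would be preferable for larger datasets.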