Breast cancer is a deadly disease that affects women worldwide. Disease detection and analysis is a significant part of data mining research, and classification, an essential data mining procedure, supports the clinical diagnosis and analysis of this disease. In this study, we propose a new method based on neuro-fuzzy classification. We applied the method to three benchmark breast cancer datasets from the UCI Machine Learning Repository: Wisconsin Breast Cancer (WBC), Wisconsin Diagnostic Breast Cancer (WDBC), and Mammographic Mass (MM). Our goal was to diagnose and analyze breast cancer with the proposed method and then compare its performance with two well-known supervised classification algorithms, Multilayer Perceptron and Support Vector Machine. We evaluated the classifiers using accuracy, the Kappa statistic, true positive rate, false positive rate, precision, recall, and F-measure. The proposed method achieved an accuracy of 99.4% on the WBC dataset, 97.7% on the WDBC dataset, and 84.4% on the MM dataset, and it outperformed the Multilayer Perceptron and Support Vector Machine classification models on every measure.

Data mining applications can be used in medical science and bioinformatics research for the diagnosis of critical diseases [1, 2]. Apart from AIDS, breast cancer has likely become one of the most intensely studied life-threatening illnesses in the search for a cure during the current decade [3]. Breast cancer is a type of cancer that arises from cells in human breast tissue, usually from the lobules or inner lining...

...... middle of paper ......

...ral Information Processing Systems 9, USA: MIT Press, pp. 162–168, 1997.
[26] J. Scott Armstrong and Fred Collopy, “Error measures for generalizing about forecasting methods: Empirical comparisons.” International Journal of Forecasting, vol. 8, pp. 69–80, 1992.
[27] Jean Carletta, “Assessing agreement on classification tasks: the kappa statistic.” Computational Linguistics, MIT Press, Cambridge, MA, USA, vol. 22, no. 2, pp. 249–254, 1996.
[28] Stephen V. Stehman, “Selecting and interpreting measures of thematic classification accuracy.” Remote Sensing of Environment, vol. 62, no. 1, pp. 77–89, 1997.
[29] Wisconsin Breast Cancer Dataset (Original), UCI Machine Learning Archive, July 1992.
[30] Wisconsin Breast Cancer (Diagnostic) Dataset, UCI Machine Learning Archive, November 1995.
[31] Mass Mammography Dataset, UCI Machine Learning Archive, October 2007.