Majority Voting as Ensemble Classifier for Cervical Cancer Classification
Abstract
Cervical cancer is one of the deadliest cancers in women. Early detection through the evaluation of pap smear cell images is one strategy for reducing cervical cancer cases. Commonly used classification methods include SVM, MLP, and K-NN, each of which has a weakness: SVM is inefficient on large datasets; in MLP, large amounts of data increase the complexity of each layer and therefore lengthen the weighting process; and K-NN is inefficient for data with a large number of attributes. Ensemble methods are one technique for overcoming the limitations of a single classification method, because they combine the performance of several classification methods. This study proposes a majority voting ensemble for cervical cancer classification based on pap smear images from the Herlev dataset. Majority voting integrates the test results of the SVM, MLP, and K-NN methods by assigning each test sample the class predicted by the majority of the three classifiers. The results show that the accuracy of the ensemble method was 1.72% higher than the average accuracy of SVM, MLP, and K-NN. The sensitivity of the ensemble method was 0.74% higher than the average of the three single classification methods, and the ensemble method increased specificity by 3.4%. It can be concluded that the majority voting ensemble improves on the performance of the single classification methods in classifying cervical cancer abnormalities from pap smear images.
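The voting step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `majority_vote` and `binary_metrics`, the binary encoding (1 = abnormal, 0 = normal), and the six toy predictions are all assumptions made for the example and are not taken from the study or from the Herlev dataset.

```python
from collections import Counter


def majority_vote(*predictions):
    """Combine per-sample labels from several classifiers by majority vote."""
    if not predictions:
        raise ValueError("need at least one list of predictions")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all prediction lists must have the same length")
    combined = []
    for votes in zip(*predictions):
        # With three voters and two classes there is always a strict majority.
        # With more classes a tie is possible; Counter.most_common then keeps
        # the label seen first, i.e. the vote of the earliest classifier.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical toy labels (1 = abnormal, 0 = normal); not data from the study.
y_true = [1, 1, 1, 0, 0, 0]
svm = [1, 1, 0, 0, 0, 1]  # stand-in for SVM test predictions
mlp = [1, 0, 1, 0, 1, 0]  # stand-in for MLP test predictions
knn = [0, 1, 1, 0, 0, 0]  # stand-in for K-NN test predictions

ensemble = majority_vote(svm, mlp, knn)

for name, pred in [("SVM", svm), ("MLP", mlp), ("K-NN", knn), ("Ensemble", ensemble)]:
    acc, sens, spec = binary_metrics(y_true, pred)
    print(f"{name:8s} accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case the three classifiers make their errors on different samples, so each error is outvoted by the other two predictions. That is the situation in which majority voting helps; when the base classifiers tend to fail on the same samples, the vote cannot correct them.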
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.