Credit Score Prediction Using Machine Learning Algorithms

Khai Wah Khaw

Abstract


Credit scores play a vital role in the banking and financial industry because they serve as the primary tool for assessing a borrower's creditworthiness and the probability of loan default. Accurate assessment enables financial institutions to reduce financial risk while maximizing returns. With the rapid growth of the financial industry and the increasing volume of large-scale data, traditional approaches to credit scoring have become increasingly limited and less effective. This study uses a real-world data set obtained from Kaggle, containing 100,000 customer records with a range of demographic and financial features. Six machine learning algorithms are applied: K-Nearest Neighbors (KNN), Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), Logistic Regression (LR), and Random Forest (RF). The study process comprises data preprocessing, exploratory data analysis, data splitting, modeling, and performance evaluation using accuracy, precision, recall, F1 score, and ROC-AUC. The results show that the Random Forest model achieves the best performance, with the highest average score among the models. These findings demonstrate the potential of machine learning as an effective decision-support tool for credit risk assessment in the financial industry.
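The workflow summarized in the abstract (data splitting, fitting six classifiers, and scoring each on five metrics) might look roughly like the sketch below. It is not the author's code. It assumes scikit-learn, a synthetic binary data set from `make_classification` standing in for the Kaggle records, an 80/20 stratified split, default hyperparameters, and feature scaling for KNN, SVM, and LR. None of these choices are stated in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data replaces the study's Kaggle data set.
X, y = make_classification(n_samples=1000, n_features=10,
                           n_informative=5, random_state=0)

# Assumption: 80/20 stratified split (the abstract does not give the ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The six algorithm families named in the abstract, with default settings.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "NB": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(),
                         SVC(probability=True, random_state=0)),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }


results = {name: evaluate(model) for name, model in models.items()}

# Report each metric plus the unweighted mean used to rank the models.
for name, scores in results.items():
    mean_score = sum(scores.values()) / len(scores)
    line = ", ".join(f"{k}={v:.3f}" for k, v in scores.items())
    print(f"{name}: {line}, mean={mean_score:.3f}")
```

Ranking by the unweighted mean of the five metrics is one reading of "highest average score". A data set with more than two credit classes would need averaged variants of these metrics, for example `average="macro"` and `multi_class="ovr"`.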

Keywords


Machine Learning; Credit Score; Random Forest; Support Vector Machine; Decision Tree





Copyright (c) 2026 Khai Wah Khaw


Published by:

AIBPM Publisher

Editorial Office:

JL. Kahuripan No. 9 Hotel Sahid Montana, Malang, Indonesia
Phone: +62 341 366222
Email: admin.ssem@gmail.com
Website: https://ejournal.aibpmjournals.com/index.php/ssem

Supported by: Association of International Business & Professional Management

If you are interested in a journal subscription, you can contact us at admin.publisher@gmail.com

E-ISSN : 3032-324X

DOI: Prefix 10.32535 by CrossREF


INDEXED:

In Process

 

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.