A Comparative Study of Fuzzy C-Means and Kernel K-Means for Clustering Regencies and Cities in Indonesia Based on Food Demand
Abstract:
Indonesia's large population generates high market demand for food and food products, which makes securing a sufficient, high-quality food supply a persistent challenge. This study groups the 514 regencies/cities in Indonesia by food demand in order to understand community needs and to adapt marketing strategies to market demand. The variables are the quantities of food consumed in each regency/city, divided into eleven food groups. The Fuzzy C-Means and Kernel K-Means algorithms were used to cluster the regions, and the best method and number of clusters were selected with the Silhouette Coefficient validity index: the preferred solution is the one whose Silhouette Coefficient is highest, that is, closest to 1. Under this criterion, Fuzzy C-Means with three clusters performed best, with a Silhouette Coefficient of 0.608436, which indicates that its clusters are cohesive and well separated. Cluster 1 comprises 88 regencies/cities with the highest demand for fruit and other food ingredients. Cluster 2 comprises 224 regencies/cities with the highest demand for eggs and milk, vegetables, fish, oil and coconut, beverage ingredients, spices, and processed foods. Cluster 3 comprises 202 regencies/cities with the highest demand for meat and nuts.
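The selection procedure described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the data are synthetic (three Gaussian blobs over eleven hypothetical variables), the function names `fuzzy_c_means` and `silhouette_coefficient` are introduced here for illustration, the fuzzifier `m = 2` is an assumed default, and fuzzy memberships are converted to hard labels by taking the largest membership, which is also an assumption. Kernel K-Means is omitted; its labels could be scored in the same loop.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Minimal Fuzzy C-Means. Returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each point sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def silhouette_coefficient(X, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return 0.0                               # undefined for one cluster; 0.0 is this sketch's convention
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue                             # singleton cluster: s(i) = 0
        a = D[i, same].sum() / (same.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == k].mean() for k in clusters if k != labels[i])
        s[i] = (b - a) / max(a, b)
    return float(s.mean())

# Synthetic stand-in for the consumption data (NOT the study's data).
rng = np.random.default_rng(42)
true_centers = rng.normal(0.0, 10.0, size=(3, 11))
X = np.vstack([mu + rng.normal(0.0, 1.0, size=(40, 11)) for mu in true_centers])

# Score each candidate number of clusters and keep the highest silhouette.
scores = {}
for c in range(2, 6):
    _, U = fuzzy_c_means(X, c)
    labels = U.argmax(axis=1)                    # assumed hardening rule
    scores[c] = silhouette_coefficient(X, labels)

best_c = max(scores, key=scores.get)
print(scores, "best number of clusters:", best_c)
```

In practice the variables would normally be standardized before clustering so that food groups measured on larger scales do not dominate the distances; whether the study did so is not stated in the abstract.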
Keywords:
Market Demand; Food Ingredients; Cluster Analysis; Fuzzy C-Means; Kernel K-Means; Silhouette Coefficient