Clustering Indonesian Provinces Based on Sustainable Development Goals Welfare Indicators Using Partitioning Around Medoids

Authors' Information:

Nabilla Aulia Jenny Dewi Rahmawati1, Yuciana Wilandari2, Diah Safitri*3

1,2,3 Department of Statistics, Faculty of Science and Mathematics, Universitas Diponegoro

Vol 03 No 05 (2026): Volume 03 Issue 05 May 2026

Page No.: 135-142

Abstract:

This study classifies interprovincial welfare disparities in Indonesia using selected Sustainable Development Goals (SDGs) indicators and identifies groups of provinces with similar welfare characteristics. Cluster analysis was applied to secondary data from 38 provinces in 2024, using eight welfare-related SDG indicators covering food security, health, education, and access to basic services. These indicators were selected to represent key dimensions of human well-being and to reflect the multidimensional nature of regional development. Before analysis, the data were standardized using the z-score method to ensure comparability across indicators with different measurement scales and to prevent indicators with larger variances from dominating the distance calculations. The Partitioning Around Medoids (PAM) algorithm was employed because it is robust to outliers: its cluster centers (medoids) are actual observations rather than averages. The optimal number of clusters was identified using the Gap Statistic method, which yielded a partition of the provinces into two distinct clusters. Cluster 1 comprises provinces with relatively better welfare conditions across the selected indicators. In contrast, Cluster 2 consists of provinces with lower welfare performance and higher vulnerability, with outliers effectively accommodated within the clustering structure. These findings highlight substantial disparities in welfare across provinces and underscore the need for differentiated policy responses tailored to regional characteristics and development priorities. The results also give policymakers an empirical basis for designing more targeted, evidence-based interventions to support the achievement of the SDGs at both national and subnational levels. Nevertheless, further research is recommended to incorporate additional indicators, explore alternative clustering techniques, and examine temporal dynamics in order to capture changes in welfare more comprehensively.
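
The workflow described in the abstract (z-score standardization, PAM clustering, and selection of the number of clusters with the Gap Statistic) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy data, the function names (zscore, pam, within_dispersion, gap_statistic), the Euclidean distance, the first-improvement SWAP rule, the uniform reference distribution over the bounding box of the data, and the 20 reference draws are all assumptions made for illustration.

```python
import math
import random


def zscore(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def euclid(a, b):
    """Euclidean distance between two points (an assumed dissimilarity)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(dist, medoids):
    """Sum of every point's distance to its nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(points, k):
    """Return (medoid indices, cluster labels) for k medoids."""
    n = len(points)
    dist = [[euclid(p, q) for q in points] for p in points]
    medoids = []
    # BUILD: greedily add the observation that lowers the total cost most.
    while len(medoids) < k:
        medoids.append(min((i for i in range(n) if i not in medoids),
                           key=lambda i: total_cost(dist, medoids + [i])))
    # SWAP: accept any medoid/non-medoid exchange that lowers the cost,
    # and stop once a full pass finds no improvement.
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                if total_cost(dist, trial) < total_cost(dist, medoids) - 1e-12:
                    medoids, improved = trial, True
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]])
              for i in range(n)]
    return medoids, labels


def within_dispersion(points, labels):
    """W_k: per cluster, summed pairwise squared distances / cluster size."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        pair_sum = sum(euclid(a, b) ** 2
                       for i, a in enumerate(members) for b in members[i + 1:])
        total += pair_sum / len(members)
    return total


def gap_statistic(points, k, n_ref=20, seed=0):
    """Gap(k) = mean(log W_k on uniform reference data) - log W_k on data."""
    rng = random.Random(seed)
    cols = list(zip(*points))
    lows, highs = [min(c) for c in cols], [max(c) for c in cols]
    log_w = math.log(within_dispersion(points, pam(points, k)[1]))
    ref_logs = []
    for _draw in range(n_ref):
        # Reference set: same size as the data, uniform over its bounding box.
        ref = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
               for _point in points]
        ref_logs.append(math.log(within_dispersion(ref, pam(ref, k)[1])))
    return sum(ref_logs) / n_ref - log_w


# Hypothetical toy data: six "provinces" measured on two indicators.
toy = [[1.0, 20.0], [1.2, 22.0], [0.9, 21.0],
       [5.0, 80.0], [5.3, 82.0], [4.8, 79.0]]
z = zscore(toy)
medoids, labels = pam(z, 2)
gaps = {k: gap_statistic(z, k) for k in (1, 2)}
print(medoids, labels, gaps)
```

Because each medoid is an index into the data, every cluster center corresponds to an actual province rather than an average, which is the property the abstract cites as the source of PAM's robustness to outliers. The sketch only compares Gap values across candidate k; a full analysis would also apply a standard-error rule when choosing k.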

Keywords:

Sustainable Development Goals, Welfare, Clustering, Partitioning Around Medoids, Gap Statistic
