CLUSTER ANALYSIS WITH WEIGHTED BINARY VARIABLES

M K Kamundi, J M Kihoro, S M Mwalili, B Kiula

Abstract


The objective of this study was to discover the unique groupings (clusters) that result from performing cluster analysis with weighted binary variables and binary proximity measures. Cluster analysis techniques were applied both to simulated binary data and to real survey data originally collected to measure ICT penetration among people in a certain county council in Kenya. For the survey data, only a few indicators (binary variables) were selected for this study. For the simulated data, the clustering variables were based on ownership of a mobile phone, a desktop, a laptop, and a palmtop; for the survey data, they were based on usage of mobile data processing, mobile internet, computer internet, and computer data processing. The names used in both data sets were fictitious. Ten clusters were identified for the simulated unweighted binary data, compared with four for the simulated weighted binary data. Twelve clusters were identified for the unweighted survey data, compared with seven for the weighted survey data. For both data sets, weighting the binary variables produced clusters that were markedly different from those obtained without weighting. Weighting was useful in expressing that some variables are more important than others, and the clusters formed from the weighted variables reflected the importance of those variables.
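As an illustration of how weighting binary variables can change cluster membership, the following is a minimal sketch and not the authors' procedure. It assumes a weighted simple-matching dissimilarity and a single-linkage grouping with a fixed cutoff; the data rows, the weights, the cutoff, and the function names are all hypothetical and chosen only for the example.

```python
from itertools import combinations


def weighted_matching_distance(a, b, weights):
    """Share of the total weight carried by variables on which a and b disagree."""
    total = sum(weights)
    mismatch = sum(w for x, y, w in zip(a, b, weights) if x != y)
    return mismatch / total


def threshold_clusters(rows, weights, cutoff):
    """Single-linkage grouping: link rows whose distance is <= cutoff (union-find)."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(rows)), 2):
        if weighted_matching_distance(rows[i], rows[j], weights) <= cutoff:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(rows)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical ownership data; columns: mobile phone, desktop, laptop, palmtop.
rows = [
    (1, 0, 0, 0),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
    (0, 0, 0, 1),
    (1, 0, 0, 1),
]
equal_weights = [1, 1, 1, 1]
palmtop_heavy = [1, 1, 1, 5]  # assumption: palmtop ownership treated as most important

unweighted_clusters = threshold_clusters(rows, equal_weights, cutoff=0.25)
weighted_clusters = threshold_clusters(rows, palmtop_heavy, cutoff=0.25)
print(unweighted_clusters)
print(weighted_clusters)
```

With equal weights, every row is linked to another row by a single mismatch, so all five rows fall into one group. With the heavier palmtop weight, a disagreement on palmtop ownership alone exceeds the cutoff, and the palmtop owners separate from the rest. The direction and size of the change depend entirely on the chosen weights and cutoff, so this toy result should not be read as reproducing the cluster counts reported in the study.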
