CLUSTER ANALYSIS WITH WEIGHTED BINARY VARIABLES

M K Kamundi, J M Kihoro, S M Mwalili, B Kiula

Abstract


The objective of this study was to discover the unique groupings (clusters) that result from performing cluster analysis with weighted binary variables and binary proximity measures. Cluster analysis techniques were applied both to simulated binary data and to real survey data originally collected to measure ICT penetration among people in a certain county council in Kenya. For the survey data, only a few indicators (binary variables) were selected for this study. For the simulated data, the clustering variables were based on ownership of a Mobile Phone, a Desktop, a Laptop and a Palmtop; for the survey data, they were based on usage of Mobile Data Processing, Mobile Internet, Computer Internet and Computer Data Processing. For both the simulated and the survey data, the names used were fictitious. Ten clusters were identified for the simulated unweighted binary data, whereas four were identified for the simulated weighted binary data. Twelve clusters were identified for the unweighted survey data, whereas seven were identified for the weighted survey data. For both data sets, weighting the binary variables produced markedly different and unique clusters. Weighting was useful in showing that some variables are more important than others: when cluster analysis was performed with the weighted binary variables, the resulting clusters reflected the importance of certain variables.
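To illustrate how weighting binary variables can change the clusters that emerge, the following is a minimal sketch and not the procedure used in the study. It assumes a weighted simple-matching dissimilarity and single-linkage grouping at a fixed threshold; the abstract does not state which proximity measure, linkage method or weights were used. The five ownership profiles, the weights and the threshold are hypothetical values chosen only for illustration.

```python
from itertools import combinations

def weighted_mismatch(a, b, weights):
    """Weighted simple-matching dissimilarity between two binary vectors.

    Returns the total weight of the variables on which a and b disagree,
    divided by the total weight, so the result lies in [0, 1].
    """
    return sum(w for x, y, w in zip(a, b, weights) if x != y) / sum(weights)

def single_linkage_clusters(rows, weights, threshold):
    """Group rows that are linked by a chain of pairs whose
    dissimilarity is at or below the threshold (single linkage)."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent pointers to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(rows)), 2):
        if weighted_mismatch(rows[i], rows[j], weights) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(rows)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical profiles: (Mobile Phone, Desktop, Laptop, Palmtop)
rows = [
    (1, 0, 0, 0),  # 0: mobile phone only
    (1, 0, 0, 1),  # 1: mobile phone and palmtop
    (1, 1, 0, 0),  # 2: mobile phone and desktop
    (1, 1, 1, 0),  # 3: mobile phone, desktop and laptop
    (0, 0, 0, 0),  # 4: none
]

equal_weights = [1, 1, 1, 1]      # unweighted case
assumed_weights = [1, 4, 4, 1]    # hypothetical: computers count for more

unweighted = single_linkage_clusters(rows, equal_weights, threshold=0.2)
weighted = single_linkage_clusters(rows, assumed_weights, threshold=0.2)
print(len(unweighted), unweighted)
print(len(weighted), weighted)
```

Under these assumed weights, profiles that differ only in the low-weight variables (Mobile Phone, Palmtop) merge into one group, while differences in Desktop or Laptop ownership keep profiles apart, so the weighted run yields fewer clusters than the unweighted one on this toy data.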
