Research Article

An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

Algorithm 3

Reduce(key2, medi).
Input:
key2: the index of the cluster,
 medi: the list of the local sums for that cluster, collected from different nodes.
Output: <key3, value3> pairs,
key3: the index of the cluster,
value3: the new cluster center.
(1)Construct a counter Num to record the total number of samples belonging to the same cluster;
(2)Construct an array sum_v to record the per-dimension sums of the samples
 in the same cluster (i.e., the samples summarized in the list medi);
(3)Construct a sample instance to hold the data object extracted from medi.next(), and a variable
 dimensions to hold the dimensionality of the original data object;
(4) Num = 0;
(5)while (medi.hasNext()) do
(6)  CurrentPoint = medi.next();
(7)  Num += num_s; //num_s: the sample count carried by CurrentPoint
(8)  for i = 0 to dimensions - 1 do
(9)   sum_v[i] += CurrentPoint.point[i];
(10) end for
(11) end while
(12) for i = 0 to dimensions - 1 do
(13)  mean[i] = sum_v[i]/Num;
(14) //Obtain the new cluster center
(15) end for
(16) index = key2;
(17) Construct value3 as a string composed of the new cluster center;
(18) return <key3, value3> pairs;
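
As an illustration of Algorithm 3, the following Python sketch merges the local sums of one cluster and derives the new center. It is a minimal sketch under stated assumptions, not the authors' Hadoop implementation: the function name reduce_cluster, the (num_s, point) record layout of each entry in medi, and the comma-separated encoding of the center string are all hypothetical. The mean is computed once, after every local sum has been accumulated, which yields the same center as averaging all samples of the cluster directly.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# Hypothetical record layout (an assumption, not taken from the paper):
# (num_s, point), where num_s is the number of samples folded into the
# local sum and point holds the per-dimension sums of those samples.
LocalSum = Tuple[int, Sequence[float]]


def reduce_cluster(cluster_index: int, medi: Iterable[LocalSum]) -> Tuple[int, str]:
    """Merge the local sums of one cluster and return (index, center string)."""
    num = 0                                # total number of samples (Num)
    sum_v: Optional[List[float]] = None    # per-dimension running sums

    for num_s, point in medi:
        if sum_v is None:
            # The first entry fixes the dimensionality of the data objects.
            sum_v = [0.0] * len(point)
        elif len(point) != len(sum_v):
            raise ValueError("local sums have inconsistent dimensions")
        num += num_s
        for i, x in enumerate(point):
            sum_v[i] += x

    if sum_v is None or num == 0:
        raise ValueError("cluster has no samples")

    # New cluster center: per-dimension sum divided by the sample count.
    mean = [s / num for s in sum_v]

    # Assumed encoding: the center as a comma-separated string.
    center = ",".join(repr(m) for m in mean)
    return cluster_index, center


if __name__ == "__main__":
    # Two local sums, each covering two samples of cluster 1.
    index, center = reduce_cluster(1, [(2, [2.0, 4.0]), (2, [6.0, 8.0])])
    print(index, center)  # prints: 1 2.0,3.0
```

Raising an error for an empty cluster is a design choice of this sketch; a production job might instead keep the previous center for that cluster.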