Research Article

An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

Algorithm 1

Map(key, value).
Input:
 key: the offset,
 value: the sample,
 centroids: the global variable.
Output: <key', value'>,
 key': the index of the closest centroid,
 value': the information of the sample.
(1) Construct a global variable centroids including the information of the closest centroid point;
(2) Construct the sample object examples by extracting the data objects from value;
(3) min_Dis = Double.MAX_VALUE;
(4) index = −1;
(5) for i = 0 to centroids.length − 1 do
(6)   distance = DistanceFunction(examples, centroids[i]);
(7)   if distance < min_Dis then
(8)     min_Dis = distance;
(9)     index = i;
(10)   end if
(11) end for
(12) key' = index;
(13) Construct value' as a string consisting of the values of different dimensions;
(14) return <key', value'> pairs;
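
The map step above can be illustrated with a minimal Java sketch. It is not the authors' implementation: it runs without Hadoop, and several details are assumptions made only for illustration. The class name NearestCentroidMapper and the helper parseSample are hypothetical, DistanceFunction is taken to be the Euclidean distance, and each sample is assumed to arrive as a comma-separated string of coordinates.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-alone sketch of Algorithm 1 (no Hadoop dependency).
class NearestCentroidMapper {

    // Assumption: Euclidean distance stands in for DistanceFunction.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Assumption: value is a comma-separated list of coordinates (step 2).
    static double[] parseSample(String value) {
        String[] parts = value.trim().split(",");
        double[] sample = new double[parts.length];
        for (int d = 0; d < parts.length; d++) {
            sample[d] = Double.parseDouble(parts[d].trim());
        }
        return sample;
    }

    // Returns <key', value'>: the index of the closest centroid and the sample.
    // The input key (the offset) is not used in the computation.
    static Map.Entry<Integer, String> map(long key, String value, double[][] centroids) {
        double[] sample = parseSample(value);
        double minDis = Double.MAX_VALUE;              // step 3
        int index = -1;                                // step 4
        for (int i = 0; i < centroids.length; i++) {   // steps 5 to 11
            double dist = distance(sample, centroids[i]);
            if (dist < minDis) {
                minDis = dist;
                index = i;
            }
        }
        // Step 13: build value' as a string of the per-dimension values.
        StringBuilder sb = new StringBuilder();
        for (int d = 0; d < sample.length; d++) {
            if (d > 0) sb.append(',');
            sb.append(sample[d]);
        }
        return new SimpleEntry<>(index, sb.toString()); // steps 12 and 14
    }

    public static void main(String[] args) {
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        Map.Entry<Integer, String> out = map(0L, "9.0, 8.5", centroids);
        System.out.println(out.getKey() + "\t" + out.getValue()); // prints "1	9.0,8.5"
    }
}
```

In an actual Hadoop job the same loop would sit inside a Mapper's map method, with the centroids loaded once per task rather than passed as an argument; that wiring is omitted here to keep the sketch self-contained.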