Research Article
An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division
Input:
    key: the offset,
    value: the sample,
    centroids: the global variable.
Output: <key′, value′>,
    key′: the index of the closest centroid,
    value′: the information of the sample.
(1) Construct a global variable centroids including the information of the closest centroid point;
(2) Construct the sample examples to extract the data objects from value;
(3) min_Dis = Double.MAX_VALUE;
(4) index = −1;
(5) for i = 0 to centroids.length − 1 do
(6)     distance = DistanceFunction(examples, centroids[i]);
(7)     if distance < min_Dis then
(8)         min_Dis = distance;
(9)         index = i;
(10)    end if
(11) end for
(12) key′ = index;
(13) Construct value′ as a string consisting of the values of different dimensions;
(14) return <key′, value′> pairs;
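The map steps (2) through (14) above can be sketched in plain Java as follows. This is a minimal illustration, not the authors' implementation: it leaves out the Hadoop Mapper API, and the class name NearestCentroidMapper, the helper closestIndex, the comma-separated sample format, and the use of squared Euclidean distance in place of DistanceFunction are all assumptions made for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-alone sketch of the map step; not tied to the Hadoop API.
class NearestCentroidMapper {

    // Assumption: squared Euclidean distance stands in for DistanceFunction.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    // Steps (3)-(11): scan all centroids and keep the index of the nearest one.
    static int closestIndex(double[] sample, double[][] centroids) {
        double minDis = Double.MAX_VALUE;
        int index = -1;
        for (int i = 0; i < centroids.length; i++) {
            double dis = distance(sample, centroids[i]);
            if (dis < minDis) {   // strict comparison: ties keep the earlier centroid
                minDis = dis;
                index = i;
            }
        }
        return index;
    }

    // Steps (2) and (12)-(14): parse the sample, then emit <key', value'>.
    // Assumption: value holds one sample as comma-separated numbers.
    // The offset key is accepted but unused, as in the pseudocode.
    static Map.Entry<Integer, String> map(long offset, String value, double[][] centroids) {
        String[] fields = value.trim().split(",");
        double[] sample = new double[fields.length];
        for (int d = 0; d < fields.length; d++) {
            fields[d] = fields[d].trim();
            sample[d] = Double.parseDouble(fields[d]);
        }
        int key = closestIndex(sample, centroids);
        String outValue = String.join(",", fields);  // value': the dimension values as a string
        return new SimpleEntry<>(key, outValue);
    }
}
```

With centroids {{0, 0}, {10, 10}}, calling map(0L, "1.0, 2.0", centroids) yields the pair (0, "1.0,2.0"), because the sample lies closer to the first centroid.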