A Method for Identifying Japanese Shop and Company Names by Spatiotemporal Cleaning of Eccentrically Located Frequently Appearing Words
Table 13
Processing accuracy of removal of noise words (Data consists of 1000 samples extracted randomly from web data using the Hot Pepper API from within Tokyo prefecture).
Number of samples
1000
Is it necessary to remove noise words from names, as determined by a manual check?
Yes: 545
No: 455
Can we get the same result as manual processing using the FAW dictionary?
Yes: 67
No: 478
Can we get the same result as manual processing using the dictionary of geographic names and station names?
Yes: 237
No: 241
Can we get the same result as manual processing after LFAW removal?
Yes: 81
No: 160
Do pure names remain after all noise word removal processing?
Yes: 409
No: 46
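Sum total
Number of data processed successfully, by outcome:
FAW dictionary, Yes: 67
Dictionary of geographic names and station names, Yes: 237
LFAW removal, Yes: 81
LFAW removal, No: 0
Pure names remain, Yes: 409
Pure names remain, No: 0
Total: 794
Processing accuracy (%): 79.40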
“Hot Pepper” is a famous free coupon magazine in Japan, produced by Recruit Co., Ltd. Using the Hot Pepper API, we can collect information about many kinds of shops, companies, restaurants, and so forth.
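The arithmetic behind Table 13 can be checked with a short sketch. It assumes, as the counts themselves suggest, that each removal step is applied only to the samples the previous step failed on, and that the 455 samples needing no removal are scored by whether a pure name remains. The variable names are illustrative and do not come from the paper.

```python
# Counts transcribed from Table 13. Variable names are illustrative only.
total_samples = 1000
needs_removal = {"yes": 545, "no": 455}   # manual check
faw = {"yes": 67, "no": 478}              # FAW dictionary
geo = {"yes": 237, "no": 241}             # geographic and station names
lfaw = {"yes": 81, "no": 160}             # LFAW removal
pure = {"yes": 409, "no": 46}             # pure name remains afterwards

# Assumed cascade: each step only sees what the previous step left unresolved.
assert sum(needs_removal.values()) == total_samples
assert sum(faw.values()) == needs_removal["yes"]
assert sum(geo.values()) == faw["no"]
assert sum(lfaw.values()) == geo["no"]
assert sum(pure.values()) == needs_removal["no"]

# Successes are the "yes" leaves; the two "no" leaves contribute 0.
successes = faw["yes"] + geo["yes"] + lfaw["yes"] + pure["yes"]
failures = lfaw["no"] + pure["no"]
accuracy = 100 * successes / total_samples

print(f"{successes} / {total_samples} = {accuracy:.2f}%")
```

Under this reading, the 206 failures split into 160 names whose noise words survived all three removal steps and 46 names that needed no cleaning but did not come through the processing as pure names.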