|
Dataset | Validation scheme | Method | Accuracy |
|
IAVID-1 | Splits (70-30) | Proposed technique | 81.43% |
IAVID-1 | Splits (70-30) | C3D features with SVM classifier [17] | 48.77% |
IAVID-1 | Splits (70-30) | C3D features with CNN [17] | 40.0% |
IAVID-1 | Splits (70-30) | HOG representation of MHI with nearest neighbor classifier [20] | 63.5% |
IAVID-1 | Splits (70-30) | HOG and LBP representation of MHI with SVM classifier [31] | 55% |
IAVID-1 | Splits (70-30) | Harris 3D and HOG 3D with BOE [32] | 26.67% |
IAVID-1 | Splits (70-30) | Harris 3D, HOG/HOF, BoF with MCV-ELM [29] | 13.33% |
IAVID-1 | Splits (70-30) | Harris 3D, HOG/HOF, BoF with MV-ELM [30] | 13.33% |
|
MuHAVI-Uncut | LOAO | Proposed technique | 93.66% |
MuHAVI-Uncut | LOAO | HOG representation of MHI with nearest neighbor classifier [20] | 84.1% |
MuHAVI-Uncut | LOAO | Observable Markov model [33] | 83.90% |
MuHAVI-Uncut | LOAO | The sequence of key poses [34] | 81.50% |
MuHAVI-Uncut | LOAO | Learning discriminative key poses [35] | 56.70% |
MuHAVI-Uncut | LOCO | Proposed technique | 82.04% |
MuHAVI-Uncut | LOCO | Deep spatiotemporal representation of MHI with MCV-ELM [29] | 74.75% |
MuHAVI-Uncut | LOCO | Deep spatiotemporal representation of MHI with MV-ELM [30] | 74.75% |
MuHAVI-Uncut | LOCO | HOG representation of MHI with nearest neighbor classifier [20] | 52.2% |
MuHAVI-Uncut | LOCO | The sequence of key poses [34] | 50.4% |
MuHAVI-Uncut | LOCO | Learning discriminative key poses [35] | 31.4% |
MuHAVI-Uncut | LOSO | Proposed technique | 97.02% |
MuHAVI-Uncut | LOSO | HOG representation of MHI with nearest neighbor classifier [20] | 96.6% |
MuHAVI-Uncut | LOSO | The sequence of key poses [34] | 86.5% |
MuHAVI-Uncut | LOSO | Learning discriminative key poses [35] | 56.6% |
|
IXMAS | LOSO | Proposed technique | 71.94% |
IXMAS | LOSO | Substructure and boundary modeling [36] | 76.5% |
IXMAS | LOSO | Self-organizing map of action poses and fuzzy distance for MLP [37] | 89.9% |
IXMAS | LOSO | The sequence of key poses [34] | 85.9% |
IXMAS | LOSO | Multiview spatiotemporal histogram [38] | 81.4% |
IXMAS | LOSO | Spatiotemporal volumes (3DSTVs) mapped to 4D [39] | 78% |
IXMAS | LOCO | Proposed technique | 74.52% |
IXMAS | LOCO | Spatiotemporal visual words to learn SVM model [40] | 57.30% |
IXMAS | LOCO | 3D grid to learn HMM model for action recognition [41] | 57.90% |
IXMAS | LOCO | Sphere and rectangular feature trees with nearest neighbor classifier [42] | 72.60% |
IXMAS | LOCO | Histogram of silhouettes, horizontal and vertical optical flow for action recognition [43] | 58.10% |
|