Review Article

Current State of Non-wearable Sensor Technologies for Monitoring Activity Patterns to Detect Symptoms of Mild Cognitive Impairment to Alzheimer’s Disease

Table 4

Summary of performance metrics for the best-performing machine learning technique reported in each study.

| Study | Target | Machine learning technique | Model evaluation including cross-validation method | Classifier metrics | Regression metrics | Key observations | Software used for analysis |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Alberdi et al. [22] | Predict the absolute test score using sensor-derived features (regression) | SVR (RBF) | For both regression and classification analysis, 10-fold CV was performed, and models were tested with the internal dataset gathered; no mention of evaluating the model on any external dataset. | NA | Correlation coefficient (r): 0.55; MAE: 5 | For mobility tests, TUG demonstrated a moderate to strong correlation with sensor-derived behavioral features. | (i) RStudio for computing time series statistics; (ii) Weka for prediction modeling (correlation and classification analysis) |
| Alberdi et al. [22] | Detect a reliable change in health assessment scores; no decline (positive class) vs. decline (negative class) | RF (on PCA-reduced dataset) | For both regression and classification analysis, 10-fold CV was performed, and models were tested with the internal dataset gathered; no mention of evaluating the model on any external dataset. | Recall: 28%; F-score: 0.33; AUC-ROC: 0.65; AUC-PRC: 0.54 | NA | A person's improvement or decline in the mobility domain, detected from sensor-derived features, indicates early symptoms of MCI. | (i) RStudio for computing time series statistics; (ii) Weka for prediction modeling (correlation and classification analysis) |
| Schinle et al. [23] | Normal vs. slight anomaly vs. severe anomaly (activity) | Local outlier factor | NS | Accuracy: 90.9% for wake-up time and 93.2% for bed time | NA | A deviation (outlier) in any of the learned wake-up time, bed time, or night-time activity profiles indicates an anomaly. | NS |
| Sharma et al. [24] | Normal vs. abnormal (routine) | RNN | No CV was reported for evaluating the performance of the predictor. CV was included in the training process of the RNN model that filled in missing values. | NS | NA | Abnormality was detected using a daily routine vector composed of sensor values. Classifier performance was not explained. | NS |
| Akl et al. [25] | CIN vs. MCI | Affinity propagation | As a first step, the model was trained and tested with an 80 : 20 split; as a second step, to check the validity of the model, a leave-one-subject-out CV was performed using only the 22 subjects who transitioned to MCI during the study period. No mention of any external dataset for evaluation. | F0.5 score: 0.789 | NA | A time frame of 20 weeks was found to generate the room activity distributions most conducive to detecting MCI in older adults. | NS |
| Akl et al. [26] | MCI vs. CH | SVM (RBF) | The entire dataset was divided into three groups for a 3-fold CV such that each group had approximately the same total number of data points (feature vectors) pertaining to each class (cognitively intact and MCI). No mention of any external dataset for evaluation. | AUC-ROC: 0.97; AUC-PRC: 0.93 | NA | Walking speed-related features were more effective in predicting the MCI condition than any other features. Analyzing activity features over a time window of 24 weeks yielded the best performance. | MATLAB |
| Albeiruti et al. [27] | Normal vs. abnormal (behavior trend) | HMM | An initial behavioral model was created with the entire dataset. The resulting model was then used as a standard model and compared with the days of the same dataset one by one. It is unclear what ground truth was used to evaluate the outcomes of this approach. | NS | NA | A person's movement transitions from one location to another were used for behavior modeling. Classifier performance was not explained. | MATLAB |
| Lussier et al. [28] | Predict the MCI diagnosis variable using the sensor-based iADL measures and expert-rated performance scores (regression) | Regression analysis | NS | NA | R²: 0.47; : 22.01 () | Sensor-based iADL measures represent time spent in activities related to mobility, hygiene, and cooking. | IBM SPSS |
| Arifoglu et al. [29] | Abnormal vs. normal behavior | LSTM | The entire dataset was divided into 3 partitions, one each for training, validation, and testing, by a fixed number of days, i.e., 139 days : 15 days : 70 days out of 224 days of monitoring. No mention of any external dataset for evaluation. | Sensitivity: 98.67%; Specificity: 75.48% | NA | The study reports that LSTMs are more suitable for detecting repetition and order-related abnormal activities, since they can relate the current input with the upcoming ones. | Keras and Theano implementations of CNNs and LSTM |
| Paudel et al. [30] | CH vs. MCI | RF | A 10-fold CV was performed with the internal dataset gathered; no mention of any external dataset for model evaluation. | Accuracy: 73%; Precision: 73%; Sensitivity: 73%; Specificity: 73%; F-score: 0.73; AUC-ROC: 0.72 | NA | | Python using scikit-learn |
| Gochoo et al. [31] | Direct vs. pacing vs. lapping vs. random (travel pattern) | Deep CNN | A 10-fold CV was performed with the internal dataset gathered; no mention of any other external dataset for model testing/evaluation. | Accuracy: 97.84%; Precision: 97.9%; Sensitivity: 97.8%; Specificity: 99.3%; F-score: 0.978 | NA | | NS |
| Li et al. [32] | Dementia vs. nondementia (nondementia includes CH and MCI) | Bayesian network | Leave-one-out CV was performed with the internal dataset gathered; no mention of any other external dataset for model testing/evaluation. | Precision: 98.3%; Sensitivity: 98.3%; AUC-ROC: 0.851 | NA | The study is based on the premise that the moving trajectories of older adults with dementia differ from those of older adults without dementia. | NS |
| Dawadi et al. [33] | Map the sensor-derived activity features to the direct observation scores (regression) | SVM regression (with bagging) | NS | NA | Correlation coefficient (r): 0.58 | The correlation (r) between smart home sensor-derived features and task accuracy scores was found to be statistically significant. | |
| Dawadi et al. [33] | MCI vs. CH | SVM (with cost-sensitive learning) | Leave-one-out CV with the internal dataset gathered; no mention of any external dataset for evaluation. | F-score (class A): 0.37; F-score (class B): 0.78 | NA | Classification performance was not strong because individuals in these two groups overlap considerably in functional performance (activities). | |
| Dawadi et al. [33] | Dementia vs. CH | SVM | Leave-one-out CV with the internal dataset gathered; no mention of any external dataset for evaluation. | F-score (class A): 0.93; F-score (class B): 0.99 | NA | Classification performance was best because individuals in these two groups exhibited a vast difference in performing the scripted tasks/activities. | |

AUC: area under the curve; CH: cognitively healthy; CIN: cognitively intact; CNN: Convolutional Neural Network; CV: cross-validation; HMM: Hidden Markov Model; iADL: instrumental activities of daily living; LSTM: long short-term memory; MAE: mean absolute error; MCI: mild cognitive impairment; NA: not applicable; NS: not specified; PCA: principal component analysis; PRC: precision-recall curve; RBF: radial basis function; RF: Random Forest; ROC: receiver operating characteristic curve; SVM: Support Vector Machine; SVR: Support Vector Regression; TUG: Timed Up and Go.
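
For readers comparing the classifier metrics in Table 4, the following minimal Python sketch shows how accuracy, precision, sensitivity (recall), specificity, and the F-score are conventionally derived from the counts of a binary confusion matrix. The function name `classifier_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies; the studies themselves may have used different conventions (for example, per-class or averaged metrics in multi-class settings). Setting `beta=1.0` gives the usual F-score, and `beta=0.5` gives an F0.5 score, which weights precision more heavily than recall.

```python
def classifier_metrics(tp, fp, tn, fn, beta=1.0):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives (hypothetical inputs for illustration).
    beta: weight of recall relative to precision in the F-beta score.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F-beta is the weighted harmonic mean of precision and recall.
    b2 = beta ** 2
    denom = b2 * precision + sensitivity
    f_beta = (1 + b2) * precision * sensitivity / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_beta": f_beta,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
f1_example = classifier_metrics(tp=40, fp=10, tn=45, fn=5)
f05_example = classifier_metrics(tp=40, fp=10, tn=45, fn=5, beta=0.5)
```

Because these metrics emphasize different kinds of error, a single study can report high sensitivity alongside much lower specificity, which is one reason the table lists them separately rather than relying on accuracy alone.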