Review Article

Deep Learning on Computational-Resource-Limited Platforms: A Survey

Table 1

Representative research works from the perspective of underlying principles.

Underlying issue | Representative research works | Techniques
Memory overhead induced by an oversized network | [39] | Weight matrix compression of a pretrained network through clustering: merging similar functions in the hypothesis space
  | [56] | Weight pruning of a pretrained network: removing the weights that contribute little to fitting functions in the hypothesis space
  | [39, 58] | Sparse training: lasso regularization, structured sparsity regularization
  | [68] | Computational optimization on digital computers: fine-grained utilization of memory
Time or energy overhead induced by backpropagation, memory operations, and hyperparameter tuning | [37, 39, 49] | Algorithmic design to avoid computational redundancy: depthwise separable convolution, avoidance of im2col reordering, factorized matrix-vector multiplication based on SVD and Tucker-2
  | [37] | Caching on digital computers: reuse of intermediate convolution results to avoid redundant computation
  | [39, 40] | Parallelization on digital processors: FPGA, GPGPU
  | [37, 40, 53] | Full utilization of digital processors: profiling and fine-tuning of CPU or GPGPU code
  | [59] | Avoidance of frequent memory operations through Boolean logic minimization
  | [41] | Hyperparameter tuning using a Gaussian process
Curse of dimensionality | [53] | SVD decomposition of the weight matrix
  | [60] | Data embedding
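
To make a few of the techniques in Table 1 concrete, the sketches that follow illustrate them in Python on toy data. The first corresponds to the weight-clustering compression listed for [39]: a minimal sketch, assuming the pretrained layer is available as a NumPy array and that scikit-learn is installed; the function name, cluster count, and layer shape are illustrative choices, not taken from the cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_compress(weights, n_clusters=16):
    """Compress a weight matrix by k-means clustering (minimal sketch).

    Every weight is replaced by the centroid of its cluster, so the layer can
    be stored as one small index per weight plus a tiny codebook of centroids.
    """
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
    codebook = km.cluster_centers_.ravel()        # shared weight values (centroids)
    indices = km.labels_.reshape(weights.shape)   # per-weight codebook index
    reconstructed = codebook[indices]             # dense weights rebuilt from the codebook
    return reconstructed, codebook, indices

# Toy usage: a hypothetical pretrained 256 x 128 dense layer.
W = np.random.randn(256, 128).astype(np.float32)
W_hat, codebook, idx = cluster_compress(W, n_clusters=16)
print("reconstruction error:", np.mean((W - W_hat) ** 2))
print("storage: %d indices + %d centroids" % (idx.size, codebook.size))
```

With 16 clusters each index needs only 4 bits, so the layer shrinks to a 4-bit index map plus a 16-entry float codebook instead of a full 32-bit float per weight.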
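The weight-pruning entry for [56] removes weights that contribute little to the fitted function. A common realization is magnitude-based pruning, sketched below; the global magnitude threshold and the 90% sparsity level are assumptions for illustration, and in practice the pruned network is usually fine-tuned afterwards.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights of a pretrained layer (sketch).

    Weights whose absolute value falls below the `sparsity` quantile are set to
    zero; the survivors and their positions can then be kept in a sparse format,
    shrinking the memory footprint of the layer.
    """
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

W = np.random.randn(512, 512).astype(np.float32)   # hypothetical pretrained layer
W_pruned, mask = magnitude_prune(W, sparsity=0.9)
print("kept %.1f%% of the weights" % (100.0 * mask.mean()))
```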
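Among the techniques for time and energy overhead, depthwise separable convolution replaces a single k x k convolution over every input-output channel pair with a per-channel k x k depthwise convolution followed by a 1 x 1 pointwise convolution. The sketch below only counts multiply-accumulate operations to show the saving; the feature-map size and channel counts are illustrative.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a standard k x k convolution on an h x w map."""
    return h * w * c_in * c_out * k * k

def separable_macs(h, w, c_in, c_out, k):
    """MACs of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 56 x 56 feature map, 128 -> 256 channels, 3 x 3 kernel.
std = conv_macs(56, 56, 128, 256, 3)
sep = separable_macs(56, 56, 128, 256, 3)
print("standard: %.1f M MACs, separable: %.1f M MACs, ratio: %.1fx"
      % (std / 1e6, sep / 1e6, std / sep))
```

For these dimensions the separable form needs roughly 8 to 9 times fewer multiply-accumulates, which is where the reported savings on resource-limited hardware come from.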
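Both the SVD-based factorized matrix-vector multiplication listed for [37, 39, 49] and the SVD decomposition of the weight matrix listed for [53] rest on the same low-rank idea: approximate a dense weight matrix by the product of two thin factors obtained from a truncated SVD. A minimal sketch follows, using a random matrix as a stand-in for a pretrained layer; the rank is an illustrative choice.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Factor a dense m x n weight matrix into two thin matrices (sketch).

    W is approximated by A @ B with A of shape (m, rank) and B of shape
    (rank, n), so the product W @ x costs about rank * (m + n) multiplies
    instead of m * n.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

W = np.random.randn(1024, 1024).astype(np.float32)  # hypothetical dense layer
A, B = low_rank_factorize(W, rank=64)
x = np.random.randn(1024).astype(np.float32)
print("relative error:", np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x))
print("parameters: %d -> %d" % (W.size, A.size + B.size))
```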