Review Article

Deep Learning on Computational-Resource-Limited Platforms: A Survey

Table 1

Representative research works from the perspective of underlying principles.

Underlying issue | Representative research works | Techniques
Memory overhead induced by an oversized network | [39] | Weight matrix compression of a pretrained network through clustering: merging similar functions in the hypothesis space
  | [56] | Weight pruning of a pretrained network: removing the weights that contribute little to fitting functions in the hypothesis space
  | [39, 58] | Sparse training: lasso regularization, structured sparsity regularization
  | [68] | Computational optimization on digital computers: fine-grained utilization of memory
Time or energy overhead induced by backpropagation, memory operations, and hyperparameter tuning | [37, 39, 49] | Algorithmic design to avoid computational redundancy: depthwise separable convolution, avoidance of im2col reordering, factorized matrix-vector multiplication based on SVD and Tucker-2
  | [37] | Caching on digital computers: reuse of intermediate convolution results to avoid redundant computation
  | [39, 40] | Parallelization on digital processors: FPGA, GPGPU
  | [37, 40, 53] | Full utilization of digital processors: profiling and fine-tuning of CPU or GPGPU code
  | [59] | Avoidance of frequent memory operations through Boolean logic minimization
  | [41] | Hyperparameter tuning using a Gaussian process
Curse of dimensionality | [53] | SVD decomposition of the weight matrix
  | [60] | Data embedding
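
To make a few of the techniques in Table 1 concrete, the sketches that follow illustrate them in Python on toy data. The first corresponds to the weight-clustering compression listed for [39]: a minimal sketch, assuming the pretrained layer is available as a NumPy array and that scikit-learn is installed; the function name, cluster count, and layer shape are illustrative choices, not taken from the cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_compress(weights, n_clusters=16):
    """Compress a weight matrix by k-means clustering (minimal sketch).

    Every weight is replaced by the centroid of its cluster, so the layer can
    be stored as one small index per weight plus a tiny codebook of centroids.
    """
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
    codebook = km.cluster_centers_.ravel()        # shared weight values (centroids)
    indices = km.labels_.reshape(weights.shape)   # per-weight codebook index
    reconstructed = codebook[indices]             # dense weights rebuilt from the codebook
    return reconstructed, codebook, indices

# Toy usage: a hypothetical pretrained 256 x 128 dense layer.
W = np.random.randn(256, 128).astype(np.float32)
W_hat, codebook, idx = cluster_compress(W, n_clusters=16)
print("reconstruction error:", np.mean((W - W_hat) ** 2))
print("storage: %d indices + %d centroids" % (idx.size, codebook.size))
```

With 16 clusters each index needs only 4 bits, so the layer shrinks to a 4-bit index map plus a 16-entry float codebook instead of a full 32-bit float per weight.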
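The weight-pruning entry for [56] removes weights that contribute little to the fitted function. A common realization is magnitude-based pruning, sketched below; the global magnitude threshold and the 90% sparsity level are assumptions for illustration, and in practice the pruned network is usually fine-tuned afterwards.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights of a pretrained layer (sketch).

    Weights whose absolute value falls below the `sparsity` quantile are set to
    zero; the survivors and their positions can then be kept in a sparse format,
    shrinking the memory footprint of the layer.
    """
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

W = np.random.randn(512, 512).astype(np.float32)   # hypothetical pretrained layer
W_pruned, mask = magnitude_prune(W, sparsity=0.9)
print("kept %.1f%% of the weights" % (100.0 * mask.mean()))
```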
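Among the techniques for time and energy overhead, depthwise separable convolution replaces a single k x k convolution over every input-output channel pair with a per-channel k x k depthwise convolution followed by a 1 x 1 pointwise convolution. The sketch below only counts multiply-accumulate operations to show the saving; the feature-map size and channel counts are illustrative.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a standard k x k convolution on an h x w map."""
    return h * w * c_in * c_out * k * k

def separable_macs(h, w, c_in, c_out, k):
    """MACs of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 56 x 56 feature map, 128 -> 256 channels, 3 x 3 kernel.
std = conv_macs(56, 56, 128, 256, 3)
sep = separable_macs(56, 56, 128, 256, 3)
print("standard: %.1f M MACs, separable: %.1f M MACs, ratio: %.1fx"
      % (std / 1e6, sep / 1e6, std / sep))
```

For these dimensions the separable form needs roughly 8 to 9 times fewer multiply-accumulates, which is where the reported savings on resource-limited hardware come from.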
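Both the SVD-based factorized matrix-vector multiplication listed for [37, 39, 49] and the SVD decomposition of the weight matrix listed for [53] rest on the same low-rank idea: approximate a dense weight matrix by the product of two thin factors obtained from a truncated SVD. A minimal sketch follows, using a random matrix as a stand-in for a pretrained layer; the rank is an illustrative choice.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Factor a dense m x n weight matrix into two thin matrices (sketch).

    W is approximated by A @ B with A of shape (m, rank) and B of shape
    (rank, n), so the product W @ x costs about rank * (m + n) multiplies
    instead of m * n.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

W = np.random.randn(1024, 1024).astype(np.float32)  # hypothetical dense layer
A, B = low_rank_factorize(W, rank=64)
x = np.random.randn(1024).astype(np.float32)
print("relative error:", np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x))
print("parameters: %d -> %d" % (W.size, A.size + B.size))
```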