Research Article

Malware Detection on Byte Streams of PDF Files Using Convolutional Neural Networks

Table 2

Description of comparative classifiers and parameter settings.

Model: Description

Naïve Bayes (NB): (i) Probabilistic classifier based on Bayes’ theorem
(ii) Assumes independence between features

Decision tree (DT): (i) C4.5 classifier, using the J48 implementation
(ii) Confidence factor for pruning: 0.25

Support vector machine (SVM): (i) Non-probabilistic binary classification model that finds the decision boundary with the maximum margin between two classes
(ii) Kernel: polynomial
(iii) Exponent = 1.0, complexity c = 1.0
(iv) Trained via the sequential minimal optimization algorithm [11]

Random forest (RF): (i) Ensemble model that produces its final result by combining the results of multiple decision trees
(ii) #trees = 100
(iii) #features = log(#trees) + 1
(iv) No depth limit on individual trees
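
The SVM setting above pairs a polynomial kernel with an exponent of 1.0. Assuming the kernel has the plain form K(x, y) = (x · y)^p with no lower-order terms (the table does not state the exact form), an exponent of 1.0 reduces it to the ordinary dot product, so the decision boundary is linear in the input features. The minimal sketch below illustrates this; the function names and the feature vectors are hypothetical and are not taken from the paper.

```python
from math import isclose


def dot_product(x, y):
    """Plain dot product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, exponent=1.0):
    """Polynomial kernel K(x, y) = (x . y) ** exponent.

    Assumption: no lower-order terms, i.e. not (x . y + 1) ** exponent.
    A negative dot product with a non-integer exponent is not handled here.
    """
    return dot_product(x, y) ** exponent


# Hypothetical feature vectors, for illustration only.
x = [0.2, 0.0, 0.5]
y = [0.1, 0.3, 0.4]

# With exponent = 1.0 the kernel value equals the dot product.
linear_value = dot_product(x, y)
kernel_value = poly_kernel(x, y, exponent=1.0)
print(isclose(kernel_value, linear_value))
```

A larger exponent (for example, 2.0) would make the same function produce a non-linear boundary in the original feature space.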