Toward a General-Purpose Heterogeneous Ensemble for Pattern Classification
Table 2
UCI datasets and their features: number of attributes (#A), number of samples (#S), and number of classes (#C).
| Dataset | Acronym | #A | #S | #C | Brief description |
|---|---|---|---|---|---|
| BREAST | BR | 9 | 699 | 2 | For breast tumor diagnosis |
| HEART | HE | 13 | 303 | 2 | For detecting heart disease; the “goal” field refers to the presence of heart disease in the patient |
| PIMA | PI | 8 | 768 | 2 | For forecasting the onset of diabetes mellitus |
| Spam | SP | 57 | 4601 | 2 | For classifying E-mail as spam or nonspam |
| SONAR | SO | 60 | 208 | 2 | For discriminating between sonar signals bounced off a metal cylinder and those bounced off a rough cylindrical rock |
| IONOSPHERE | IO | 34 | 351 | 2 | For classifying radar returns from the ionosphere |
| Liver | LI | 7 | 345 | 2 | For classifying liver disorders that might arise from excessive alcohol consumption |
| Haberman | HA | 3 | 306 | 2 | A dataset that contains cases on the survival of patients who had undergone surgery for breast cancer |
| Vote | VO | 16 | 435 | 2 | For classifying Republican versus Democrat US representatives (this dataset includes votes for each member of the US House of Representatives on 16 key votes) |
| Australian | AU | 14 | 690 | 2 | For credit card applications |
| Transfusion | TR | 5 | 748 | 2 | This study adopted the donor database of Blood Transfusion Service Center; the aim is to predict whether a person donated blood in March, 2007 |
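For readers who want to use the dataset characteristics of Table 2 programmatically, the following minimal Python sketch encodes them as a lookup table. The names `Dataset`, `TABLE_2`, and `by_acronym` are hypothetical and introduced only for illustration; the numeric values are copied from Table 2, and nothing here is part of the original experimental code.

```python
from typing import Dict, NamedTuple


class Dataset(NamedTuple):
    """One row of Table 2 (hypothetical container, for illustration only)."""
    acronym: str
    n_attributes: int  # column #A
    n_samples: int     # column #S
    n_classes: int     # column #C


# Values copied verbatim from Table 2; keys are the dataset names used there.
TABLE_2: Dict[str, Dataset] = {
    "BREAST": Dataset("BR", 9, 699, 2),
    "HEART": Dataset("HE", 13, 303, 2),
    "PIMA": Dataset("PI", 8, 768, 2),
    "Spam": Dataset("SP", 57, 4601, 2),
    "SONAR": Dataset("SO", 60, 208, 2),
    "IONOSPHERE": Dataset("IO", 34, 351, 2),
    "Liver": Dataset("LI", 7, 345, 2),
    "Haberman": Dataset("HA", 3, 306, 2),
    "Vote": Dataset("VO", 16, 435, 2),
    "Australian": Dataset("AU", 14, 690, 2),
    "Transfusion": Dataset("TR", 5, 748, 2),
}


def by_acronym(acronym: str) -> Dataset:
    """Look up a dataset by its two-letter acronym (e.g. "SO" for SONAR)."""
    for row in TABLE_2.values():
        if row.acronym == acronym:
            return row
    raise KeyError(f"unknown acronym: {acronym}")


# Simple summaries derived only from the table itself.
smallest = min(TABLE_2, key=lambda name: TABLE_2[name].n_samples)
largest = max(TABLE_2, key=lambda name: TABLE_2[name].n_samples)
all_binary = all(row.n_classes == 2 for row in TABLE_2.values())

if __name__ == "__main__":
    print(f"{len(TABLE_2)} datasets; smallest: {smallest}; largest: {largest}")
    print(f"all two-class problems: {all_binary}")
```

A summary of this kind makes it easy to check, for instance, that every dataset in Table 2 is a two-class problem and that the sample counts range from 208 (SONAR) to 4601 (Spam).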