Prediction accuracy of top-ranked projections

We compared the prediction accuracy of the best-ranked projections found by VizRank with that of four standard machine learning methods: support vector machines (SVM, with a linear kernel), k-nearest neighbors (k-NN, with k set to the square root of the number of training instances), the naive Bayesian classifier, and decision trees (Quinlan's C4.5 implementation with default parameters). Predictive accuracy was assessed on six cancer gene expression data sets using bootstrap resampling repeated 100 times, as recommended by Braga-Neto and Dougherty (2003). The final performance scores were computed using the 0.632 bootstrap estimator, as suggested in the same reference. The resulting average classification accuracies and areas under the ROC curve (AUC), with their respective standard deviations, are shown in the following tables.
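As an illustration of the evaluation scheme described above, the following minimal Python sketch computes a 0.632 bootstrap estimate of classification accuracy. It is not the script used to produce the reported results. The nearest-centroid classifier and all function names are hypothetical stand-ins chosen to keep the example self-contained, and the out-of-bag score is simply averaged over the bootstrap rounds.

```python
import random


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def accuracy(model, X, y):
    hits = sum(predict(model, xi) == yi for xi, yi in zip(X, y))
    return hits / len(y)


def bootstrap_632_accuracy(X, y, n_rounds=100, seed=0):
    """0.632 bootstrap estimate: 0.368 * resubstitution + 0.632 * out-of-bag."""
    rng = random.Random(seed)
    n = len(y)
    # Resubstitution accuracy: train and test on the full data set.
    resub = accuracy(fit_centroids(X, y), X, y)
    oob_scores = []
    for _ in range(n_rounds):
        # Draw n instances with replacement; the rest are out-of-bag.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:  # extremely unlikely; skip rounds with no test instances
            continue
        model = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        oob_scores.append(accuracy(model, [X[i] for i in oob], [y[i] for i in oob]))
    oob_mean = sum(oob_scores) / len(oob_scores)
    return 0.368 * resub + 0.632 * oob_mean
```

The weights 0.632 and 0.368 reflect that a bootstrap sample contains, on average, about 63.2% of the distinct original instances. This blends the optimistic resubstitution score with the pessimistic out-of-bag score.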

Classification accuracy (%)

Data set     VizRank          SVM              k-NN             Naive Bayes      Decision trees
Leukemia     96.40 +- 4.33    97.57 +- 3.71    92.72 +- 6.74    84.34 +- 10.33   90.46 +- 5.52
DLBCL        93.03 +- 5.67    97.85 +- 3.26    88.60 +- 6.29    83.76 +- 8.64    85.46 +- 9.10
Prostate     94.00 +- 4.53    93.47 +- 4.60    84.51 +- 6.58    81.10 +- 9.00    85.47 +- 7.77
MLL          95.00 +- 5.12    97.32 +- 3.21    89.65 +- 6.37    75.20 +- 9.67    88.31 +- 9.16
SRBCT        96.39 +- 5.01    99.42 +- 2.35    86.29 +- 6.96    75.31 +- 10.58   87.32 +- 8.00
Lung cancer  92.72 +- 3.40    94.67 +- 3.16    90.35 +- 3.44    75.28 +- 5.18    91.21 +- 5.09
Ranks        1.83             1.17             3.50             5.00             3.50
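The Ranks row appears to be the mean, over the six data sets, of each method's rank within a data set (1 = highest score). The short sketch below recomputes it from the mean accuracies in the table above. The function name is hypothetical, and ties are not handled because none occur in these values.

```python
# Mean classification accuracies copied from the table above (percent).
METHODS = ["VizRank", "SVM", "k-NN", "Naive Bayes", "Decision trees"]
ACCURACY = {
    "Leukemia":    [96.40, 97.57, 92.72, 84.34, 90.46],
    "DLBCL":       [93.03, 97.85, 88.60, 83.76, 85.46],
    "Prostate":    [94.00, 93.47, 84.51, 81.10, 85.47],
    "MLL":         [95.00, 97.32, 89.65, 75.20, 88.31],
    "SRBCT":       [96.39, 99.42, 86.29, 75.31, 87.32],
    "Lung cancer": [92.72, 94.67, 90.35, 75.28, 91.21],
}


def average_ranks(scores, methods):
    """Rank methods within each data set (1 = best), then average the ranks."""
    totals = [0.0] * len(methods)
    for row in scores.values():
        # Column indices sorted from the highest score to the lowest.
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        for rank, j in enumerate(order, start=1):
            totals[j] += rank
    return {m: t / len(scores) for m, t in zip(methods, totals)}


if __name__ == "__main__":
    for method, rank in average_ranks(ACCURACY, METHODS).items():
        print(f"{method}: {rank:.2f}")
```

Rounded to two decimals, this gives 1.83, 1.17, 3.50, 5.00 and 3.50, matching the Ranks row. The same function can be applied to the AUC values.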

AUC (Area under ROC)

Data set     VizRank          SVM              k-NN             Naive Bayes      Decision trees
Leukemia     0.976 +- 0.040   0.997 +- 0.011   0.969 +- 0.049   0.819 +- 0.127   0.903 +- 0.069
DLBCL        0.946 +- 0.077   0.997 +- 0.010   0.925 +- 0.058   0.736 +- 0.095   0.818 +- 0.118
Prostate     0.961 +- 0.036   0.973 +- 0.026   0.912 +- 0.051   0.835 +- 0.091   0.870 +- 0.075
MLL          0.981 +- 0.030   0.998 +- 0.005   0.983 +- 0.020   0.860 +- 0.073   0.938 +- 0.059
SRBCT        0.989 +- 0.025   1.000 +- 0.001   0.978 +- 0.030   0.879 +- 0.076   0.942 +- 0.045
Lung cancer  0.969 +- 0.036   0.995 +- 0.007   0.974 +- 0.027   0.753 +- 0.054   0.935 +- 0.051
Ranks        2.33             1.00             2.67             5.00             4.00

To download the scripts and data sets used to obtain these results, click here.