Research

Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets

1 Department of Gynaecologic Oncology, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
2 Bioinformatics Laboratory, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
3 Clinical Proteomics Group, Department of Medical Biochemistry, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
4 Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, the Netherlands
Proteome Science 2009, 7:19 doi:10.1186/1477-5956-7-19
Additional files

Additional file 1: Variance analysis (ovarian cancer dataset). Boxplots of the coefficient of variation (CV, standard deviation/mean peak intensity). Left panel: CV for all combinations of pre-processing method (Ciphergen: cyan, Cromwell: red) and peak selection setting (A, B, C) for the CM10 chip. Right panel: the same for the Q10 chip. Format: PDF. Size: 25KB.

Additional file 2: Variance analysis (Gaucher dataset). Boxplots of the coefficient of variation (CV, standard deviation/mean peak intensity). CV for all combinations of pre-processing method (Ciphergen: cyan, Cromwell: red) and peak selection setting (A, B, C). Format: PDF. Size: 18KB.

Additional file 3: Cumulative plot of significance of detected peaks (ovarian cancer dataset, CM10). For each combination of pre-processing method and peak selection settings, the cumulative percentage of peaks with a p-value smaller than the value on the x-axis is shown. The p-value of a peak is based on a t-test between the normalized intensities of the cancer and the control group. Format: PDF. Size: 31KB.

Additional file 4: Cumulative plot of significance of detected peaks (ovarian cancer dataset, Q10). For each combination of pre-processing method and peak selection settings, the cumulative percentage of peaks with a p-value smaller than the value on the x-axis is shown. The p-value of a peak is based on a t-test between the normalized intensities of the cancer and the control group. Format: PDF. Size: 31KB.

Additional file 5: Cumulative plot of significance of detected peaks (Gaucher dataset). For each combination of pre-processing method and peak selection settings, the cumulative percentage of peaks with a p-value smaller than the value on the x-axis is shown. The p-value of a peak is based on a t-test between the normalized intensities of the Gaucher and the control group. Format: PDF. Size: 31KB.

Additional file 6: Comparison of classifiers and pre-processing methods (ovarian cancer dataset). Average classification accuracy on 1,000 test sets (size of training sets: 14, size of test sets: 14) for each specific combination of pre-processing method and peak selection settings. Format: XLS. Size: 24KB.

Additional file 7: Comparison of classifiers and pre-processing methods (Gaucher dataset). Average classification accuracy on 500 test sets (size of training sets: 27, size of test sets: 12) for each specific combination of pre-processing method and peak selection settings. Format: XLS. Size: 19KB.

Additional file 8: Comparison of classifiers and pre-processing methods (ovarian cancer dataset). For each classifier, every combination of chip type, pre-processing method and peak selection was ranked by its average classification accuracy on 1,000 test sets (size of training sets: 14, size of test sets: 14). The heatmap colour-codes the ranks from 1 (highest accuracy, red) to 18 (lowest accuracy, light yellow). Columns of the heatmap are ordered by their average rank over all classifiers, with Ciphergen pre-processing using setting C and the combined CM10/Q10 data getting the highest rank. Classifiers are ordered by their average rank over all pre-processing combinations, with DLDA being the best ranked classifier. Format: PDF. Size: 76KB.

Additional file 9: Comparison of classifiers and pre-processing methods (Gaucher dataset). For each classifier, every combination of pre-processing method and peak selection was ranked by its average classification accuracy on 500 test sets (size of training sets: 27, size of test sets: 12). The heatmap colour-codes the ranks from 1 (highest accuracy, red) to 6 (lowest accuracy, light yellow). Columns of the heatmap are ordered by their average rank over all classifiers, with Ciphergen pre-processing using setting A getting the highest rank. Classifiers are ordered by their average rank over all pre-processing combinations, with SVM being the best ranked classifier. Format: PDF. Size: 44KB.
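The two summaries used in the variance and significance files (the coefficient of variation per peak, and the cumulative percentage of peaks below a p-value threshold) can be illustrated with a minimal sketch. The function names, the example numbers, and the use of the sample standard deviation are assumptions made for illustration; the captions do not specify which standard deviation estimator was used, and the p-values here are made up rather than computed by a t-test.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """CV of one peak: standard deviation divided by mean peak intensity.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    return stdev(intensities) / mean(intensities)


def cumulative_percentage(p_values, thresholds):
    """Percentage of peaks with a p-value strictly smaller than each threshold."""
    n = len(p_values)
    return [100.0 * sum(p < t for p in p_values) / n for t in thresholds]


# Hypothetical intensities of one peak across three spectra.
peak_intensities = [10.0, 12.0, 14.0]
cv = coefficient_of_variation(peak_intensities)

# Hypothetical per-peak p-values (in practice, from a t-test between groups).
p_values = [0.001, 0.02, 0.04, 0.3]
curve = cumulative_percentage(p_values, [0.01, 0.05, 0.5])
```

Plotting `curve` against the thresholds would give one line of a cumulative significance plot for a single combination of pre-processing method and peak selection setting.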













