Algorithm for Finding Optimal Gene Sets in Microarray Prediction

Biological Physics 2007-05-23 v1 Computational Physics Medical Physics q-bio

Abstract

Motivation: Microarray data has recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed to make such a diagnosis, both for clinical use and to determine the importance of specific genes for cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, each using a different combination of genes, to generate a set of optimal predictors. Results: We apply this method to the leukemia data of the Whitehead/MIT group that attempts to differentially diagnose two kinds of leukemia, and also to the data of Khan et al. to distinguish four different kinds of childhood cancers. In the latter case we were able to reduce the number of genes needed from 96 down to 15, while still perfectly classifying all of their test data. Availability: http://stravinsky.ucsc.edu/josh/gesses/ Contact: [email protected]
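
The idea of evolving an ensemble of predictors over different gene combinations can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-centroid classifier, the leave-one-out scoring, the mutation rule (drop or swap one gene), the synthetic data, and every function and parameter name below are assumptions chosen only to show the general replicate-and-select loop.

```python
import random


def loo_errors(samples, labels, genes):
    """Count leave-one-out errors of a nearest-centroid classifier on `genes`."""
    errors = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        sums, counts = {}, {}
        for j, (xj, yj) in enumerate(zip(samples, labels)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(yj, [0.0] * len(genes))
            for k, g in enumerate(genes):
                acc[k] += xj[g]
            counts[yj] = counts.get(yj, 0) + 1

        def dist(c):
            # squared distance from x to the centroid of class c
            return sum((x[g] - sums[c][k] / counts[c]) ** 2
                       for k, g in enumerate(genes))

        if min(sums, key=dist) != y:
            errors += 1
    return errors


def score(samples, labels, genes):
    """Lower is better: fewest errors first, then fewest genes (assumed ranking)."""
    return (loo_errors(samples, labels, genes), len(genes))


def mutate(genes, n_genes, rng):
    """Hypothetical mutation: drop one gene, or swap one for an unused gene."""
    genes = list(genes)
    if len(genes) > 1 and rng.random() < 0.5:
        genes.pop(rng.randrange(len(genes)))
    else:
        unused = [g for g in range(n_genes) if g not in genes]
        if unused:
            genes[rng.randrange(len(genes))] = rng.choice(unused)
    return tuple(sorted(genes))


def evolve(samples, labels, start_size=5, ensemble_size=12,
           generations=30, seed=0):
    """Evolve an ensemble of gene subsets; return the best one found."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    ensemble = [tuple(sorted(rng.sample(range(n_genes), start_size)))
                for _ in range(ensemble_size)]
    for _ in range(generations):
        # candidates: current members plus one mutated copy of each
        pool = ensemble + [mutate(g, n_genes, rng) for g in ensemble]
        pool.sort(key=lambda g: score(samples, labels, g))
        survivors = pool[:ensemble_size // 2]
        ensemble = survivors + survivors  # replicate the better half
    return min(ensemble, key=lambda g: score(samples, labels, g))


def make_toy_data(n_samples=16, n_genes=10, seed=1):
    """Synthetic two-class data; only genes 0 and 1 carry class signal."""
    rng = random.Random(seed)
    samples, labels = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] += 6.0 * label
        row[1] -= 6.0 * label
        samples.append(row)
        labels.append(label)
    return samples, labels
```

Calling `evolve(*make_toy_data())` returns a tuple of gene indices. Because the unmutated members stay in the candidate pool each generation, the best score in the ensemble never gets worse, and because mutation only drops or swaps genes, subsets can shrink but never grow.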

Cite

@article{arxiv.physics/0108011,
  title  = {Algorithm for Finding Optimal Gene Sets in Microarray Prediction},
  author = {J. M. Deutsch},
  journal= {arXiv preprint arXiv:physics/0108011},
  year   = {2007}
}

Comments

21 pages, 8 figures, LaTeX