The
purpose of this course will be to familiarize students with algorithms and
appropriate data structures for computational molecular biology. Potential
topics include algorithms for computing phylogenies. Candidate phylogenetic
algorithms include methods for calculating a parsimony criterion (Swofford et
al. 1996), Hadamard Conjugation (Hendy and Penny 1993), and Felsensteins' likelihood method
(1981) as well as distance based algorithms such as the least squares methods of
Bryant and Waddell (1998).
Another
area of interest is the area of algorithms for analyzing microarray data.
Potential algorithms for this topic include k-means clustering,
unsupervised hierarchical clustering (Eisen et al. 1998), multidimensional
scaling (Khan et al. 1998), graphical modeling including the use of APCR
methods (Waddell and Kishino 2000), correspondence analysis (Kishino and
Waddell 2000, Fellenberg et al. 2001) fuzzy self organizing feature maps
(Huntsberger and Ajjimarangsee 1990) as well as other neural (Gruvberger et al.
2001) and Bayesian networks (Friedman et al. 2000).
The target audience for this
course consists of students that either have an understanding of molecular
biology and are interested in the algorithms and techniques used in
bioinformatics data analysis or students with a background in computational
science that are interested in investigating another area of computational
science.
Each student will be
expected to make a class presentation in which they describe a particular
algorithm. Students will also use the algorithms covered in class to perform
data analysis.
Swofford, D.L., G.J. Olsen, P.J.
Waddell, and D.M. Hillis (1996). Phylogenetic Inference. In:
"Molecular Systematics, second edition" (ed. D. M. Hillis and C.
Moritz), pp 450-572. Sinauer Assoc, Sunderland, Mass.
Hendy, M.D., and D. Penny, Spectral analysis of phylogenetic
data, Journal of Classification, v.10, 1993, 5-24.
Felsentein, J., Evolutionary trees from DNA sequences: a
maximum likelihood approach, J. Mol. Evol., v17, 1981, 368-76.
Bryant, D., and P.J. Waddell. (1998). Rapid
evaluation of least squares and minimum evolution criteria on phylogenetic
trees. Molecular Biology and Evolution 15: 1346-1359. (see also Dept. of
Mathematics and Statistics Research Report Series, No. 167, University of
Canterbury, for an alternative development of the main results).
Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression
patterns. Proc Natl Acad Sci U S A. 1998 Dec 8;95(25):14863-8.
Khan, J., Simon, R., Bittner, M., Chen, Y., Leighton, S.B.,
Pohida, T., Smith, P.D., Jiang, Y., Gooden, G.C., Trent, J.M., and P.S.
Meltzer. Gene expression profiling of alveolar rhabdomyosarcoma with cDNA
microarrays. Cancer Res., 58: 5009-5013, 1998.
Waddell , P.J., and H. Kishino.
(2000). Cluster Inference Methods and Graphical Models evaluated on
NCI60 Microarray Gene Expression Data. Genome Informatics Series 11:
129-141.
Kishino, H., and P.J. Waddell.
(2000). Correspondence Analysis of Genes and Tissue
Types and Finding Genetic Links from Microarray Data. Genome
Informatics Series 11: 83-96.
Fellenberg K, Hauser NC, Brors B, Neutzner A,
Hoheisel JD, Vingron M. Correspondence analysis applied to microarray data.
Proc Natl Acad Sci U S A. 2001 Sep 11;98(19):10781-6.
Huntsberger, T.L., and P. Ajjimarangsee. Parallel
self-organizing feature maps for unsupervised pattern recognition.
International Journal of General Systems 16 (1990), pp 357-372.
Gruvberger,
S., Ringner, M., Chen, Y., Panavally, S., Saal, L.H., Borg, A., and M. Ferno,
"Estrogen Receptor Status in Breast Cancer Is Associated with Remarkably
Distinct Gene Expression Patterns", Cancer Research 61, pp 5979-5984, 2001.
Friedman N,
Linial M, Nachman I, Pe'er D. Using Bayesian networks to analyze expression
data. J Comput Biol. 2000;7(3-4):601-20.