Classification, clustering, and phylogeny estimation
Wed, 2 May 2001 12:29:18 -0500
We have the following problem in scoring multiply aligned proteins and
generating a similarity matrix.
For amino acids i = 1,2,...,20, a substitution of amino acid i -> j results in a
similarity s_ij. Note that s_ii >> 0 for most amino acids (i.e., amino acid i
has a similarity to itself of s_ii >> 0).
We can generate a similarity 'score' matrix for a set of N proteins, but its
diagonal entries are >> 1. We would like to scale this matrix so that each
diagonal entry is 1 and each off-diagonal element satisfies 0 <= s_ij <= 1.
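To make the goal concrete, here is a minimal sketch of one possible scaling:
divide each entry by sqrt(s_ii * s_jj), which forces every diagonal entry to 1.
This is only an illustration, not necessarily the right answer. The function
name scale_similarity and the toy numbers are hypothetical, and clipping the
result to [0, 1] is an assumption added because raw scores can be negative and
the ratio is not guaranteed to stay below 1.

```python
import math

def scale_similarity(raw):
    """Scale a symmetric raw score matrix so that every diagonal entry is 1.

    Assumption: each entry is divided by sqrt(s_ii * s_jj), then clipped
    to the interval [0, 1]. This is one candidate scaling among several.
    """
    n = len(raw)
    diag = [raw[i][i] for i in range(n)]
    if any(d <= 0 for d in diag):
        raise ValueError("every self-similarity s_ii must be positive")
    scaled = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            value = raw[i][j] / math.sqrt(diag[i] * diag[j])
            # Clip: negative raw scores map to 0, values above 1 are capped.
            scaled[i][j] = min(1.0, max(0.0, value))
    return scaled

# Toy raw scores for N = 3 proteins (made-up numbers, not alignment output).
raw_scores = [
    [100.0, 40.0, -5.0],
    [40.0, 64.0, 16.0],
    [-5.0, 16.0, 25.0],
]
scaled_scores = scale_similarity(raw_scores)
```

Clipping discards information about how dissimilar two proteins are, so a
shift-and-rescale of the off-diagonal entries may be preferable if negative
scores should remain distinguishable.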
Thanks for any suggestions.
PS -- for protein alignment experts: we are using the BLOSUM62 score matrix and
working with proteins that have already been multiply aligned.
William D. Shannon, Ph.D.
Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics
Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO 63110
web page: http://ilya.wustl.edu/~shannon