CLASS-L Archives

May 2001

CLASS-L@LISTS.SUNYSB.EDU

Subject:
From:
Uta Bohnebeck <[log in to unmask]>
Reply To:
Classification, clustering, and phylogeny estimation
Date:
Thu, 3 May 2001 08:35:19 +0200
Hi,
in the original BLOSUM62 score matrix the identities of the
amino acids are weighted differently (I suppose this comes from their
different distribution in nature). In the problem you described,
you want to scale all identities to 1, and you will lose this
information, right?
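
To make the loss concrete, here is a minimal Python sketch. It assumes
one possible scaling, dividing each entry s_ij by sqrt(s_ii * s_jj),
which is only a guess at the transformation being asked about and not
necessarily the intended one. The input is the 2x2 similarity matrix
for ARN and ADC from the quoted message below; the function name is
made up for illustration.

```python
import math

# Similarity matrix for the sequences ARN and ADC, taken from the
# quoted message (self-scores 15 and 19, cross-score -1).
S = [[15, -1],
     [-1, 19]]


def scale_identities_to_one(s):
    """Divide s[i][j] by sqrt(s[i][i] * s[j][j]).

    Assumed scaling, for illustration only. It requires every
    diagonal entry to be positive.
    """
    n = len(s)
    return [
        [s[i][j] / math.sqrt(s[i][i] * s[j][j]) for j in range(n)]
        for i in range(n)
    ]


scaled = scale_identities_to_one(S)
for row in scaled:
    print(row)
```

After this scaling both diagonal entries are 1, so the difference
between the self-scores 15 and 19 can no longer be read off the matrix.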

What are the priorities in this transformation? Do you want to
get metric properties? Or do you want to preserve the different
distribution of the amino acids?

Can you give us a few words about why you want to transform
this matrix and how you want to use it?

Uta Bohnebeck

***************************************************************
Uta Bohnebeck                 Tel:      +49-421-218-7838/ -7090
Universität Bremen            Fax:      +49-421-218-7196
TZI,  IS / AG-KI              [log in to unmask]
Universitätsallee 21-23
Postfach 330 440
D-28334 Bremen
---------------------------------------------------------------
http://www.informatik.uni-bremen.de/~bohnebec/home.html
---------------------------------------------------------------


-----Original Message-----
From: Classification, clustering, and phylogeny estimation
[mailto:[log in to unmask]] On Behalf Of William Shannon
Sent: Wednesday, 2 May 2001 19:40
To: [log in to unmask]
Subject: More protein stuff


As a follow-up to my previous email, the first 5 rows and columns of the
score matrix are:

> blosum62[1:5,1:5]
   A  R  N  D  C
A  4 -1 -2 -2  0
R -1  5  0 -2 -3
N -2  0  6  1 -3
D -2 -2  1  6 -3
C  0 -3 -3 -3  9


Comparing sequences ARN to ADC gives the similarity score:

        (s_AA = 4) + (s_RD = -2) + (s_NC = -3) = -1

and ARN to itself

        (s_AA = 4) + (s_RR = 5) + (s_NN = 6) = 15

and ADC to itself

        (s_AA = 4) + (s_DD = 6) + (s_CC = 9) = 19

so the similarity matrix is

        15  -1
        -1  19
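
The arithmetic above can be reproduced with a short Python sketch
(Python rather than the R/S session shown earlier). It assumes
ungapped sequences of equal length that are compared position by
position, and it contains only the 5x5 corner of BLOSUM62 printed
above. The function and variable names are made up for illustration.

```python
# 5x5 corner of BLOSUM62, copied from the table in this message.
LETTERS = "ARNDC"
ROWS = [
    [4, -1, -2, -2, 0],
    [-1, 5, 0, -2, -3],
    [-2, 0, 6, 1, -3],
    [-2, -2, 1, 6, -3],
    [0, -3, -3, -3, 9],
]
# Look-up table from a pair of residues to its score.
SCORES = {
    (a, b): ROWS[i][j]
    for i, a in enumerate(LETTERS)
    for j, b in enumerate(LETTERS)
}


def pair_score(x, y):
    """Sum the per-position scores of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(SCORES[(a, b)] for a, b in zip(x, y))


def similarity_matrix(seqs):
    """Score every sequence against every sequence, itself included."""
    return [[pair_score(x, y) for y in seqs] for x in seqs]


print(similarity_matrix(["ARN", "ADC"]))  # [[15, -1], [-1, 19]]
```

Sequences containing residues outside A, R, N, D, C would need the
full 20x20 matrix.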


--

William D. Shannon, Ph.D.

Assistant Professor of Biostatistics in Medicine
Division of General Medical Sciences and Biostatistics

Washington University School of Medicine
Campus Box 8005, 660 S. Euclid
St. Louis, MO   63110

Phone: 314-454-8356
Fax: 314-454-5113
e-mail: [log in to unmask]
web page: http://ilya.wustl.edu/~shannon
