For those out there who know stochastic processes better than I do (and that is probably everyone): does the following problem have an analogue in the stochastic process literature? If it does, that will save me from having to invent something. The setup is:

 

1.       Suppose we have N dart boards (virus genomes) lined up

2.       We throw a dart (random virus sequence) at a specific dart board

3.       The dart hits the targeted board, but a single throw can also hit an arbitrary number of the other, non-target boards (conserved sequences)

4.       As we throw darts, we may target any number of different boards (first dart targeted at board 10, second dart at board 3, third dart at board 10, etc.)

5.       For any set of darts, we expect only a small number of boards to be targeted, each with a different frequency

6.       We expect (roughly) the non-targeted boards to be hit randomly
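
To make the six assumptions above concrete, here is a minimal simulation sketch. Everything named in it is an illustrative assumption rather than part of the problem statement: the function throw_darts, the dictionary of targeting frequencies target_weights, and in particular p_background, which treats "hit randomly" as each non-target board being hit independently with one fixed probability.

```python
import random

def throw_darts(n_boards, target_weights, p_background, n_darts, seed=0):
    """Simulate dart throws; return per-board hit counts and the targets chosen.

    target_weights: dict mapping board index -> relative targeting frequency
                    (hypothetical; only a few boards are ever targeted).
    p_background:   assumed chance that a dart also hits any given
                    non-target board (independent across boards).
    """
    rng = random.Random(seed)
    boards = list(target_weights)
    weights = [target_weights[b] for b in boards]
    hits = [0] * n_boards
    targets = []
    for _ in range(n_darts):
        # Pick which board this dart is aimed at (items 4 and 5).
        target = rng.choices(boards, weights=weights, k=1)[0]
        targets.append(target)
        hits[target] += 1  # the targeted board is always hit (item 3)
        # The same dart may also hit other boards at random (items 3 and 6).
        for b in range(n_boards):
            if b != target and rng.random() < p_background:
                hits[b] += 1
    return hits, targets

# Example: 20 boards, boards 10 and 3 targeted with different frequencies.
hits, targets = throw_darts(
    n_boards=20, target_weights={10: 0.7, 3: 0.3},
    p_background=0.05, n_darts=200,
)
```

Only the hit counts would be observed in practice; the list of targets is returned so a decision rule can be checked against the truth in simulation.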

 

I want a formal mathematical model for deciding which boards (virus genomes) have been targeted, and for determining how many darts (random genome sequences) need to be thrown to make that decision.
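
One possible starting point, offered only as a sketch and not as the formal model being asked for, is to test each board's hit count against a background-only binomial model. This assumes the background hit probability p_background is known (in practice it would have to be estimated) and that hits on different boards are independent; the function names and the Bonferroni correction are illustrative choices.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def flag_targeted(hits, n_darts, p_background, alpha=0.05):
    """Return boards whose hit count is improbably high if they were
    never targeted and were only hit at the assumed background rate."""
    threshold = alpha / len(hits)  # Bonferroni correction across boards
    return [b for b, k in enumerate(hits)
            if binom_upper_tail(k, n_darts, p_background) < threshold]

# Hypothetical counts after 100 darts: most boards near the background
# rate of 5 hits, boards 3 and 10 far above it.
example_hits = [5] * 20
example_hits[10] = 80
example_hits[3] = 40
flagged = flag_targeted(example_hits, n_darts=100, p_background=0.05)
```

Under this rule, the number of darts needed could be explored by simulation: increase the number of throws until the least frequently targeted board is flagged with the desired probability.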

 

Thank you

 

Bill Shannon, PhD

Associate Prof. of Biostatistics in Medicine

Washington University School of Medicine

660 South Euclid Ave, Box 8005

St. Louis, MO 63110

 


 

---------------------------------------------- CLASS-L list. Instructions: http://www.classification-society.org/csna/lists.html#class-l