Define zero- and first-order Markov models for seqeuence1.txt


Question:

a) Define zero- and first-order Markov models for seqeuence1.txt. The sequence is the Mycobacterium tuberculosis gene mtb48.

Hints:
- A zero-order Markov model is defined by P(i), where i ∈ {A, T, G, C}.
For this you need only the count of each nucleotide and the total number of nucleotides.
A zero-order Markov model for a DNA sequence has four parameters.
- A first-order Markov model is defined by P(i|j), where i, j ∈ {A, T, G, C}. For example, P(A|T) is the probability of observing A immediately after T in the DNA sequence.
For this you'll need the number of occurrences of each dinucleotide and, for each nucleotide j, the number of dinucleotides that begin with j. For example, P(A|T) = count(TA) / count(T followed by any nucleotide).
A first-order Markov model for a DNA sequence has sixteen parameters.
- To implement this, the easiest route is a small R script that uses the alphabetFrequency() and dinucleotideFrequency() functions of the Biostrings package. You can also use Perl or any other programming language of your choice. As a last resort (you have exhausted all other options and are hopelessly behind schedule), you can count occurrences with Excel's SUBSTITUTE function or Microsoft Word's find/replace.
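
The counting described in the hints can be sketched in Python as an alternative to the R route. This is a minimal sketch, not a reference solution: the function names, the `load_sequence` helper, and the assumption that the file is plain text or FASTA are all illustrative, and the toy string stands in for the real gene sequence. Each row of the first-order model is normalized by the number of dinucleotides that start with j, so that the four values P(A|j), P(C|j), P(G|j), P(T|j) sum to 1.

```python
from collections import Counter

ALPHABET = "ACGT"


def load_sequence(path):
    """Hypothetical helper: read a sequence file, skipping FASTA header lines.

    Assumes the file is plain text or FASTA; adjust if the format differs.
    """
    with open(path) as handle:
        return "".join(
            line.strip() for line in handle if not line.startswith(">")
        ).upper()


def zero_order_model(seq):
    """Estimate P(i) as count(i) / total number of A, C, G, T characters."""
    counts = Counter(base for base in seq.upper() if base in ALPHABET)
    total = sum(counts.values())
    return {base: counts[base] / total for base in ALPHABET}


def first_order_model(seq):
    """Estimate P(i|j), stored as model[j][i]: probability of i right after j."""
    seq = seq.upper()
    # Count overlapping dinucleotides (j, i), ignoring pairs with other symbols.
    pair_counts = Counter(
        (prev, nxt)
        for prev, nxt in zip(seq, seq[1:])
        if prev in ALPHABET and nxt in ALPHABET
    )
    model = {}
    for prev in ALPHABET:
        # Normalize by the number of dinucleotides that begin with `prev`.
        row_total = sum(pair_counts[(prev, nxt)] for nxt in ALPHABET)
        model[prev] = {
            nxt: (pair_counts[(prev, nxt)] / row_total if row_total else 0.0)
            for nxt in ALPHABET
        }
    return model


# Toy input for illustration only; this is NOT the mtb48 sequence.
toy = "ATGCGATATTGC"
p0 = zero_order_model(toy)   # four parameters
p1 = first_order_model(toy)  # sixteen parameters
```

For the actual exercise, `zero_order_model(load_sequence("seqeuence1.txt"))` would replace the toy string, assuming the file sits in the working directory.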

b) Using the models you derived in (a), determine the probability of the DNA fragment AGTAGCTTCCAG.
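
One way to set up this calculation is sketched below in Python. Under a zero-order model the fragment probability is the product of P(i) over its nucleotides; under a first-order model it is the probability of the first nucleotide times the product of P(i|j) over each adjacent pair. Taking the first nucleotide's probability from the zero-order model is an assumption of this sketch, since the question does not specify an initial distribution. The function names are illustrative, and the uniform models below are placeholders, not values estimated from the gene.

```python
import math


def prob_zero_order(fragment, p0):
    """Product of P(i) over every nucleotide in the fragment."""
    return math.prod(p0[base] for base in fragment)


def prob_first_order(fragment, p0, p1):
    """P(first nucleotide) times P(i|j) for each adjacent pair (j, i).

    Assumption: the first nucleotide uses the zero-order probability.
    p1 is indexed as p1[j][i], the probability of i right after j.
    """
    prob = p0[fragment[0]]
    for prev, nxt in zip(fragment, fragment[1:]):
        prob *= p1[prev][nxt]
    return prob


# Placeholder models: every probability set to 0.25. Replace these with
# the parameters estimated from the sequence in part (a).
uniform_p0 = {base: 0.25 for base in "ACGT"}
uniform_p1 = {prev: dict(uniform_p0) for prev in "ACGT"}

fragment = "AGTAGCTTCCAG"
zero_order_prob = prob_zero_order(fragment, uniform_p0)
first_order_prob = prob_first_order(fragment, uniform_p0, uniform_p1)
```

With the uniform placeholders both functions return 0.25 raised to the 12th power, which is a quick sanity check before substituting the estimated parameters. For much longer fragments, summing log-probabilities would avoid floating-point underflow.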
