Develop a conditional random field model that assesses protein functionality


Homework

1. Project Aim:

• Develop a conditional random field (CRF) model that can assess protein functionality using a protein family.
• The protein family acts as a database against which new protein sequences are scored for functionality.

2. What are Graphical CRFs?

• More expressive than HMMs because their feature functions can depend on arbitrary properties of the observation sequence.

• Undirected graphical model.

• Has a single exponential model for the joint probability of the entire sequence of labels given the observation sequence.

• Linear-chain CRFs, like HMMs, only impose dependencies between a label and the previous label, whereas general CRFs can impose dependencies between arbitrary elements.
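
The single exponential model described above can be made concrete on a toy example. The following is a minimal sketch, assuming a linear-chain CRF with two labels and hand-picked weights; the names LABELS, emission, transition, score, and conditional_probability are illustrative assumptions and do not come from any library listed in this document. It computes p(y | x) = exp(score(x, y)) / Z(x), where Z(x) is found by enumerating every label sequence, which is only feasible for very short sequences.

```python
import itertools
import math

# Hypothetical toy setup: label each residue as helix ("H") or coil ("C").
LABELS = ("H", "C")

# Hand-picked weights, purely for illustration (not trained values).
emission = {("A", "H"): 1.0, ("A", "C"): 0.0,
            ("G", "H"): -0.5, ("G", "C"): 0.8}
transition = {("H", "H"): 0.6, ("H", "C"): -0.2,
              ("C", "H"): -0.2, ("C", "C"): 0.4}


def score(x, y):
    """Sum of weighted feature functions for observations x and labels y."""
    total = sum(emission[(obs, lab)] for obs, lab in zip(x, y))
    total += sum(transition[(prev, cur)] for prev, cur in zip(y, y[1:]))
    return total


def conditional_probability(x, y):
    """p(y | x), with the partition function Z(x) computed by enumeration."""
    z = sum(math.exp(score(x, cand))
            for cand in itertools.product(LABELS, repeat=len(x)))
    return math.exp(score(x, y)) / z


x = "AAG"
# Probability of every possible labeling of x; these sum to 1.
probs = {cand: conditional_probability(x, cand)
         for cand in itertools.product(LABELS, repeat=len(x))}
best = max(probs, key=probs.get)
```

Because the score is a sum over positions and adjacent label pairs, the same model can be normalized efficiently with dynamic programming instead of enumeration; the brute-force version is used here only because it mirrors the definition directly.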

3. Applications of CRFs

• Natural language processing
• Part-of-speech tagging
• Named entity recognition
• Sequence prediction
• Gene prediction

4. CRF options

• RNNSharp: CRFs based on recurrent neural networks
• CRF-ADF: Linear-chain CRFs with fast online ADF training
• CRFSharp: Linear-chain CRFs
• GCO: CRF with submodular energy functions
• DGM: General CRFs
• HCRF library: Hidden-state CRFs
• PyStruct: Structured Learning and prediction library in Python

5. Advantages

• Design is flexible

o No strict independence assumptions on the observations, unlike HMMs

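• Avoids the label bias problem of MEMMs

o Normalizes globally: computes the conditional probability of the whole output label sequence rather than normalizing each state locally

• Computes the joint probability distribution over the entire label sequence, conditioned on the observation sequence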
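
Global normalization is what separates a CRF from an MEMM: one partition function Z(x) normalizes over whole label sequences instead of each transition being normalized on its own. The following is a minimal sketch, assuming toy weight tables with hypothetical names and values, of computing log Z(x) with the forward algorithm in time proportional to T * K^2 (sequence length T, K labels). A brute-force enumeration is included only as a cross-check.

```python
import itertools
import math

LABELS = ("H", "C")  # hypothetical labels: helix / coil

# Illustrative weights, not trained values.
emission = {("A", "H"): 1.0, ("A", "C"): 0.0,
            ("G", "H"): -0.5, ("G", "C"): 0.8}
transition = {("H", "H"): 0.6, ("H", "C"): -0.2,
              ("C", "H"): -0.2, ("C", "C"): 0.4}


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(x):
    """Forward algorithm: log Z(x) summed over all label sequences."""
    # alpha[lab] = log of the summed exp-scores of all prefixes ending in lab
    alpha = {lab: emission[(x[0], lab)] for lab in LABELS}
    for obs in x[1:]:
        alpha = {
            cur: emission[(obs, cur)] + logsumexp(
                [alpha[prev] + transition[(prev, cur)] for prev in LABELS])
            for cur in LABELS
        }
    return logsumexp(list(alpha.values()))


def log_partition_brute_force(x):
    """Reference value by enumerating every label sequence (exponential cost)."""
    scores = []
    for y in itertools.product(LABELS, repeat=len(x)):
        s = sum(emission[(o, l)] for o, l in zip(x, y))
        s += sum(transition[(p, c)] for p, c in zip(y, y[1:]))
        scores.append(s)
    return logsumexp(scores)


x = "AGGA"
log_z = log_partition(x)
```

The log-probability of any particular labeling is then its score minus log_z, so every labeling competes against all others for the same observation sequence.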

6. Disadvantages

• Training is computationally expensive
• Difficult to re-train the model when newer data becomes available

Format your homework according to the following formatting requirements:

o The answer should be typed, using Times New Roman font (size 12), double spaced, with one-inch margins on all sides.

o The response should also include a cover page containing the title of the homework, the student's name, the course title, and the date. The cover page is not included in the required page length.

o Also include a reference page. Citations and references must follow APA format. The reference page is not included in the required page length.
