Distribution of gene ontology terms


Assignment:

You wish to understand the use and distribution of Gene Ontology (GO) terms in different subsections of UniProt. UniProt stores this data with each of its entries, and each entry may have many GO terms. You therefore wish to reformat this data into XML for use with downstream analytical tools.

You should, therefore, download all of UniProt in a suitable format (you may choose which of its standard download formats to use). Extract from this only the records associated with a single model organism. Reorganise the data so that each GO term is associated with the records in which it occurs, and save the result as an XML file. This process must be fully automated and run without manual intervention, so that it can be rerun as necessary.
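
As an illustration of the reorganisation step only, the following is a minimal sketch and not a model answer. It assumes the UniProt flat-file text format was chosen (two-letter line codes such as AC, OX and DR, with entries terminated by "//") and that the organism is selected by NCBI taxonomy ID; 9606 (human) is used purely as an example. The sample entries and their accessions are made up, the function names and the XML element names (go_terms, go_term, record) are hypothetical choices, and the download step is omitted.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample entries in the assumed flat-file layout (not real UniProt data).
SAMPLE = """\
ID   EXAMPLE1_HUMAN
AC   X00001; X00009;
OX   NCBI_TaxID=9606;
DR   GO; GO:0005737; C:cytoplasm; IDA:UniProtKB.
DR   GO; GO:0005634; C:nucleus; IEA:UniProtKB-KW.
//
ID   EXAMPLE2_HUMAN
AC   X00002;
OX   NCBI_TaxID=9606;
DR   GO; GO:0005737; C:cytoplasm; IEA:UniProtKB-KW.
//
ID   EXAMPLE3_MOUSE
AC   X00003;
OX   NCBI_TaxID=10090;
DR   GO; GO:0005737; C:cytoplasm; IEA:UniProtKB-KW.
//
"""


def iter_entries(lines):
    """Yield one list of lines per entry; an entry ends at a '//' line."""
    entry = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("//"):
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)


def parse_entry(entry_lines):
    """Return (primary accession, taxon id, list of GO ids) for one entry."""
    accession, taxon_id, go_ids = None, None, []
    for line in entry_lines:
        if line.startswith("AC   ") and accession is None:
            # The first accession on the first AC line is the primary one.
            accession = line[5:].split(";")[0].strip()
        elif line.startswith("OX   "):
            match = re.search(r"NCBI_TaxID=(\d+)", line)
            if match:
                taxon_id = match.group(1)
        elif line.startswith("DR   GO;"):
            # The second ';'-separated field of a GO cross-reference is the GO id.
            go_ids.append(line.split(";")[1].strip())
    return accession, taxon_id, go_ids


def group_by_go_term(lines, taxon_id):
    """Map each GO id to the set of accessions (for one organism) that carry it."""
    go_to_accessions = defaultdict(set)
    for entry in iter_entries(lines):
        accession, entry_taxon, go_ids = parse_entry(entry)
        if accession and entry_taxon == taxon_id:
            for go_id in go_ids:
                go_to_accessions[go_id].add(accession)
    return go_to_accessions


def build_xml(go_to_accessions):
    """Build an XML tree with one element per GO term, listing its records."""
    root = ET.Element("go_terms")  # element names are hypothetical
    for go_id in sorted(go_to_accessions):
        term = ET.SubElement(root, "go_term", id=go_id)
        for accession in sorted(go_to_accessions[go_id]):
            ET.SubElement(term, "record", accession=accession)
    return root


grouped = group_by_go_term(SAMPLE.splitlines(), taxon_id="9606")
xml_text = ET.tostring(build_xml(grouped), encoding="unicode")
print(xml_text)
```

For a real run, the lines would be read from the downloaded file rather than from SAMPLE, and the tree could be saved with ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True). Reading the file one entry at a time keeps memory use low, although the GO-term mapping itself still has to fit in memory.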

Write a short report describing your methodology and the rationale behind it. You should submit this report and any associated scripts as your coursework.

Your report should be a maximum of 1000 words in length. You will score well by having:

  • working code
  • use of libraries
  • clear variable and function names
  • clear documentation
