Please answer the following three sets of questions on Sequence Z:

Metadata

The Gene Ontology (GO) is a very widely used resource in the bioinformatics community for annotating genes and their products. Websites that serve genome databases, such as TAIR, use GO terms to annotate genes and other biological entities, enriching the stored data with semantic metadata.

1. BLAST Sequence Z into TAIR - from which gene does this sequence derive?

2. What are the Molecular Function terms that this gene is thought to have?

3. What are the GO IDs for these terms?

4. How do you think that biologists can benefit from the annotation of biological data with metadata in an ontology such as GO?

5. How could a bioinformatician exploit metadata such as GO terms programmatically?

Perl scripting

BLAST Sequence Z into EMBL-Bank and retrieve the flat file (Text view) output of the record. Then write a Perl script to read in the flat file and to write out the following fields:

1. The Accession Number for this record

2. The Description of the entry

3. Any Database Cross-references to InterPro records (i.e. InterPro Accession Numbers)

4. The Protein ID in the Feature Table

5. The length (in base pairs) of the nucleotide sequence

Please append your code to your coursework script.
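
The five fields listed above map onto the two-letter line codes of the EMBL flat file. As a language-neutral illustration (the coursework itself asks for Perl), here is a minimal Python sketch. It assumes the accession is on the AC line, the description on one or more DE lines, and the sequence length on the SQ line. It also assumes the InterPro cross-references and the protein ID appear either on DR lines or as /db_xref and /protein_id qualifiers in the FT feature table. The function name parse_embl and the sample record are hypothetical; the record is invented for illustration and is not a real EMBL entry, so check these assumptions against the flat file you actually download.

```python
import re


def parse_embl(text):
    """Collect the five requested fields from one EMBL flat-file record."""
    record = {"accession": None, "description": "", "interpro": [],
              "protein_id": None, "length": None}
    de_parts = []
    for line in text.splitlines():
        # Columns 1-2 hold the line code; the content starts at column 6.
        code, body = line[:2], line[5:].strip()
        if code == "AC" and record["accession"] is None:
            # Assumption: the first accession on the first AC line is primary.
            record["accession"] = body.split(";")[0].strip()
        elif code == "DE":
            de_parts.append(body)  # descriptions may wrap over several lines
        elif code in ("DR", "FT"):
            # Matches both 'InterPro; IPR...' and '/db_xref="InterPro:IPR..."'.
            record["interpro"] += re.findall(r"InterPro[:;]\s*(IPR\d+)", body)
            match = re.search(r'/protein_id="([^"]+)"', body)
            if match and record["protein_id"] is None:
                record["protein_id"] = match.group(1)
        elif code == "SQ":
            match = re.search(r"Sequence\s+(\d+)\s+BP", body)
            if match:
                record["length"] = int(match.group(1))
        elif code == "//":
            break  # end of record
    record["description"] = " ".join(de_parts)
    return record


# Invented record for illustration only; NOT a real EMBL entry.
SAMPLE = """\
ID   ZZ000001; SV 1; linear; mRNA; STD; PLN; 12 BP.
XX
AC   ZZ000001;
XX
DE   Hypothetical example mRNA
DE   for illustration only
XX
FT   CDS             1..12
FT                   /db_xref="InterPro:IPR999991"
FT                   /db_xref="InterPro:IPR999992"
FT                   /protein_id="ZZZ00001.1"
XX
SQ   Sequence 12 BP; 3 A; 3 C; 3 G; 3 T; 0 other;
     atgcatgcat gc                                                            12
//
"""

if __name__ == "__main__":
    for field, value in parse_embl(SAMPLE).items():
        print(f"{field}: {value}")
```

To try it on a real record, the saved Text view could be read with open(path).read() and passed to parse_embl; the same line-code logic carries over directly to a Perl while-loop with regular expressions.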

Microarray databases

1. What is the Affymetrix probe ID for this gene?

2. Using Genevestigator, answer the following:

  • In which developmental stage is the expression of this gene at its highest?
  • In which part of the root does this gene typically exhibit higher expression: the lateral root or the endodermis?

3. Explain which criteria, other than co-expression, you would want to use in order to be convinced that two or more genes are truly transcriptionally co-regulated.

4. Name two uses of microarray technology apart from transcriptomics. Briefly describe each technique (1/3 page each).

Here is a coding sequence in FASTA format:

>Sequence Z

ATGTGGAGGCTGAGAACTGGACCGAAGGCTGGAGAGGATACTCACCTGTTCACCACCAAC

AACTATGCAGGGAGGCAGATTTGGGAATTTGATGCCAACGCAGGCTCTCCACAAGAAATT

GCCGAGGTAGAGGATGCTCGGCACAAATTCTCAGACAACACGTCACGTTTCAAGACTACT

GCCGATCTCTTATGGCGCATGCAGTTTCTTAGGGAGAAGAAATTCGAACAGAAGATTCCA

CGAGTGATAATCGAGGATGCAAGAAAGATAAAGTACGAAGATGCAAAGACAGCATTGAAA

AGAGGGTTACTCTATTTCACAGCCTTGCAGGCTGATGATGGACACTGGCCAGCTGAAAAC

TCTGGCCCAAATTTCTATACCCCTCCTTTTTTGATATGCTTGTACATCACTGGACATCTG

GAGAAAATCTTCACTCCCGAGCATGTTAAAGAGTTACTACGTCACATCTACAACATGCAG

AACGAAGATGGTGGGTGGGGTTTACACGTAGAAAGCCACAGTGTTATGTTCTGTACAGTC

ATTAATTACGTCTGTCTACGAATTGTGGGAGAAGAAGTCGGTCATGATGATCAAAGAAAT

GGTTGTGCAAAGGCTCATAAGTGGATCATGGACCATGGTGGTGCTACCTACACGCCCTTG

ATCGGAAAAGCGTTGCTTTCGGTTCTTGGAGTGTATGATTGGTCTGGCTGCAATCCTATA

CCTCCAGAGTTCTGGTTGCTTCCGTCTTCTTTTCCTGTTAATGGAGGGACTCTCTGGATT

TATTTACGGGATACTTTCATGGGGTTGTCATACTTGTATGGTAAAAAATTTGTGGCTCCC

CCAACACCTCTCATTCTCCAGCTCCGAGAAGAGCTTTATCCGGAGCCTTATGCAAAAATC

AATTGGACGCAAACACGAAACCGATGTGGAAAGGAAGATCTCTACTATCCACGCTCATTT

TTACAAGATTTGTTTTGGAAGAGTGTTCACATGTTCTCAGAGAGTATCCTAGATCGATGG

CCTTTAAACAAGCTAATAAGACAAAGAGCTCTTCAATCCACTATGGCACTCATTCACTAT

CATGACGAATCCACCAGATATATTACAGGCGGATGCCTGCCAAAGGCCTTTCATATGCTT

GCATGTTGGATAGAAGACCCTAAGAGTGATTATTTTAAAAAACATCTTGCTCGAGTTCGC

GAATACATATGGATTGGCGAGGATGGCCTGAAAATTCAATCTTTTGGTAGCCAATTATGG

GATACAGCCTTATCGCTACATGCATTACTAGACGGAATTGATGATCATGATGTTGATGAT

GAGATTAAAACAACGCTCGTTAAAGGATATGATTACTTGAAGAAATCACAAATTACAGAG

AACCCTCGCGGTGATCACTTCAAAATGTTTCGTCACAAGACAAAAGGTGGATGGACATTT

TCAGATCAAGATCAAGGATGGCCTGTTTCAGATTGTACTGCTGAAAGCTTAGAGTGTTGT

CTATTCTTCGAGAGCATGCCGTCCGAGCTTATTGGAAAAAAAATGGATGTGGAGAAACTC

TATGATGCCGTTGATTATCTTCTCTATCTGCAGAGTGATAATGGAGGCATAGCAGCATGG

CAACCAGTTGAAGGAAAAGCCTGGTTAGAGTTGTTAAATATCATGATTTTTAGGTATGTA

GAATGTACGGGGTCAGCGATTGCAGCATTGACTCAGTTTAACAAACAGTTTCCAGGGTAT

AAAAACGTAGAGGTTAAACGGTTTATAACAAAGGCTGCAAAGTACATTGAAGACATGCAA

ACGGTGGATGGTTCATGGTACGGAAATTGGGGAGTGTGTTTTATATACGGGACCTTCTTT

GCGGTAAGAGGTCTTGTGGCCGCTGGGAAGACTTACAGTAACTGTGAAGCAATTCGTAAA

GCAGTTCGTTTTCTTCTAGACACACAAAATCCGGAGGGTGGCTGGGGAGAGAGCTTTCTC

TCTTGTCCAAGCAAGAAATATACTCCTTTGAAAGGAAACAGCACAAATGTGGTGCAAACA

GCACAAGCACTTATGGTGCTAATTATGGGTGATCAGATGGAGAGAGATCCTTTACCGGTT

CATCGTGCTGCTCAAGTGTTGATCAATTCACAGTTGGATAATGGCGATTTTCCACAGCAG

GAAATAATGGGAACGTTCATGAGAACTGTGATGCTCCATTTTCCGACCTATAGGAACACG

TTCTCTCTTTGGGCTCTCACACATTACACACATGCTCTGCGACGTCTCCTCCCTTAA
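
Before submitting the sequence to BLAST, it can help to check that it was copied intact. The following is a minimal Python sketch, assuming a single-record FASTA string in which blank lines may separate the sequence rows, as in the listing above. The function name read_fasta and the toy record are hypothetical; the toy record is made up and is not Sequence Z.

```python
def read_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between sequence rows
        if line.startswith(">"):
            header = line[1:].strip()
        else:
            chunks.append(line.upper())
    return header, "".join(chunks)


# Made-up toy record for illustration; NOT Sequence Z.
TOY = ">toy example\n\nATGAAA\n\ncccTAA\n"

if __name__ == "__main__":
    name, seq = read_fasta(TOY)
    # A complete coding sequence should have a length divisible by three.
    print(name, len(seq), len(seq) % 3 == 0)
```

Pasting Sequence Z in place of the toy record gives its length in base pairs, which can be compared with the length reported in the database record.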

 
