Suppose you are given a particular protein named ‘Keratin’. How would you retrieve its (a) nucleic acid sequence, (b) protein sequence, (c) carbohydrate-binding site, if present, (d) protein chains, (e) amino acid frequency, etc.? Describe briefly.
All the required information about a protein such as keratin can be obtained from the appropriate databases.

(a) The nucleic acid sequence encoding keratin (the gene and its cDNA or mRNA) can be obtained from the NCBI and Ensembl databases. These resources hold the annotated sequences of human genes, as well as genes from many other organisms.

(b) The protein sequence can also be found in these databases (NCBI, Ensembl), as well as in the UniProt database. Alternatively, it can be derived by translating the cDNA or mRNA sequence with the ExPASy Translate tool, which shows the translation in each reading frame. In general, the reading frame that gives the longest uninterrupted translation product corresponds to the correct protein sequence.
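
The longest-product rule in (b) can be illustrated with a short Python sketch. It is not the ExPASy tool itself; it is a minimal approximation that assumes the standard genetic code, translates all six reading frames, and keeps the longest stretch that starts with methionine and runs to the next stop codon (or to the end of the sequence). The function names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def longest_orf_protein(sequence):
    """Return the longest Met-initiated product over all six reading frames."""
    dna = sequence.upper().replace("U", "T")  # accept mRNA as well as cDNA
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            # Each '*'-separated segment is a stretch without a stop codon.
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best
```

For a real keratin cDNA the result should still be compared against the curated UniProt entry, because the longest frame is only a heuristic.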

(c) / (d) Both the carbohydrate-binding site and the protein chains are structural features of the protein. They can be retrieved from the RCSB PDB database, which contained 164,174 biological macromolecular structures at the time of writing, together with their structural and functional annotations.
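
Once a structure file has been downloaded from RCSB PDB, the chains and any bound sugars can be listed with a few lines of Python. The sketch below assumes the legacy fixed-column PDB text format (residue name in columns 18 to 20, chain identifier in column 22). The set of sugar codes is a small illustrative selection, not a complete list, and the function name is hypothetical. A bound sugar only hints at a carbohydrate-binding site; the residues around it still have to be inspected.

```python
# A few common monosaccharide residue codes used in PDB files
# (illustrative and incomplete: an assumption of this sketch).
COMMON_SUGARS = frozenset({"NAG", "MAN", "BMA", "GAL", "GLC", "FUC"})


def summarize_pdb(pdb_text, sugar_names=COMMON_SUGARS):
    """Return (sorted chain IDs, sorted (sugar code, chain ID) pairs)."""
    chains = set()
    sugars = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM") or len(line) < 22:
            continue
        res_name = line[17:20].strip()  # columns 18-20
        chain_id = line[21]             # column 22
        if record == "ATOM":
            # ATOM records describe the polymer (protein) chains.
            chains.add(chain_id)
        elif res_name in sugar_names:
            # HETATM records describe ligands, including bound sugars.
            sugars.add((res_name, chain_id))
    return sorted(chains), sorted(sugars)
```

The function would be called on the text of a downloaded entry, for example `summarize_pdb(open("structure.pdb").read())`, where the file name is a placeholder.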

(e) The amino acid frequency can be calculated with the ExPASy ProtParam tool, which takes the one-letter amino acid sequence as input and reports the percentage of each amino acid in the protein.
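
The same percentage calculation is easy to reproduce locally. The following Python sketch is not ProtParam's own code; it simply counts the one-letter residues and converts the counts to percentages. The function name is hypothetical.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return the percentage of each residue in a one-letter protein sequence."""
    # Ignore whitespace, digits and other non-letter characters.
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    counts = Counter(residues)
    total = len(residues)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}
```

For example, `amino_acid_composition("MKKA")` gives 25% for M, 50% for K and 25% for A.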

