Answer to Question #131758 in Bioinformatics for Usman Khan

Question #131758
It is noticed that major sequence alignment methods differ in approach, computational complexity and accuracy. Do you agree with this? Explain it with examples.
Expert's answer
2020-09-07T06:25:15-0400

Many algorithms and software programs have been implemented for inferring multiple sequence alignments of protein and DNA sequences. Alignment databases provide reference alignments against which the accuracy and speed of different programs can be gauged, but they also have several disadvantages. Even though the database alignments are manually curated, misalignments are still possible, and these would distort any accuracy assessment. Real protein sequence data also often contains non-homologous terminal ends and/or incomplete sequences, and the large terminal gaps these produce affect alignment accuracy.
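To make "gauging accuracy against a reference alignment" concrete, here is a minimal sketch of a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that a test alignment also places in the same column. The function names `aligned_pairs` and `sp_score` are hypothetical and chosen for illustration; benchmark suites may define their scores somewhat differently.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' is a gap).
    A residue is identified as (sequence index, residue index).
    """
    n_cols = len(alignment[0])
    assert all(len(row) == n_cols for row in alignment)
    next_residue = [0] * len(alignment)  # residue counter per sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, next_residue[s]))
                next_residue[s] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs recovered by the test alignment."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(ref & aligned_pairs(test)) / len(ref)


reference = ["ACGT", "A-GT"]
test = ["ACGT", "AG-T"]
print(sp_score(test, reference))
```

A misaligned column in the reference itself would lower this score for a program that aligned the column correctly, which is the assessment problem described above.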

So let's look at some examples.

ProbCons (Probabilistic Consistency-based multiple sequence alignment)

Of the programs described here, ProbCons is the only one that uses a probabilistic consistency method of alignment. It modifies the traditional sum-of-pairs scoring system and, in addition, incorporates a pair-hidden Markov model-based progressive alignment algorithm. The alignment procedure is divided into four steps. It starts with the computation of posterior-probability matrices for every pair of sequences, followed by a dynamic programming calculation of the expected accuracy of every pairwise alignment.
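The expected-accuracy step can be sketched as a small dynamic program. This is an illustrative sketch, not ProbCons' own code: it assumes the posterior match probabilities have already been computed (in ProbCons they come from the pair-HMM) and are passed in as a plain matrix `post`, where `post[i][j]` is the probability that residue i of one sequence is aligned to residue j of the other. The function name `max_expected_accuracy` is hypothetical.

```python
def max_expected_accuracy(post):
    """Find the pairwise alignment maximising the summed posterior
    probabilities of its aligned residue pairs (gaps contribute zero).

    Returns (expected accuracy score, list of aligned (i, j) pairs).
    """
    n, m = len(post), len(post[0])
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + post[i - 1][j - 1],  # align i with j
                score[i - 1][j],                           # gap in second sequence
                score[i][j - 1],                           # gap in first sequence
            )

    # Traceback; gap moves are tried first so that only pairs which
    # actually add to the score are reported.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j]:
            i -= 1
        elif score[i][j] == score[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical posterior matrix for a 2-residue and a 3-residue sequence.
post = [
    [0.9, 0.1, 0.0],
    [0.1, 0.2, 0.8],
]
total, pairs = max_expected_accuracy(post)
print(total, pairs)
```

The table has one cell per residue pair, so this step costs time proportional to the product of the two sequence lengths, and it is repeated for every pair of sequences.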

Dialign-T

Dialign-T does not use pre-calculated tables to obtain weight scores; instead, it calculates probability tables from several substitution matrices. In addition, the greedy-like multiple alignment algorithm from Dialign2.2 was changed in order to avoid spurious local similarities.

MUSCLE

MUSCLE uses a pairwise profile alignment approach. The program first builds a progressive alignment, which is then improved and refined in two subsequent stages. The progressive alignment is created after sequence similarities have been computed, distances have been estimated from them, and a UPGMA tree has been built from the distances.
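The similarity, distance and UPGMA steps can be sketched as follows. This is a simplified illustration under stated assumptions, not MUSCLE's implementation: the distance here is one minus the fraction of shared k-mers, every sequence is assumed to be at least `k` residues long, and the names `kmer_distance`, `upgma` and `guide_tree` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmer_distance(a, b, k=3):
    """1 - (shared k-mers / k-mers in the shorter sequence)."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    shared = sum((ca & cb).values())  # multiset intersection
    return 1.0 - shared / (min(len(a), len(b)) - k + 1)


def upgma(names, dist):
    """Build a UPGMA tree as nested tuples.

    `dist` maps frozenset({x, y}) to the distance between x and y.
    """
    dist = dict(dist)
    clusters = {name: 1 for name in names}  # cluster label -> size
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: dist[frozenset(p)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for c in clusters:
            dist[frozenset((merged, c))] = (
                size_a * dist[frozenset((a, c))] + size_b * dist[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


def guide_tree(seqs, k=3):
    """`seqs` maps a sequence name to its residues."""
    names = list(seqs)
    dist = {
        frozenset((a, b)): kmer_distance(seqs[a], seqs[b], k)
        for a, b in combinations(names, 2)
    }
    return upgma(names, dist)


seqs = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "TTTTGGGGCC"}
print(guide_tree(seqs))
```

Counting k-mers needs no alignment at all, which is one reason an approach of this kind can be fast compared with computing a full pairwise alignment for every pair.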

Clustal W

Clustal W is probably the most widely used alignment program and the oldest of the packages discussed here. The software performs a progressive alignment in three steps. First, pairwise sequence comparisons are used to calculate a distance matrix that stores sequence divergence. Second, a guide tree is built from this matrix using Neighbor Joining. Third, the sequences are aligned according to the branch order in the guide tree. Gap penalties depend mainly on factors such as the weight matrix, sequence length and similarity.
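The first step, turning pairwise comparisons into a distance matrix, can be sketched with a basic global (Needleman-Wunsch) alignment. This is an illustrative sketch rather than Clustal W's code: the scoring values (`match=1`, `mismatch=-1`, linear `gap=-2`) are assumptions picked for the example, and the distance is taken as one minus the fraction of identical residues over the compared (non-gap) columns. The names `global_align`, `identity_distance` and `distance_matrix` are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity_distance(a, b):
    """1 - fraction of identical residues over the non-gap columns."""
    x, y = global_align(a, b)
    compared = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    identical = sum(p == q for p, q in compared)
    return 1.0 - identical / len(compared)


def distance_matrix(seqs):
    """Symmetric matrix of pairwise distances for a list of sequences."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = identity_distance(seqs[i], seqs[j])
    return d


d = distance_matrix(["ACGT", "ACGA", "TTTT"])
print(d)
```

Each pairwise alignment fills a table of size len(a) × len(b), and there is one such alignment for every pair of sequences, so this step grows quadratically in both the number of sequences and their length.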

