HMMER


HMMER is a free and commonly used software package for sequence analysis written by Sean Eddy.[1] It is generally used to identify homologous protein or nucleotide sequences, which it does by comparing a profile-HMM to either a single sequence or a database of sequences. Sequences that score significantly better against the profile-HMM than against a null model are considered homologous to the sequences that were used to construct the profile-HMM. Within the HMMER package, profile-HMMs are constructed from a multiple sequence alignment using the hmmbuild program. The profile-HMM implementation used in HMMER is based on the work of Krogh and colleagues.[2] HMMER is a console utility that has been ported to every major operating system, including different versions of Linux, Windows, and Mac OS.
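The idea of scoring a sequence against a profile relative to a null model can be illustrated with a minimal Python sketch. It is a deliberate simplification and not HMMER's implementation: it keeps only match-state emissions (no insert or delete states and no transition probabilities), and the toy alignment, the uniform background, and the function names `build_profile` and `log_odds_score` are all assumptions made for illustration.

```python
import math

# Hypothetical toy DNA alignment; every column is treated as a match state.
ALIGNMENT = ["ACGT", "ACGA", "ACGT", "TCGT"]
ALPHABET = "ACGT"
# Assumed null model: residues are drawn uniformly at random.
BACKGROUND = {ch: 1.0 / len(ALPHABET) for ch in ALPHABET}


def build_profile(alignment, pseudocount=1.0):
    """Estimate per-column emission probabilities with additive smoothing."""
    profile = []
    for column in zip(*alignment):
        total = len(column) + pseudocount * len(ALPHABET)
        profile.append(
            {ch: (column.count(ch) + pseudocount) / total for ch in ALPHABET}
        )
    return profile


def log_odds_score(profile, seq):
    """Return the score of seq in bits: log2 P(seq | profile) / P(seq | null)."""
    if len(seq) != len(profile):
        raise ValueError("this sketch only scores sequences of profile length")
    return sum(
        math.log2(column[ch] / BACKGROUND[ch]) for column, ch in zip(profile, seq)
    )


profile = build_profile(ALIGNMENT)
print(log_odds_score(profile, "ACGT"))  # positive: resembles the alignment
print(log_odds_score(profile, "GTAC"))  # negative: better explained by the null model
```

A positive score means the profile explains the sequence better than the null model does. A real profile-HMM additionally models insertions and deletions, and whether a score counts as significant is decided from its statistical distribution rather than from its sign.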

HMMER
Developer(s): Sean Eddy
Stable release: 3.0 / 28 March 2010
Preview release: 3.1b1 / May 2013
Written in: C
Available in: English
Type: Bioinformatics tool
License: GPL
Website: hmmer.janelia.org

HMMER is the core utility that protein family databases such as Pfam and InterPro are based upon. Some other bioinformatics tools such as UGENE also use HMMER.

HMMER3 is a complete rewrite of the earlier HMMER2 package, with the aim of improving the speed of profile-HMM searches. The main performance gain comes from a heuristic filter that finds high-scoring ungapped matches between database sequences and a query profile. With this heuristic, computation time is comparable to that of BLAST, with little impact on accuracy. Further performance gains come from a log-likelihood model that requires no calibration for estimating E-values and that allows the more accurate forward scores to be used for computing the significance of a homologous sequence.[3]
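As a rough illustration of an ungapped filter, the Python sketch below finds the single best-scoring ungapped segment between a profile and a sequence by extending matches along diagonals only. It is a simplified stand-in rather than the filter HMMER3 actually uses, and the scoring values and the names `make_scores` and `best_ungapped_score` are hypothetical.

```python
def make_scores(consensus, alphabet="ACGT", match=2.0, mismatch=-1.0):
    """Build a toy position-specific score table from a consensus string.

    A real profile would hold log-odds scores; match/mismatch values are
    assumed here only to keep the example small.
    """
    return [
        {ch: (match if ch == cons else mismatch) for ch in alphabet}
        for cons in consensus
    ]


def best_ungapped_score(scores, seq):
    """Return the best local score over all ungapped (diagonal) segments."""
    best = 0.0
    prev = [0.0] * (len(seq) + 1)  # best scores ending at the previous profile position
    for column in scores:
        cur = [0.0] * (len(seq) + 1)
        for i, residue in enumerate(seq):
            # Extend the diagonal or restart from zero; gaps are never allowed.
            cur[i + 1] = max(0.0, prev[i] + column[residue])
            best = max(best, cur[i + 1])
        prev = cur
    return best


scores = make_scores("ACGT")
print(best_ungapped_score(scores, "TTACGTTT"))  # contains the full consensus
print(best_ungapped_score(scores, "GGGG"))      # only isolated single matches
```

In a filtering pipeline, only sequences whose ungapped score exceeds some threshold would be passed on to the slower, gapped profile-HMM algorithms, which is why most of the database can be rejected cheaply.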

HMMER3 also makes extensive use of vector instructions to increase computational speed. This work is based upon an earlier publication that showed a significant acceleration of the Smith-Waterman algorithm for aligning two sequences.[4]

See also

Several implementations of profile HMM methods and related position-specific scoring matrix methods are available.

References

  1. Durbin, Richard; Sean R. Eddy; Anders Krogh; Graeme Mitchison (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press. ISBN 0-521-62971-3.
  2. Krogh A, Brown M, Mian IS, Sjölander K, Haussler D (February 1994). "Hidden Markov models in computational biology. Applications to protein modeling". J. Mol. Biol. 235 (5): 1501–31. doi:10.1006/jmbi.1994.1104. PMID 8107089.
  3. Eddy SR (2008). Rost, Burkhard (ed.). "A probabilistic model of local sequence alignment that simplifies statistical significance estimation". PLoS Comput Biol. 4 (5): e1000069. doi:10.1371/journal.pcbi.1000069. PMC 2396288. PMID 18516236.
  4. Farrar M (January 2007). "Striped Smith-Waterman speeds database searches six times over other SIMD implementations". Bioinformatics. 23 (2): 156–61. doi:10.1093/bioinformatics/btl582. PMID 17110365.