GHMM and HMMEd: A toolkit for Hidden Markov Models

Wasinee Rungsarityotin (1), Alexander Schliep (2)
(1) rungsari@molgen.mpg.de, Max Planck Institute for Molecular Genetics; (2) schliep@molgen.mpg.de, Max Planck Institute for Molecular Genetics

Applications of HMMs require an appropriate model topology (the number of states and the allowed and forbidden transitions) whose parameters are then trained from data to produce the final model. This process requires manual assistance and good knowledge of the problem domain. We have implemented a toolkit for general Hidden Markov Models, consisting of GHMM and HMMEd, to assist in designing a topology and visualizing the parameters of an HMM. We have used the toolkit to solve problems such as identifying circular permutations with profile HMMs and studying the effect of gap costs on scoring HMMs. Once the parameters are set up, the model can be saved as an XML file, which can be used later for training, scoring, or generating sequences. The probability of a new sequence is evaluated with the Forward algorithm, and the most probable path through the state space is recovered with the Viterbi algorithm. GHMM and HMMEd are freely available at http://sourceforge.net/projects/ghmm/, and Gato at http://algorithmics.molgen.mpg.de/gato.html.
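
To illustrate the two scoring steps named above, the following is a minimal Python sketch of the Forward and Viterbi recursions for a discrete HMM. It is not the GHMM API: the function names `forward` and `viterbi`, the list-based layout of `pi`, `A` and `B`, and the two-state example model are assumptions made for illustration only. A forbidden transition in the topology is represented here simply as a zero entry in the transition matrix, and the observation sequence is assumed to be non-empty.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def forward(pi: Sequence[float], A: Matrix, B: Matrix, obs: Sequence[int]) -> float:
    """Return P(obs | model), summing over all state paths (Forward algorithm)."""
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(pi: Sequence[float], A: Matrix, B: Matrix,
            obs: Sequence[int]) -> Tuple[List[int], float]:
    """Return the most probable state path and its joint probability."""
    n = len(pi)
    # delta[i] = probability of the best path ending in state i
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back: List[List[int]] = []  # back[t][j] = best predecessor of state j
    for o in obs[1:]:
        prev = delta
        ptr = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[ptr[j]] * A[ptr[j]][j] * B[j][o] for j in range(n)]
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical two-state left-to-right model: the transition 1 -> 0 is forbidden.
PI = [1.0, 0.0]
A = [[0.5, 0.5],
     [0.0, 1.0]]
B = [[0.9, 0.1],   # emission probabilities of state 0 for symbols 0 and 1
     [0.2, 0.8]]   # emission probabilities of state 1 for symbols 0 and 1

if __name__ == "__main__":
    observations = [0, 1]
    print(forward(PI, A, B, observations))
    print(viterbi(PI, A, B, observations))
```

The sketch multiplies raw probabilities, which underflows on long sequences; practical implementations typically rescale the forward variables or work in log space.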