Motif searching is an important step in the detection of rare events occurring in a set of DNA or proteinsequences. One formulation of the problem is known as (l, d)-motif search or Planted Motif Search(PMS).
In PMS we are given two integers l and d and n biological sequences. We want to find allsequences of length l that appear in each of the input sequences with at most d mismatches.
The PMSproblem is NP-complete. PMS algorithms are typically evaluated on certain instances consideredchallenging.
Despite ample research in the area, a considerable performance gap exists because manystate of the art algorithms have large runtimes even for moderately challenging instances.
Results:
This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithmto solve the challenging (l, d) instances (25, 10) and (26, 11).
PMS8 is also efficient on instanceswith larger l and d such as (50, 21). We include a comparison of PMS8 with several stateof the art algorithms on multiple problem instances.
This paper also presents necessary and sufficientconditions for 3 l-mers to have a common d-neighbor. The program is freely available athttp://engr.uconn.edu/~man09004/PMS8/.
Conclusions:
We present PMS8, an efficient exact algorithm for Planted Motif Search.
PMS8 introduces novelideas for generating common neighborhoods. We have also implemented a parallel version for thisalgorithm.
PMS8 can solve instances not solved by any previous algorithms.
Author: Marius NicolaeSanguthevar Rajasekaran
Credits/Source: BMC Bioinformatics 2014, 15:34
Published on: 2014-01-31
Tweet
News Provider: EUPB – European Press Bureau
Social Bookmarking
RETWEET This! | Digg this! | Post to del.icio.us | Post to Furl | Add to Netscape | Add to Yahoo! | Rojo
There are no comments available. Be the first to write a comment.