Efficient Sequential and Parallel Algorithms for Planted Motif Search


Motif searching is an important step in the detection of rare events occurring in a set of DNA or proteinsequences. One formulation of the problem is known as (l, d)-motif search or Planted Motif Search(PMS).

In PMS we are given two integers l and d and n biological sequences. We want to find allsequences of length l that appear in each of the input sequences with at most d mismatches.

The PMSproblem is NP-complete. PMS algorithms are typically evaluated on certain instances consideredchallenging.

Despite ample research in the area, a considerable performance gap exists because manystate of the art algorithms have large runtimes even for moderately challenging instances.

Results:
This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithmto solve the challenging (l, d) instances (25, 10) and (26, 11).

PMS8 is also efficient on instanceswith larger l and d such as (50, 21). We include a comparison of PMS8 with several stateof the art algorithms on multiple problem instances.

This paper also presents necessary and sufficientconditions for 3 l-mers to have a common d-neighbor. The program is freely available athttp://engr.uconn.edu/~man09004/PMS8/.

Conclusions:
We present PMS8, an efficient exact algorithm for Planted Motif Search.

PMS8 introduces novelideas for generating common neighborhoods. We have also implemented a parallel version for thisalgorithm.

PMS8 can solve instances not solved by any previous algorithms.

Author: Marius NicolaeSanguthevar Rajasekaran
Credits/Source: BMC Bioinformatics 2014, 15:34

Published on: 2014-01-31

Tweet

News Provider: EUPB – European Press Bureau

Social Bookmarking
RETWEET This! | Digg this! | Post to del.icio.us | Post to Furl | Add to Netscape | Add to Yahoo! | Rojo

There are no comments available. Be the first to write a comment.