eScholarship Repository, California Digital Library
Sequence Similarity Search Using Discrete Fourier and Wavelet Transformation Techniques
Alireza A. Aghili, University of California, Santa Barbara
D Agrawal
A El Abbadi

ABSTRACT:

In this paper, we study the problem of sequence similarity search. We incorporate vector transformations and apply DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques to reduce the search space and search time of sequence similarity range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA contig databases demonstrate up to a 50-fold reduction of the search space (filtration ratio) and up to 13 times faster filtration. The proposed transformation techniques may easily be integrated as a pre-processing phase on top of current similarity search heuristics and techniques, such as BLAST, PatternHunter, FastA and QUASAR, to efficiently prune non-relevant sequences. We study the precision of applying dimensionality reduction techniques for faster and more efficient range query searches and discuss the imposed trade-offs.
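
The filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's vector mapping or distance function, so everything here is an assumption made for illustration: each sequence is mapped to a k-mer frequency vector, an orthonormal Haar transform is applied, only the first few coefficients are kept, and Euclidean distance is used for the range query. The function names (kmer_frequency_vector, haar_transform, range_filter) and the parameters k and keep are hypothetical and are not taken from the paper.

```python
import math
from itertools import product

ALPHABET = "ACGT"  # assumed DNA alphabet; other symbols are skipped


def kmer_frequency_vector(seq, k=2):
    """Count each k-mer in seq; the vector has 4**k entries (a power of two)."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    vec = [0.0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            vec[j] += 1.0
    return vec


def haar_transform(vec):
    """Orthonormal Haar DWT; coarsest coefficients come first in the output."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(vec)
    root2 = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:half] = avg          # scaled pairwise averages, refined next pass
        out[half:length] = diff   # detail coefficients for this level
        length = half
    return out


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduce_vector(seq, k=2, keep=4):
    """Keep only the first `keep` Haar coefficients of the k-mer vector."""
    return haar_transform(kmer_frequency_vector(seq, k))[:keep]


def range_filter(query, database, radius, k=2, keep=4):
    """Return candidates whose reduced-space distance to query is <= radius."""
    q = reduce_vector(query, k, keep)
    return [s for s in database
            if euclidean(q, reduce_vector(s, k, keep)) <= radius]
```

Because the transform in this sketch is orthonormal, the distance computed on the truncated coefficients never exceeds the distance on the full vectors. The filter can therefore admit false positives but does not discard a sequence that lies within the radius under the assumed Euclidean distance on k-mer vectors; surviving candidates would still need to be verified by a full comparison, for example with one of the tools named in the abstract.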

SUGGESTED CITATION:
Alireza A. Aghili, D Agrawal, and A El Abbadi, "Sequence Similarity Search Using Discrete Fourier and Wavelet Transformation Techniques" (2005). International Journal On Artificial Intelligence Tools. 14 (5), pp. 733-754. Postprint available free at: http://repositories.cdlib.org/postprints/1690

 