eScholarship Repository, California Digital Library

Department of Statistics, UCLA
University of California, Los Angeles

Coupling Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species
Qing Zhou, UCLA Department of Statistics
Wing H. Wong, Stanford

Download the Paper (798 K, PDF file) - January 1, 2007

ABSTRACT:
Cis-regulatory modules (CRMs) composed of multiple transcription factor binding sites (TFBSs) control gene expression in eukaryotic genomes. Comparative genomic studies have shown that these regulatory elements are more conserved across species than non-functional sequences due to evolutionary constraints. We propose a statistical method to combine module structure and cross-species orthology in de novo motif discovery. We use a hidden Markov model (HMM) to capture the module structure in each species and couple these HMMs through multiple-species alignment. Evolutionary models are incorporated to account for correlated structures among aligned sequence positions across different species. Based on our model, we develop a Markov chain Monte Carlo approach, MultiModule, to discover CRMs and their component motifs simultaneously in groups of orthologous sequences from multiple species. Our method is tested on both simulated and biological data sets in mammals and Drosophila, where significant improvement over other motif and module discovery methods is observed.
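
To make the single-species building block concrete, here is a minimal sketch of a two-state HMM over one DNA sequence, with a background state and a "module" state, scored by the standard forward algorithm. It is not the MultiModule model: the state names, the transition and emission probabilities, and the function names are all illustrative assumptions. The coupling across species, the evolutionary models, and the MCMC sampler described in the abstract are not shown.

```python
import math

# Hypothetical two-state HMM: "B" = background, "M" = inside a module.
# Every probability below is an illustrative assumption, not a value
# taken from the paper.
STATES = ("B", "M")
START = {"B": 0.9, "M": 0.1}
TRANS = {
    "B": {"B": 0.95, "M": 0.05},
    "M": {"B": 0.10, "M": 0.90},
}
EMIT = {
    "B": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uniform background
    "M": {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},  # GC-rich toy module
}


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(seq):
    """Return log P(seq) under the toy HMM via the forward algorithm."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    # Initialisation: start probability times emission of the first base.
    alpha = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    # Recursion: sum over predecessor states, then emit the current base.
    for base in seq[1:]:
        alpha = {
            s: math.log(EMIT[s][base])
            + log_sum_exp([alpha[r] + math.log(TRANS[r][s]) for r in STATES])
            for s in STATES
        }
    # Termination: sum over the final hidden state.
    return log_sum_exp(list(alpha.values()))
```

Calling `forward_log_likelihood("GCGCGCGC")` and `forward_log_likelihood("ATATATAT")` gives a higher log-likelihood for the GC-rich sequence, because only the toy module state favours G and C. A coupled model in the spirit of the abstract would additionally tie the hidden states of aligned positions across species, which this sketch does not attempt.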

SUGGESTED CITATION:
Qing Zhou and Wing H. Wong, "Coupling Hidden Markov Models for the Discovery of Cis-Regulatory Modules in Multiple Species" (January 1, 2007). Department of Statistics, UCLA. Department of Statistics Papers. Paper 2007010103.
http://repositories.cdlib.org/uclastat/papers/2007010103

 