Systematic characterization of cis-regulation in C.elegans using evolutionary conservation
by Cheng, Donavan, Ph.D., THE JOHNS HOPKINS UNIVERSITY, 2009, 172 pages; 3395668

Abstract:

Comparative genomics approaches for cis-regulatory element detection typically rely on sequence alignment, even though recent studies show only modest overlap (∼50%) between confirmed regulatory elements and regions of high sequence alignability. This dissertation develops alignment-independent approaches for detecting conserved cis-regulatory elements and modules, and is organized in three parts.

In the first study, we present Flipper, a novel alignment-independent, Gibbs-sampling-based algorithm that weights over-representation and evolutionary conservation equally to detect conserved DNA regulatory elements ab initio from orthologous sequence. Flipper performs up to 23% better than existing methods at recovering seeded motifs from synthetic test data and also recovers more known motifs from yeast, worm and fly ChIP-chip data. To discover novel regulatory motifs, we ran Flipper on promoters of sets of coexpressed genes in C. elegans. We focused on the ribosomal protein (RP) gene cluster, as it is highly coexpressed, yet little is known about its regulation. Flipper detected 22 motifs associated with the RP promoters, of which four (M546, M313, M540 and M439) were significantly conserved and specific to the RP gene cluster in C. elegans and its relatives C. remanei, C. briggsae and C. brenneri.

In the second study, we used a promoter::mCherry transcriptional reporter assay to test our predicted motifs for function. Mutating M546 severely abrogated mCherry expression in 8 of 11 tested promoters; similarly, M313 was necessary for promoter function in 4 of 9 cases, M540 in 3 of 7 cases and M439 in 1 of 3 cases. In a promoter "transplant" experiment, we demonstrated that M546 and M540 are functionally conserved and are necessary for C. briggsae promoters to drive mCherry expression in C. elegans. M546 and M540 also occur in a large number of non-ribosomal promoters, and we show that M546 is necessary for function in the mcm-7 promoter, even though the expression profile of mcm-7 is markedly different from that of the RP genes.

In the third study, we demonstrate that the rules governing the organization of cis-regulatory elements in modules, in terms of relative spacing, positioning and orientation constraints, can also be conserved across species. Using this information, we discover a strong, conserved spacing and orientation bias in pairs of co-occurring M546 and M540 sites in RP promoters. In a "sequence swap" experiment, we disrupted the spacing between the M546 and M540 sites and showed that this severely affects rps-7 promoter function. We show that a large number of non-ribosomal promoters contain M546 and M540 sites because these sites reside in an arm of the CELE2 transposon, which happened to insert itself into these promoters. Interestingly, the M546-M540 pairs in these promoters do not obey the RP spacing constraint, and these promoters are not enriched in any common GO annotations. In contrast, other non-ribosomal promoters containing M546-M540 pairs that do obey the RP spacing constraint are strongly enriched for growth and development GO annotations (p < 10^-9), which is consistent with the need for RP biogenesis.

In summary, using an alignment-independent approach, we have identified conserved cis-regulatory elements necessary for RP gene expression in C. elegans, with the M546 and M540 motifs possibly forming part of a regulatory module that is involved in more general regulation of growth and early development processes.
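The spacing and orientation analysis described in the third study can be illustrated with a minimal Python sketch. It is not the dissertation's method: the abstract does not give the M546 or M540 sequences, so the two consensus strings below are hypothetical placeholders, and exact string matching stands in for whatever motif model was actually used. The function names and the `max_gap` parameter are likewise assumptions made for illustration. The sketch scans both strands of each promoter, records the offset and relative strand of every motif pair, and tallies them so that a spacing or orientation bias would show up as a dominant bin.

```python
from collections import Counter

# Hypothetical placeholder consensus strings. The abstract does not state
# the real M546 or M540 sequences; these are for illustration only.
MOTIF_A = "GCGCAC"  # stand-in for M546
MOTIF_B = "TTGATA"  # stand-in for M540

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, motif):
    """Return (start, strand) for every exact motif match on either strand.

    Exact matching is a simplification; a real analysis would more likely
    score sites against a position weight matrix.
    """
    rc = reverse_complement(motif)
    sites = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        elif window == rc:
            sites.append((i, "-"))
    return sites


def pair_geometry(seq, motif_a, motif_b, max_gap=50):
    """Return (spacing, orientation) for each A/B site pair within max_gap bp.

    spacing is start(B) - start(A); orientation is strand(A) + strand(B),
    for example "++" or "+-".
    """
    pairs = []
    for a_pos, a_strand in find_sites(seq, motif_a):
        for b_pos, b_strand in find_sites(seq, motif_b):
            spacing = b_pos - a_pos
            if abs(spacing) <= max_gap:
                pairs.append((spacing, a_strand + b_strand))
    return pairs


def geometry_histogram(promoters, motif_a, motif_b, max_gap=50):
    """Count (spacing, orientation) combinations over a set of promoters."""
    counts = Counter()
    for seq in promoters:
        counts.update(pair_geometry(seq.upper(), motif_a, motif_b, max_gap))
    return counts
```

Calling `geometry_histogram(promoters, MOTIF_A, MOTIF_B)` on a list of promoter strings returns a `Counter` keyed by `(spacing, orientation)`. Comparing such a histogram between RP promoters and, say, transposon-derived promoters is one simple way to see whether the two sets share a spacing constraint.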

 
Adviser: Michael A. Beer
School: THE JOHNS HOPKINS UNIVERSITY
Source: DAI/B 71-01, p. , Apr 2010
Source Type: Dissertation
Subjects: Biostatistics; Genetics; Bioinformatics
Publication Number: 3395668
Access the complete dissertation:

This is an open access dissertation.
  Use the link below to access the full text PDF of this graduate work:
  http://gradworks.umi.com/3395668.pdf
  Use the link below to search and retrieve all open access dissertations:
  http://pqdtopen.proquest.com
