Molecular evolution of overlapping genes
by Sabath, Niv, Ph.D., UNIVERSITY OF HOUSTON, 2009, 119 pages; 3405062

Abstract:

Overlapping genes are defined as a pair of protein-coding genes whose coding regions overlap on either the same strand or opposite strands. The sequence interdependence between two overlapping coding regions adds complexity to almost all molecular evolution analyses. Here, I use a comparative-genomic approach to resolve several open questions concerning the evolution of overlapping genes. I demonstrate that estimates of selection intensity that ignore gene overlap are biased and that the magnitude and direction of this bias depend on the type of overlap. I present a new method for the simultaneous estimation of selection intensities in overlapping genes. I show that overlapping genes are mostly subjected to purifying selection, in contradistinction to previous studies, which ignored the interdependence between overlapping reading frames and detected an inordinate prevalence of positive selection. Using simulation and two case studies, I show that this method can be used to distinguish between spurious and functional overlapping genes by using purifying selection as a telltale sign of functionality. In the first study, I test for the functionality of a hypothetical overlapping gene, which is central to the "Rosetta stone" hypothesis for the origin of the two aminoacyl tRNA synthetase classes from a pair of overlapping genes. I find no evidence of selection acting on the hypothetical gene, implying that the gene is non-functional, and thus reject the "Rosetta stone" hypothesis. In the second study, I search for unannotated overlapping genes in viral genomes. I present evidence for the existence of a novel overlapping gene in the genomes of four viruses that infect Hymenoptera. In another study, I present a method for the detection of selection signatures on hypothetical overlapping genes using population-level data. I apply the method to test whether the hypothetical gene in influenza A is under selection.
Finally, I study a previously unexplained difference in the frequencies of overlapping genes of different types. I show that the structure of the genetic code and the abundance of different amino acids in proteins explain this difference between overlap types and lead to a correlation between overlap frequency and genomic composition.

 
Advisor
School: UNIVERSITY OF HOUSTON
Source: DAI/B 71-04, p. , May 2010
Source Type: Dissertation
Subjects: Biostatistics; Bioinformatics; Virology
Publication Number: 3405062
