Knowledge-based modeling of RNA 3D structure
by Jonikas, Magdalena Anna, Ph.D., STANFORD UNIVERSITY, 2009, 147 pages; 3382755

Abstract:

RNA's unique ability both to act as a messenger of genetic information (mRNA) and to carry out complex chemical reactions in the cell distinguishes it from other biological polymers. By adopting complex three-dimensional (3D) structures, RNA molecules are able to perform functions including catalysis and regulation of transcription and translation. Understanding the functions of these molecules depends critically on knowing their structure. However, creating 3D structural models of RNA remains a significant challenge. In this work, I present two methods that apply an informatics approach to this problem, along with applications of these methods to RNA molecules.

The first method, the Nucleic Acid Simulation Tool (NAST), builds low-resolution 3D structure predictions from limited information about the molecule. NAST uses a one-point-per-residue simplification of RNA structure and statistics of geometries observed in known RNA 3D structures to reduce the computational complexity of the problem and generate a large ensemble of solutions in a reasonable amount of time. Starting with the primary sequence and a secondary structure prediction, we use this automated tool to build coarse-grained models of several RNA molecules and compare them to their known crystal structures. We also use NAST to combine information from different sources, such as partial crystal structures and hand-made models built by modeling experts.
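To make the one-point-per-residue idea concrete, the following is a minimal sketch, not NAST's actual code. It assumes each residue has already been reduced to a single 3D point (which atom represents the residue is left to the caller) and computes the consecutive distances and angles along the chain, the kind of geometric quantities whose statistics could be collected from known structures. The names `angle_deg` and `chain_geometry` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def angle_deg(a: Point, b: Point, c: Point) -> float:
    """Angle at b (in degrees) formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


def chain_geometry(points: List[Point]) -> Tuple[List[float], List[float]]:
    """Distances between consecutive residues and angles over consecutive triples.

    `points` holds one representative coordinate per residue, in sequence order.
    """
    dists = [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]
    angles = [
        angle_deg(points[i], points[i + 1], points[i + 2])
        for i in range(len(points) - 2)
    ]
    return dists, angles


# Hypothetical three-residue chain with a right-angle bend.
example_dists, example_angles = chain_geometry(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
)
```

Collecting these values over many known structures would give the distributions against which a candidate coarse-grained model could be scored.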

The second method, Coarse to Atomic (C2A), uses geometric information in known RNA 3D structures to add full atomic resolution to coarse-grained structural models, such as the ones generated by NAST. We use the existing physics-based molecular dynamics engine GROMACS to minimize the full atomic structures and ensure their chemical reasonability. The resulting full atomic structures remain within 1 Å of the coarse-grained template used as input and can be used as starting structures for physics-based modeling and dynamics methods.
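A check of the 1 Å criterion might look like the sketch below. It assumes the deviation is measured as a root-mean-square deviation (RMSD) over matched per-residue points, which the abstract does not specify, and that both coordinate sets are already in the same frame (no superposition is performed). The function names and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def rmsd(template: List[Point], model: List[Point]) -> float:
    """Root-mean-square deviation between two equally long, pre-aligned point lists."""
    if len(template) != len(model) or not template:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(template, model))
    return math.sqrt(squared / len(template))


def within_tolerance(template: List[Point], model: List[Point],
                     tol_angstrom: float = 1.0) -> bool:
    """True if the model stays within `tol_angstrom` RMSD of the template."""
    return rmsd(template, model) <= tol_angstrom


# Hypothetical data: the coarse-grained template, and the same per-residue
# points extracted from an all-atom model, each displaced by 0.5 Å along z.
template_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_pts = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
deviation = rmsd(template_pts, model_pts)
```

In practice the two structures would first be superimposed; that step is omitted here to keep the example short.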

In this dissertation, we demonstrate several applications of these methods. First, we modeled RNA folding intermediates and pathways. RNA molecules form their 3D structures through complex folding pathways that include intermediate conformations, some of which last long enough to be considered trapped states. We used NAST to model RNA folding pathways and intermediates, including such "trapped" states.

By pipelining NAST and C2A, we have created an automated system for knowledge-based prediction of full atomic 3D structures. This method requires no modeling experience from the user and only limited information about the molecule, making possible the large-scale 3D structure prediction of all RNA molecules with predicted secondary structure. We applied this pipeline first to 95 RNA molecules with known structures by constraining only the secondary structure. Finally, we applied the pipeline to six predicted secondary structures for three experimentally observed functional states of a single aptamer primary sequence and observed the effects of the various secondary structures on the possible 3D structures.
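As an illustration of what "constraining only the secondary structure" could involve, here is a minimal sketch that turns a secondary structure into a list of base-pair constraints. It assumes the structure is given in dot-bracket notation, which the abstract does not state, and the function name `base_pairs` is hypothetical; it is not part of the pipeline described above.

```python
from typing import List, Tuple


def base_pairs(dot_bracket: str) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) index pairs for matched parentheses.

    '.' marks an unpaired residue; '(' and ')' mark paired residues.
    Pseudoknot symbols such as '[' and ']' are not handled in this sketch.
    """
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for idx, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(idx)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {idx}")
            pairs.append((stack.pop(), idx))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {idx}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical six-residue hairpin: residues 0-5 and 1-4 are paired.
hairpin_pairs = base_pairs("((..))")
```

Each pair could then be passed to a modeling tool as a restraint between the two corresponding coarse-grained points.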

 
School: STANFORD UNIVERSITY
Source: DAI/B 70-10, p. , Dec 2009
Source Type: Dissertation
Subjects: Biomedical engineering; Bioinformatics
Publication Number: 3382755
