Computational studies of proteins and nucleic acids: On pKa calculations in RNA and the use of structure to improve sequence alignments
by Tang, Christopher Lung Kong, Ph.D., COLUMBIA UNIVERSITY, 2008, 127 pages; 3317658

Abstract:

Sequence profiles describe positions in a protein by the frequency with which each can be replaced by each of the twenty amino acids. This thesis describes the creation of sequence profiles, called HMAP (hybrid multi-dimensional alignment of profiles), that are based on structural alignments. They are called "hybrid" because they combine several kinds of information, including amino acid preferences and actual or predicted secondary structure. Profiles can be merged based on alignments derived from the structural superposition of related templates. In this thesis, these profiles are tested for their ability to detect structurally related proteins in comprehensive benchmarks based on: (1) the protein classification database SCOP, and (2) a quantitative measure of protein structural distance (PSD) to define true structural relationships. We find that the use of actual and predicted secondary structure greatly improves fold recognition and alignment quality, and that the level of alignment accuracy achieved compares favorably to some of the top-performing methods in blind structure prediction experiments such as CASP. Profiles merged based on structural alignments also perform well, suggesting that weak sequence signals may be captured by these profiles.
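
To make the notion of a sequence profile concrete, the following is a minimal sketch that computes per-position amino acid frequencies from a toy alignment. It is not the HMAP method: it ignores secondary structure, sequence weighting, and structural superposition, and the function name `column_frequencies`, the `pseudocount` parameter, and the toy sequences are assumptions made purely for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues


def column_frequencies(alignment, pseudocount=0.0):
    """Return one dict per alignment column mapping amino acid -> frequency.

    Gap characters (anything outside AMINO_ACIDS) are skipped, so each
    column is normalized over the residues actually observed in it.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        if total == 0:
            # Column contains only gaps and no pseudocount was requested.
            profile.append({aa: 0.0 for aa in AMINO_ACIDS})
        else:
            profile.append(
                {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
            )
    return profile


# Hypothetical four-sequence alignment, used only as an example.
toy_alignment = ["ACDE", "ACDF", "GCD-", "ACEE"]
profile = column_frequencies(toy_alignment)
```

Each entry of `profile` is the frequency vector for one alignment column. A hybrid profile in the sense described above would carry additional per-position information, such as secondary structure, alongside these frequencies.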

Methods based on the Poisson-Boltzmann equation have been quite successful in calculating amino acid pKa shifts in proteins. Solutions of the Poisson-Boltzmann equation describe the electrostatic potential around charged and polar molecules immersed in an aqueous solvent with mobile counterions. The focus of the research presented here was to establish whether methods based on this equation could be successfully applied to RNA molecules. In particular, we started with MCCE, a method for calculating pKa shifts in proteins, and added the capability to apply it to RNA molecules. This work has required addressing several issues: (1) the nonlinear behavior of electrostatic potentials in the presence of highly charged molecules, for example in RNA, has not usually been treated in protein pKa calculations, (2) the lack of atomic charge parameters for protonated nucleotides has required their development and testing, and (3) a set of experimental results from the literature was needed to test the validity of the calculations. The work presented in this thesis now shows that the method: (1) can reproduce pKa shifts that have been measured experimentally, though some appear to be overestimated, (2) reproduces changes in the pKa shifts due to changes in salt concentration, (3) predicts the locations of nucleotides that may be functionally important due to their unusual pKa shifts, and (4) can help explain how structural features contribute to the pKa shifts of nucleotides in ribozymes and other RNAs. Applied to several RNAs including the HDV and hairpin ribozymes, the results support the idea that RNA structure can stabilize protonated nucleotides where they may be critical for structure or function. The structural features that help stabilize these shifts are also discussed. (Abstract shortened by UMI.)
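
As a rough illustration of what a pKa shift means in practice, the sketch below uses two textbook relations rather than the MCCE or Poisson-Boltzmann machinery of the thesis: the standard free energy of deprotonation is RT ln(10) times the pKa, and the protonated fraction at a given pH follows the Henderson-Hasselbalch equation. The function names, the reference pKa of 4.0, and the 4.0 kcal/mol free energy change are hypothetical values chosen only for the example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pka_shift(ddg_deprot_kcal, temperature=298.15):
    """Convert a change in deprotonation free energy (kcal/mol) to a pKa shift.

    Uses dG = RT * ln(10) * pKa, so a positive ddG (deprotonation made
    harder by the environment) gives a positive, upward pKa shift.
    """
    return ddg_deprot_kcal / (R_KCAL * temperature * math.log(10))


def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch fraction of a titratable site that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical numbers for illustration only, not results from the thesis.
reference_pka = 4.0        # assumed pKa of the isolated site in solution
ddg_environment = 4.0      # assumed stabilization of the protonated form, kcal/mol

shifted_pka = reference_pka + pka_shift(ddg_environment)
before = fraction_protonated(7.0, reference_pka)
after = fraction_protonated(7.0, shifted_pka)
```

Under these assumed inputs, the site goes from almost entirely deprotonated at pH 7 to predominantly protonated, which is the kind of effect meant when a structure is said to stabilize a protonated nucleotide.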

 
Adviser: Barry Honig
School: COLUMBIA UNIVERSITY
Source: DAI/B 69-05, p. , Aug 2008
Source Type: Dissertation
Subjects: Bioinformatics
Publication Number: 3317658
Access the complete dissertation:
 

» This is an open access dissertation.
  Use the link below to access the full text PDF of this graduate work:
  http://gradworks.umi.com/3317658.pdf
