Mathematical optimization and algorithmic development for protein structure prediction
by McAllister, Scott Ryan, Ph.D., PRINCETON UNIVERSITY, 2008, 460 pages; 3299838

Abstract:

The protein folding problem represents one of the most challenging and potentially rewarding problems in computational biology. This problem can be posed as, "Given a primary amino acid sequence of a protein, how does this protein fold into its active three-dimensional structure?" One approach to solving this problem is to align the sequence of a protein of unknown structure with that of a homologous protein whose structure has been determined experimentally. Two novel mixed-integer linear programming models and an integer linear programming model have been developed to rigorously address the global pairwise sequence alignment problem. The important components of these model formulations are (i) conservation constraints, (ii) a rank-ordered list of alignments, and (iii) pairwise interaction scores with optimality guarantees.
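For readers unfamiliar with global pairwise sequence alignment, the following is a minimal sketch of the classical Needleman-Wunsch dynamic program. It is not one of the thesis's mixed-integer models, and it handles none of the conservation constraints, ranked alignment lists, or pairwise interaction scores listed above. The function name and the match, mismatch, and gap scores are illustrative assumptions.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Classical Needleman-Wunsch global alignment (illustrative baseline only).

    Returns (score, aligned_a, aligned_b). The scoring values are assumed
    defaults chosen for illustration, not values from the dissertation.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))
```

The dynamic program relies on the score being a sum of independent per-column terms. Scores that couple pairs of aligned positions break that assumption, which is one plausible reason to pose the problem as an integer program instead.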

The prediction of contacts between residues within a protein helps reduce the conformational space that must be searched by structure prediction algorithms. Three modeling contributions in this thesis address the contact prediction problem for (i) globular, α-helical bundle proteins, (ii) membrane, α-helical bundle proteins, and (iii) proteins with α/β structure. All three of these problems are addressed using mixed-integer linear programming techniques and are validated on a variety of test proteins. The predicted low-distance contacts can provide additional distance restraints for first principles approaches to the tertiary structure prediction problem for both globular and membrane proteins.
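As a rough illustration of how predicted contacts can act as distance restraints, the sketch below checks a candidate structure against a list of predicted residue pairs. The function name, the one-point-per-residue coordinate layout, and the 8.0 Å upper bound are all assumptions made for this example; the abstract does not state how the thesis defines or thresholds a contact.

```python
import math


def violated_restraints(coords, contacts, upper_bound=8.0):
    """List predicted contacts that a candidate structure fails to satisfy.

    coords:      dict mapping residue index -> (x, y, z) in angstroms
                 (hypothetical layout, one representative atom per residue)
    contacts:    iterable of (i, j) residue index pairs predicted to be close
    upper_bound: assumed maximum allowed distance for a contact, in angstroms
    """
    violations = []
    for i, j in contacts:
        distance = math.dist(coords[i], coords[j])
        if distance > upper_bound:
            violations.append((i, j, distance))
    return violations
```

A structure prediction search could discard or penalize candidates for which this list is non-empty, which is the sense in which contact predictions shrink the space to be explored.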

The reduction of the conformational space of a protein was further explored in an investigation of general dihedral angle and distance bounding strategies. The protein tertiary structure prediction problem is then formulated as the minimization of an atomistic-level force field subject to constraints from contact predictions and other general bounding strategies. This problem was addressed by the development of a hybrid optimization algorithm that combines (i) the αBB deterministic optimization approach, (ii) the conformational space annealing algorithm, (iii) torsion angle dynamics methods, (iv) rotamer optimization algorithms, (v) sequential quadratic programming methods, and (vi) a parallel implementation. This hybrid algorithm was tested and validated with (i) test proteins from the literature, (ii) α-helical bundle proteins with contact predictions, (iii) blind protein structure predictions, and (iv) NMR structure prediction and refinement examples.
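The constrained minimization described above can be written generically as follows. This is a sketch of the problem class only: the symbols are assumptions introduced here, with θ denoting the vector of dihedral angles, E the force field energy, d_ij the distance between residues i and j, and C the set of residue pairs carrying distance bounds. The abstract does not give the thesis's exact formulation.

```latex
\begin{aligned}
\min_{\theta} \quad & E(\theta) \\
\text{s.t.} \quad & d_{ij}^{L} \le d_{ij}(\theta) \le d_{ij}^{U}, && (i,j) \in \mathcal{C}, \\
& \theta_{k}^{L} \le \theta_{k} \le \theta_{k}^{U}, && k = 1, \dots, N_{\theta}.
\end{aligned}
```

The first constraint family corresponds to restraints from contact predictions and distance bounding, and the second to dihedral angle bounding.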

 
Advisor:
School: PRINCETON UNIVERSITY
Source: DAI/B 69-01, p. , Apr 2008
Source Type: Dissertation
Subjects: Chemical engineering; Bioinformatics; Biophysics
Publication Number: 3299838
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3299838
