Computational annotation of eukaryotic gene structures: Algorithms development and software systems

by Sparks, Michael Edward, Ph.D., IOWA STATE UNIVERSITY, 2007, 137 pages; 3289422


An important foundation for the advancement of both basic and applied biological science is correct annotation of protein-coding gene repertoires in model organisms. Accurate automated annotation of eukaryotic gene structures remains a challenging, open-ended and critical problem for modern computational biology.

The use of extrinsic (homology) information has proven quite successful for this task, though it is not a perfect solution, for a variety of reasons. More recently, gene prediction methods that leverage information present in syntenic genomic sequences have gained favor, though these, too, have limitations.

Identifying genes by inspection of genomic sequence alone thoroughly tests our theoretical understanding of the gene recognition process as it occurs in vivo, and where we encounter failure, excellent opportunities for meaningful research are revealed.

Therefore, the continued development of methods that do not rely on homology information (the so-called ab initio gene prediction methods) should help us achieve a comprehensive understanding of gene content more rapidly, at least in our model organisms.

This thesis explores the development of novel algorithms in an attempt to advance the current state of the art in gene prediction, with particular emphasis on ab initio approaches.

The work has been conducted with an eye towards contributing open source, well-documented, and extensible software systems that implement the methods, and towards generating novel biological knowledge, particularly with respect to plant taxa.

Advisers: Volker Brendel; Jonathan F. Wendel
Source Type: Dissertation
Publication Number: 3289422
