High performance image analysis for large histological datasets
by Cooper, Lee, Ph.D., THE OHIO STATE UNIVERSITY, 2009, 250 pages; 3377653

Abstract:

The convergence of emerging challenges in biological research and developments in imaging and computing technologies suggests that image analysis will play an important role in providing a better understanding of biological phenomena. The ability of imaging to localize molecular information is a key capability in the post-genomic era and will be critical in discovering the roles of genes and the relationships that connect them. The scale of the data in these emerging challenges is daunting; high-throughput microscopy can generate hundreds of gigabytes to terabytes of high-resolution imagery even for studies limited in scope to a single gene or interaction. In addition to the scale of the data, the analysis of microscopic image content presents significant problems for the state of the art in image analysis.

This dissertation addresses two significant problems in the analysis of large histological images: reconstruction and tissue segmentation. The proposed methods form a framework that is intended to provide researchers with tools to explore and quantitatively analyze large image datasets.

The work on reconstruction addresses several problems in the reconstruction of tissue from sequences of serial sections using image registration. A scalable algorithm for nonrigid registration is presented that features a novel method for matching small, nondescript anatomical features using geometric reasoning. Methods for the nonrigid registration of images with different stains are presented for two application scenarios. Correlation sharpness is proposed as a new measure of image similarity and is used to map tumor suppressor gene expression to structure in mouse mammary tissues. An extended process of geometric reasoning based on the matching of cliques of anatomical features is presented and demonstrated for the nonrigid registration of immunohistochemical stain to hematoxylin and eosin stain for human cancer images. Finally, a method for incorporating structural constraints into the reconstruction process is proposed and demonstrated on the reconstruction of ducts in mammary tissues.

The work on tissue segmentation focuses on the use of statistical geometrical methods to describe the spatial distributions of biologically meaningful elements, such as nuclei, in tissue. The two-point correlation function is demonstrated to be an effective feature for the segmentation of tissues, and is shown to possess a peculiar low-dimensional distribution in feature space that permits unsupervised segmentation by robust methods. The relationship between two-point functions for proximal image regions is derived and used to accelerate computation, resulting in a 7-68x improvement over a naive FFT-based implementation.
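For readers unfamiliar with the two-point correlation function, the following is a minimal sketch of the kind of naive FFT-based computation the abstract uses as its baseline. It is not the dissertation's accelerated method. It assumes a binary mask (for example, nuclei pixels set to 1), periodic image boundaries, and NumPy; the function name `two_point_correlation` is hypothetical.

```python
import numpy as np

def two_point_correlation(mask):
    """Periodic two-point correlation of a binary mask, computed via FFT.

    Entry [dy, dx] estimates the probability that a pixel and the pixel
    displaced from it by (dy, dx) both belong to the mask, with the image
    treated as wrapping around at its borders (an assumption of this sketch).
    """
    m = np.asarray(mask, dtype=np.float64)
    spectrum = np.fft.fft2(m)
    # Autocorrelation theorem: the inverse FFT of the power spectrum gives
    # the circular autocorrelation of the mask.
    autocorr = np.fft.ifft2(spectrum * np.conj(spectrum)).real
    return autocorr / m.size

# Toy example: two "nuclei" pixels in the same row, two columns apart.
mask = np.zeros((8, 8))
mask[2, 3] = 1
mask[2, 5] = 1
s2 = two_point_correlation(mask)
```

At zero displacement the result equals the fraction of mask pixels, and the peak at a horizontal displacement of two pixels reflects the single pair of marked pixels with that separation.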

In addition to the methods proposed for reconstruction and segmentation, a significant portion of this dissertation is devoted to applying high performance computing to enable the analysis of large datasets. In particular, multi-node parallelization, multi-core processing, and general-purpose computing on graphics processing units are combined to form a heterogeneous multiprocessor platform, which is used to demonstrate the segmentation and reconstruction methods on images up to 62K × 23K in size.
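As a rough illustration of why images of this size are usually processed in pieces, the sketch below partitions a large image into overlapping tiles that could be handed to separate workers. The abstract does not describe the dissertation's actual decomposition, so everything here is an assumption: the function name `tile_windows`, the tile size of 4096 pixels, the overlap of 256 pixels, and the reading of "62K × 23K" as 62,000 × 23,000 pixels.

```python
def tile_windows(width, height, tile, overlap):
    """Yield (x0, y0, x1, y1) windows that together cover the whole image.

    Neighbouring windows share `overlap` pixels so that features on a tile
    border are fully visible in at least one window. Windows at the right
    and bottom edges are clipped to the image bounds.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    step = tile - overlap
    # Stop once a window reaches the image edge, so no window is redundant.
    xs = range(0, max(width - overlap, 1), step)
    ys = range(0, max(height - overlap, 1), step)
    for y0 in ys:
        for x0 in xs:
            yield (x0, y0, min(x0 + tile, width), min(y0 + tile, height))

# Hypothetical parameters for an image of the size quoted in the abstract.
windows = list(tile_windows(62000, 23000, tile=4096, overlap=256))
```

Each window can be read, processed, and written back independently, which is what makes multi-node or GPU distribution straightforward for tile-local operations.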

 
Advisers: Bradley Clymer; Kun Huang
School: THE OHIO STATE UNIVERSITY
Source: DAI/B 70-10, p. , Jan 2010
Source Type: Dissertation
Subjects: Electrical engineering; Computer science
Publication Number: 3377653
  Use the link below to access a full citation record of this graduate work:
  http://gateway.proquest.com/openurl%3furl_ver=Z39.88-2004%26res_dat=xri:pqdiss%26rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation%26rft_dat=xri:pqdiss:3377653