This dissertation develops methods that efficiently estimate the covariance matrix of longitudinal data without making restrictive parametric assumptions, even when the dimension is large relative to the sample size. High dimensionality is an important issue because the number of free elements in the covariance matrix grows quadratically with its dimension. Charles Stein pointed out in the 1950s that the unrestricted maximum likelihood estimator of the covariance matrix is statistically inefficient. To gain statistical efficiency and overcome the high dimensionality problem in covariance matrix estimation, an effective regularization scheme is needed. In this dissertation we design such schemes using the regression formulation of covariance matrix estimation advocated by Pourahmadi (1999, 2000).
We propose penalized likelihood methods for producing statistically efficient estimators of a covariance matrix for longitudinal data. These approaches parameterize the covariance matrix through the modified Cholesky decomposition. For longitudinal data, the entries of the lower triangular matrix and the diagonal matrix associated with the modified Cholesky decomposition can be interpreted, respectively, as regression coefficients and prediction error variances when each measurement is regressed on its predecessors. Many covariance matrix estimation methods have been developed based on two observations: first, there is usually some kind of continuity among neighboring elements of the lower triangular matrix; second, the lower triangular matrix is likely to have many off-diagonal elements that are zero or close to zero.
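The regression interpretation above can be made concrete in a short numerical sketch (the function name and the AR(1) example covariance are illustrative, not taken from the dissertation): the modified Cholesky decomposition writes T Σ T' = D, where row t of the unit lower triangular matrix T holds the negatives of the coefficients from regressing measurement t on its predecessors, and the diagonal matrix D collects the corresponding prediction error variances.

```python
import numpy as np

def modified_cholesky(sigma):
    """Decompose sigma so that T @ sigma @ T.T = D (diagonal).

    Row t of T has 1 on the diagonal and minus the coefficients of
    regressing measurement t on measurements 0..t-1; d[t] is the
    prediction error variance of that regression.
    """
    m = sigma.shape[0]
    T = np.eye(m)
    d = np.empty(m)
    d[0] = sigma[0, 0]
    for t in range(1, m):
        # regression coefficients of measurement t on its predecessors
        phi = np.linalg.solve(sigma[:t, :t], sigma[:t, t])
        T[t, :t] = -phi
        # prediction error variance
        d[t] = sigma[t, t] - sigma[:t, t] @ phi
    return T, np.diag(d)

# Illustrative AR(1)-type covariance: sigma[i, j] = rho ** |i - j|
rho = 0.5
sigma = rho ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
T, D = modified_cholesky(sigma)
```

For this AR(1) example only the first subdiagonal of T is nonzero (each entry equals -rho), reflecting one-step serial dependence; the prediction error variances after the first are 1 - rho².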
Wu and Pourahmadi (2003) and Huang, Liu, and Liu (2007) proposed to smooth the long subdiagonals of the lower triangular matrix using local polynomial or spline techniques while truncating the short subdiagonals to zero. Their approaches are rather restrictive and cannot easily be combined with shrinkage. Huang et al. (2006) proposed to shrink the elements of the lower triangular matrix by imposing an L1 or L2 penalty in a penalized likelihood framework, but their method completely ignores possible smoothness in the modified Cholesky factor.
Our first new approach relaxes these restrictions and employs a roughness penalty to impose smoothness on the rows or subdiagonals of the lower triangular matrix. The use of roughness penalties has been well studied in nonparametric function estimation, and the second-order roughness penalty can be viewed as an approximation of the integrated squared second derivative penalty. This new smoothing method can be easily combined with shrinkage.
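As a concrete illustration of the second-order roughness penalty just mentioned (a sketch, not the dissertation's exact implementation), the penalty applied to a row or subdiagonal of coefficients is the sum of squared second differences, a discrete approximation to the integrated squared second derivative; it vanishes exactly for linear sequences:

```python
import numpy as np

def second_difference_penalty(phi):
    """Second-order roughness penalty on a coefficient sequence phi:
    the sum of squared second differences, approximating the
    integrated squared second derivative of a smooth function."""
    d2 = phi[2:] - 2.0 * phi[1:-1] + phi[:-2]
    return float(np.sum(d2 ** 2))

# A linear sequence is "perfectly smooth": zero penalty.
linear_penalty = second_difference_penalty(np.arange(5.0))
# A curved (quadratic) sequence is penalized.
quadratic_penalty = second_difference_penalty(np.arange(5.0) ** 2)
```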
Our second new approach combines smoothing with shrinkage using penalized likelihood. The method is general enough to include any combination of a shrinkage penalty (no penalty, the L1 penalty, or the L2 penalty) and a roughness penalty (no smoothing, smoothing along the rows of the lower triangular matrix, or smoothing along its subdiagonals), while the best combination can be chosen by a data-driven method. It turns out that combining shrinkage and smoothing performs better, and sometimes much better, than using shrinkage or smoothing alone. Our simulation study demonstrates the superior performance of this new method.
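Schematically, such a combined penalized likelihood criterion can be written as follows (the notation here, with $\lambda_1$, $\lambda_2$, and the second-difference operator $\Delta^2$, is only illustrative of the general form, where $\ell$ denotes the log-likelihood and $\phi_{tj}$ the below-diagonal entries of the Cholesky factor):
\[
\min_{\phi}\; -2\,\ell(\phi) \;+\; \lambda_1 \sum_{t>j} \lvert \phi_{tj} \rvert \;+\; \lambda_2 \sum \bigl(\Delta^2 \phi\bigr)^2 ,
\]
where setting $\lambda_1 = 0$ or $\lambda_2 = 0$ recovers pure smoothing or pure shrinkage, and both tuning parameters can be selected by a data-driven method such as cross-validation.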
Most existing work on covariance matrix estimation applies directly only to balanced longitudinal data. In this dissertation we also consider the unbalanced case by treating it as a missing data problem, so that the EM algorithm can be readily applied to our proposed approach.
The practical value of our methods is demonstrated through their application to efficient estimation of regression mean parameters for longitudinal data. It is well known that the regression mean parameters can be consistently estimated even if the covariance structure is misspecified (Liang and Zeger, 1986). However, incorrect specification of the covariance structure can lead to loss of efficiency in estimating the mean parameters. Our simulation results indicate that, with the covariance matrix estimated by our methods, the efficiency of the mean parameter estimate is very close to that of the estimate employing the true covariance structure, which is typically unknown in practice. We also illustrate the proposed methods by applying them to real examples. The first is the well-studied cattle experiment data (Kenward, 1987), in which an 11 × 11 covariance matrix is estimated. In the second example, a 102 × 102 covariance matrix is estimated and used to forecast the call arrival pattern at a telephone call center; applying the proposed covariance matrix estimator helps improve forecasting performance. The third example concerns estimation of a 7 × 7 covariance matrix of attachment loss in diabetic children in a periodontal study.
Key words: Covariance matrix, longitudinal data, modified Cholesky decomposition, shrinkage estimation, smoothing, cross-validation, GEE, missing data, EM algorithm