Abstract:
Two essays explore problems in which delay modeling is important. I develop Edgeworth expansions for total delays assumed to be sums of exponentially distributed random variables. These constituents of total delay may be non-identically distributed and correlated. I apply the expansions to two problems: total delay in a data path and mean time to deletion from a financial index. For both problems, my expansions more accurately characterize the distribution of the sum or average than typical normal-based Edgeworth expansions. I then examine the problem of classifying stock trades as buyer- or seller-initiated. I propose estimated quotes for midpoint and bid/ask tests and a modeling approach to classification. Prevailing quotes are estimated using approximations to quote delay distributions; the approximations are based on expansions developed in the first essay. Classification is done by a generalized linear model which includes improved versions of midpoint, tick, and bid/ask tests. The model also considers the strengths of these tests, can account for market microstructure peculiarities, and allows for autocorrelations and cross-correlations in trade direction. The correlation modeling corrects for pseudoreplication, yielding more accurate standard errors and fixed effect estimates. Further, the model estimates probabilities of correct classification. The model is compared to various trade classification methods using a sample of 2,836 domestic US stocks from an unexplored, recent, and readily available dataset. Out of sample, modeled classifications are 1-2% more accurate overall than current methods; this improvement is consistent across dates, sectors, and locations relative to the inside quote. For Nasdaq and NYSE stocks, 1% and 1.3% of the improvement, respectively, come from using relative strengths of the various tests; 0.9% and 0.7% of the improvement, respectively, come from using some form of estimated quotes. For AMEX stocks, a 0.4% improvement is attributed to using a lagged version of the bid/ask test. I also find indications of short- and ultra-short-term alpha. Finally, my work hints at reviving and extending a subset of nearly forgotten time series models. Randomly-delayed autoregressive (RaDAR) models form a subclass of distributed lag models.