Böhm, Hilmar
[UCL]
In spectral analysis of high dimensional multivariate time series, it is crucial to obtain an estimate of the spectrum that is both numerically well conditioned and precise. The conventional approach is to construct a nonparametric estimator by smoothing locally over the periodogram matrices at neighboring Fourier frequencies. Despite being consistent and asymptotically unbiased, these estimators are often ill-conditioned. This is because a kernel smoothed periodogram is a weighted sum over the local neighborhood of periodogram matrices, which are each of rank one. When treating high dimensional time series, the result is a bad ratio between the smoothing span, which is the effective local sample size of the estimator, and dimension.
In classification, clustering and discrimination, and in the analysis of non-stationary time series, this is a severe problem, because inverting an estimate of the spectrum is unavoidable in these contexts. Areas of application like neuropsychology, seismology and econometrics are affected by this theoretical problem.
We propose a new class of nonparametric estimators that have the appealing properties of simultaneously having smaller L2-risk than the smoothed periodogram and being numerically more stable due to a smaller condition number. These estimators are obtained as convex combinations of the averaged periodogram and a shrinkage target. The choice of shrinkage target depends on the availability of prior knowledge on the cross dimensional structure of the data. In the absence of any information, we show that a multiple of the identity matrix is the best choice. By shrinking towards identity, we trade the asymptotic unbiasedness of the averaged periodogram for a smaller mean-squared error. Moreover, the eigenvalues of this shrinkage estimator are closer to the eigenvalues of the real spectrum, rendering it numerically more stable and thus more appropriate for use in classification. These results are derived under a rigorous general asymptotic framework that allows for the dimension p to grow with the length of the time series T. Under this framework, the averaged periodogram even ceases to be consistent and has asymptotically almost surely higher L2-risk than our shrinkage estimator.
Moreover, we show that it is possible to incorporate background knowledge on the cross dimensional structure of the data in the shrinkage targets. We derive an exemplary instance of a custom-tailored shrinkage target in the form of a one factor model. This offers a new answer to problems of model choice: instead of relying on information criteria such as AIC or BIC for choosing the order of a model, the minimum order model can be used as a shrinkage target and combined with a non-parametric estimator of the spectrum, in our case the averaged periodogram.
Comprehensive Monte Carlo studies we perform show the overwhelming gain in terms of L2-risk of our shrinkage estimators, even for very small sample size. We also give an overview of regularization techniques that have been designed for iid data, such as ridge regression or sparse pca, and show the interconnections between them.


Bibliographic reference |
Böhm, Hilmar. Shrinkage methods for multivariate spectral analysis. Prom. : von Sachs, Rainer |
Permanent URL |
http://hdl.handle.net/2078.1/6226 |