A Study of Several Statistical Problems for Longitudinal Data with a Limited Response Variable
Abstract
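Longitudinal data are widely used and studied in economics, sociology, biology, medicine, and other fields. In practice, however, measured values are often constrained by the measuring instrument or the measurement mechanism. For example, when the response variable is subject to a lower measurement limit, only responses greater than that limit can be observed. We call such data longitudinal data with a limited response variable. The limited response variable ("Tobit") model is an effective tool for studying data of this kind. For this model, the work in this paper covers a random weighting approximation method and a composite quantile estimation method for the parameters. In addition, multiplicative models play an important role in the analysis of data with nonnegative responses. This paper therefore also studies relative error estimation of the parameters in a multiplicative model for longitudinal data with a limited response variable.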
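     First, this paper considers random weighting approximation methods for parameter estimation and hypothesis testing in the Tobit model for longitudinal data. In the analysis of this model, the asymptotic variances of the statistics generally involve nuisance parameters, in particular the correlation structure of the longitudinal data and the density function of the errors. Statistical inference on the model therefore requires estimates of these nuisance parameters. In fact, when the number of observations per subject is limited, efficient estimates of the nuisance parameters are hard or impossible to obtain. Using the random weighting method, this paper constructs a weighted estimator of the parameters and a weighted M-test statistic for local linear hypotheses. Under certain regularity conditions, we prove that the conditional asymptotic distributions of the weighted statistics coincide with the asymptotic distributions of the original statistics. Inference on the parameters can therefore be carried out directly by random weighting, with no need to estimate the nuisance parameters. Simulation results and a real data analysis show that the proposed random weighting approximation is feasible.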
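     Second, we propose a composite quantile estimation method for the regression parameters in the Tobit model for longitudinal data. It is well known that choosing the quantile level for a quantile regression is difficult in practice. This paper therefore combines estimating equations at different quantiles to build a robust estimator of the regression parameters (excluding the intercept). Under certain conditions, we prove the asymptotic normality of the proposed composite quantile estimator. Simulation results and a real data analysis show that composite quantile regression is generally more efficient than regression at a single quantile.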
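     Finally, this paper presents a multiplicative model for longitudinal data with nonnegative responses and proposes the least product relative error (LPRE) estimator of its regression parameters. In regression analysis the most widely used estimation criteria are least squares and least absolute deviations, but in many practical situations, for example when the observed variables are measured on different scales, relative errors need to be considered. This paper therefore gives an LPRE estimator of the parameters based on relative errors. Because the LPRE criterion function is smooth and convex, the asymptotic properties of the proposed estimator are easy to establish. The asymptotic distribution of the estimator, however, involves nuisance parameters that are difficult to estimate, in particular the correlation structure of the longitudinal data. We therefore use the empirical likelihood approach to construct confidence regions for the regression parameters, which has the advantage of requiring no estimates of the nuisance parameters. Numerical simulations assess the performance of the proposed method.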
Longitudinal data are widely used in many fields such as economics, sociology, biology and medicine. However, owing to the limited accuracy of measurement tools or mechanisms, the observations are often subject to constraints. For example, when the response variable is subject to a lower detection limit, only observations with values no less than the limit can be observed. In this paper, data of this kind are called longitudinal data with a limited dependent variable. The limited dependent variable ("Tobit") regression model is a powerful tool for studying such data. Based on this regression model, we propose a random weighting approximation method and a composite quantile regression estimator of the parameters. In addition, the multiplicative regression model is useful for analyzing data with positive responses. Following the relative error criterion, we develop a least product relative error (LPRE) estimator for the multiplicative regression model with longitudinal data.
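For orientation, the left-censoring mechanism described above is usually written as follows. This is a sketch under assumed notation, none of which appears in the abstract: $y_{ij}$ is the recorded response of subject $i$ at visit $j$, $x_{ij}$ the covariate vector, $\beta$ the regression parameter, $\varepsilon_{ij}$ the error (correlated within a subject), $c$ the detection limit, and $m_i$ the number of visits of subject $i$.

```latex
y_{ij} = \max\{\, c,\; x_{ij}^{\top}\beta + \varepsilon_{ij} \,\},
\qquad i = 1, \dots, n, \quad j = 1, \dots, m_i .
```

Taking $c = 0$ gives the classical Tobit form; the exact specification used in the thesis may differ.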
     First, this paper proposes random weighting approximation methods to estimate the distributions of parameter estimation statistics in the Tobit regression model with longitudinal data. It is well known that the asymptotic distributions of these statistics involve nuisance parameters, such as the correlation structure and the density function of the error terms in longitudinal data. These nuisance parameters are difficult to estimate well, especially with small sample sizes. To overcome this difficulty, we employ the random weighting method to construct a weighted estimator of the parameters and a weighted M-test statistic. Under certain regularity conditions, we prove that the limiting distributions of the proposed weighted statistics, conditional on the observations, are the same as the asymptotic distributions of the original statistics. Therefore, the distributions of the parameter estimator and the test statistic can be estimated directly by the random weighting method without estimating the nuisance parameters. Simulation studies and a real data example show that the proposed random weighting methods are practicable.
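The random weighting idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: a single covariate with no intercept, a censored least absolute deviations objective, subject-level weights drawn from Exp(1) (mean 1, variance 1), a grid search in place of a proper optimizer, and a basic reflected percentile interval. The point it shows is that refitting under random weights yields an interval for the slope without estimating the error density or the within-subject correlation.

```python
import random

def weighted_loss(beta, subjects, weights):
    """Subject-weighted censored LAD objective:
    sum_i w_i * sum_j |y_ij - max(0, x_ij * beta)|."""
    return sum(
        w * sum(abs(y - max(0.0, x * beta)) for x, y in visits)
        for w, visits in zip(weights, subjects)
    )

def fit(subjects, weights, grid):
    """Grid-search minimizer (illustrative only; real code would use a solver)."""
    return min(grid, key=lambda b: weighted_loss(b, subjects, weights))

def simulate(n_subjects=40, n_visits=3, beta=1.0, seed=0):
    """Toy longitudinal data, left-censored at 0 (hypothetical design)."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        effect = rng.gauss(0.0, 0.5)  # shared term induces within-subject correlation
        visits = []
        for _ in range(n_visits):
            x = rng.uniform(-1.0, 3.0)
            latent = beta * x + effect + rng.gauss(0.0, 0.5)
            visits.append((x, max(0.0, latent)))  # values below 0 are recorded as 0
        subjects.append(visits)
    return subjects

def random_weighting(subjects, grid, n_draws=100, seed=1):
    """Return the unweighted estimate, a 95% interval, and the sorted weighted refits."""
    rng = random.Random(seed)
    beta_hat = fit(subjects, [1.0] * len(subjects), grid)
    draws = []
    for _ in range(n_draws):
        # One weight per subject, so a subject's visits are reweighted together.
        weights = [rng.expovariate(1.0) for _ in subjects]
        draws.append(fit(subjects, weights, grid))
    draws.sort()
    lo_q = draws[int(0.025 * (n_draws - 1))]
    hi_q = draws[int(0.975 * (n_draws - 1))]
    # The spread of (draw - beta_hat) stands in for that of (beta_hat - beta).
    return beta_hat, (2 * beta_hat - hi_q, 2 * beta_hat - lo_q), draws

if __name__ == "__main__":
    grid = [0.5 + 0.01 * k for k in range(101)]
    estimate, interval, _ = random_weighting(simulate(), grid)
    print(estimate, interval)
```

Weighting whole subjects rather than single observations is what lets the procedure sidestep the correlation structure: the dependence inside each subject is carried along unchanged in every refit.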
     Second, we present a composite quantile estimator of the regression parameter for the Tobit regression model with longitudinal data. It is well known that it is difficult to decide which quantile should be chosen to estimate the regression parameter. Hence, we propose a robust estimator that combines information across a fixed number of quantiles. Under certain conditions, we obtain the asymptotic properties of the proposed composite quantile regression estimator. Simulation studies and a real data analysis indicate that the proposed composite quantile regression estimator works well.
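A minimal sketch of combining quantiles follows. It uses the summed check-loss form of composite quantile regression adapted to censoring at 0, which may differ from the estimating-equation construction in the thesis. The single covariate, the quantile levels, the grids, and the treatment of all observations as independent for point estimation are assumptions made for illustration. Each quantile level keeps its own intercept while the slope is shared, which is why the intercept is excluded from the combined estimate.

```python
import random

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def profile_loss(beta, tau, data, intercept_grid):
    """Smallest censored check loss over the intercept grid, for a fixed slope."""
    return min(
        sum(check_loss(y - max(0.0, b + x * beta), tau) for x, y in data)
        for b in intercept_grid
    )

def fit_composite(data, taus, slope_grid, intercept_grid):
    """Common slope minimizing the loss summed over quantile levels.

    Given the slope, the objective separates across quantile levels,
    so each intercept can be profiled out on its own.
    """
    return min(
        slope_grid,
        key=lambda beta: sum(
            profile_loss(beta, tau, data, intercept_grid) for tau in taus
        ),
    )

def simulate(n=150, intercept=0.5, slope=1.0, seed=0):
    """Toy data left-censored at 0 (hypothetical design)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        latent = intercept + slope * x + rng.gauss(0.0, 0.5)
        data.append((x, max(0.0, latent)))
    return data

if __name__ == "__main__":
    slope_grid = [0.5 + 0.025 * k for k in range(41)]
    intercept_grid = [-0.5 + 0.05 * k for k in range(41)]
    print(fit_composite(simulate(), [0.25, 0.5, 0.75], slope_grid, intercept_grid))
```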
     Finally, a multiplicative regression model with longitudinal data is introduced, and an LPRE estimator is constructed based on relative errors. Least squares and least absolute deviations, both based on absolute errors, are the most widely used criteria in regression analysis. However, when response variables have different measurement scales, relative errors may be preferable to absolute errors. Hence, we develop an LPRE estimator of the parameters based on relative errors and derive its asymptotic properties, which involve nuisance parameters such as the correlation structure of the error terms. For inference based on the LPRE estimator, we employ the empirical likelihood technique to construct confidence regions for the regression parameters without estimating the nuisance parameters. Simulation results confirm that the proposed methods perform well.
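To make the LPRE criterion concrete: for a multiplicative model y = exp(x'β)·ε with y > 0, the product of the two relative errors |y − exp(x'β)|/y and |y − exp(x'β)|/exp(x'β) equals y·exp(−x'β) + exp(x'β)/y − 2, which is smooth and convex in β. The sketch below minimizes the sum of these terms by Newton's method with step halving. The intercept-plus-one-covariate design, the simulated log-symmetric errors, and the treatment of observations as independent are assumptions made for illustration; the empirical likelihood step is not shown.

```python
import math
import random

def linear_predictor(beta, x):
    return beta[0] + beta[1] * x

def lpre_loss(beta, data):
    """LPRE criterion: sum_i { y_i * exp(-eta_i) + exp(eta_i) / y_i - 2 }."""
    total = 0.0
    for x, y in data:
        eta = linear_predictor(beta, x)
        total += y * math.exp(-eta) + math.exp(eta) / y - 2.0
    return total

def newton_direction(beta, data):
    """Return the Newton direction H^{-1} g and the decrement g' H^{-1} g."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        eta = linear_predictor(beta, x)
        down = y * math.exp(-eta)
        up = math.exp(eta) / y
        slope, curv = up - down, up + down  # 1st and 2nd derivative in eta
        g0 += slope
        g1 += slope * x
        h00 += curv
        h01 += curv * x
        h11 += curv * x * x
    det = h00 * h11 - h01 * h01  # positive unless all x are equal
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return (d0, d1), g0 * d0 + g1 * d1

def fit_lpre(data, tol=1e-12, max_iter=100):
    """Minimize the LPRE criterion over (intercept, slope)."""
    beta = (0.0, 0.0)
    for _ in range(max_iter):
        (d0, d1), decrement = newton_direction(beta, data)
        if decrement < tol:
            break
        step, current = 1.0, lpre_loss(beta, data)
        while step > 1e-8:  # halve the step until the loss does not increase
            trial = (beta[0] - step * d0, beta[1] - step * d1)
            if lpre_loss(trial, data) <= current:
                break
            step /= 2.0
        beta = trial
    return beta

def simulate(n=300, beta=(1.0, 0.5), sigma=0.3, seed=0):
    """Toy positive responses (hypothetical design)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        error = math.exp(rng.gauss(0.0, sigma))  # log-symmetric: E[error - 1/error] = 0
        data.append((x, math.exp(linear_predictor(beta, x)) * error))
    return data

if __name__ == "__main__":
    print(fit_lpre(simulate()))
```

Because the criterion is symmetric in y and exp(x'β), over-prediction and under-prediction by the same factor are penalized equally, which is the property that makes it scale-free.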
