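Abstract A new class of predictors based on the generalized Liu estimator is proposed for finite populations. Necessary and sufficient conditions are obtained under which the predictor based on the generalized Liu estimator is superior to the best linear unbiased predictor in the prediction mean squared error sense, and a numerical example further illustrates the theoretical results.
Key words finite populations; prediction mean squared error; best linear unbiased predictor; generalized Liu estimator
CLC number O 212.4 Document code A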
A Predictor Based on the Generalized Liu Estimator
in Finite Populations and Its Small-Sample Properties
HUANG Jiewu1,2
(1. College of Science, Guizhou University for Nationalities, Guiyang, Guizhou 550025;
2. College of Mathematics and Statistics, Chongqing University, Chongqing 401331)
Abstract This paper proposes a new predictor based on the generalized Liu estimator in finite populations. Necessary and sufficient conditions for the superiority of this predictor over the best linear unbiased predictor in the prediction mean squared error sense are derived. Furthermore, a numerical example is given to illustrate some of the theoretical results.
Key words finite populations; prediction mean squared error; best linear unbiased predictor; generalized Liu estimator
1 Introduction
The problem of prediction in finite populations has received considerable attention over the past several decades. Most of the existing literature works in the context of linear unbiased prediction, for example Henderson[1], Pereira and Rodrigues[2], Bolfarine et al.[3], Yu and He[4] and Rueda et al.[5]. However, when there is multicollinearity among the explanatory variables, the prediction mean squared error of the best linear unbiased predictor (BLUP) becomes inflated and the regression results are often unacceptable. Several prediction methods have been developed to improve on the BLUP in this case; one of them is biased prediction. Hoerl and Kennard[6] proposed the so-called ridge regression method, which is designed to be less affected by multicollinearity among the independent variables. Wang[7] proposed adaptive ridge-type predictors in finite populations.
Our primary aim in this paper is to introduce a new predictor based on the generalized Liu estimator[8] to combat multicollinearity. The proposed predictor is biased, and we discuss in some detail its superiority over the best linear unbiased predictor in the prediction mean squared error sense.
The remainder of this paper is organized as follows. In Section 2, we introduce the model specifications and some corresponding definitions and lemmas. Some properties of the predictor based on the generalized Liu estimator are discussed in Section 3. A numerical example is then provided in Section 4 to illustrate some of the theoretical results.
2 Preliminaries
Let $\Omega=\{1,2,\dots,N\}$ be the set of labels of the units of a finite population of size $N$, where $N$ is known. Associated with the $i$th unit of $\Omega$ there are $p+1$ quantities $y_i, x_{i1},\dots,x_{ip}$, $i=1,2,\dots,N$, where all but $y_i$ are known. Denote $y=(y_1,y_2,\dots,y_N)'$ and $X=(X_1,X_2,\dots,X_N)'$, where $X_i=(x_{i1},x_{i2},\dots,x_{ip})'$, $i=1,2,\dots,N$.
We consider a linear model from Ω denoted by
$$y=X\beta+e,\quad E(e)=0,\quad \operatorname{cov}(e)=\sigma^2V, \tag{1}$$
where $e$ is an $N$-dimensional random vector of errors, $V$ is a known positive definite matrix, $\beta$ is a $p\times 1$ vector of unknown parameters, and $E(\cdot)$ and $\operatorname{cov}(\cdot)$ denote the expectation and the covariance matrix of a random vector.
Under the linear model (1), suppose that a sample of size $s$ is selected from $\Omega$ by some specified sampling design, and let $r=N-s$ be the number of remaining units, which are not in the sample. The objective is to predict the value of a linear function $\theta=l'y$ from the sample, where $l$ is a known vector. If we take $l=(1,\dots,1)'$, then $\theta=l'y$ is the population total. Listing the sampled units first, without loss of generality, we may partition $y$, $X$, $e$ and $V$ as follows:
經濟數學, Vol. 29, No. 1. HUANG Jiewu: A Predictor Based on the Generalized Liu Estimator in Finite Populations and Its Small-Sample Properties
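$$y=\begin{pmatrix}y_s\\ y_r\end{pmatrix},\quad X=\begin{pmatrix}X_s\\ X_r\end{pmatrix},\quad e=\begin{pmatrix}e_s\\ e_r\end{pmatrix},\quad V=\begin{pmatrix}V_s & V_{sr}\\ V_{rs} & V_r\end{pmatrix}.$$
The vector $l=(l_s',l_r')'$ is partitioned conformably.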
Definition 1 Let $\theta^*$ be a linear predictor of $\theta=l'y$. Then $\theta^*$ is said to be the best linear unbiased predictor (BLUP) under the finite population model (1) if
(i) $\theta^*$ is unbiased, i.e. $E(\theta^*-\theta)=0$;
(ii) $E(\theta^*-\theta)^2\le E(\tilde\theta-\theta)^2$ for any linear unbiased predictor $\tilde\theta$ of $\theta=l'y$.
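Lemma 1 Under the finite population model (1), let $V>0$ and $V_{rs}=V_{sr}'=0$. Then the BLUP of $\theta=l'y$ is given by
$$\hat\theta_U=l_s'y_s+l_r'X_r\hat\beta_{OLE}, \tag{2}$$
where $\hat\beta_{OLE}=(X_s'V_s^{-1}X_s)^{-1}X_s'V_s^{-1}y_s$ is the least squares estimator of $\beta$ based on the sample, weighted by $V_s^{-1}$ (the ordinary least squares estimator when $V_s=I$).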
Definition 2 Under the finite population model (1), let $V>0$ and $V_{rs}=V_{sr}'=0$. The predictor $\hat\theta_D$ of $\theta=l'y$ based on the generalized Liu estimator is defined as
$$\hat\theta_D=l_s'y_s+l_r'X_r\hat\beta_{LE}, \tag{3}$$
where $\hat\beta_{LE}=(X_s'V_s^{-1}X_s+I)^{-1}(X_s'V_s^{-1}X_s+D)\hat\beta_{OLE}$ is the generalized Liu estimator of $\beta$, with $D=\operatorname{diag}(d_1,d_2,\dots,d_p)$, $0<d_i<1$, $i=1,2,\dots,p$, being the Liu parameters[9].
When $D=\operatorname{diag}(d,d,\dots,d)$, $0<d<1$, we write $\hat\theta_d$ for $\hat\theta_D$ and call it the predictor based on the Liu estimator.
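The two predictors in (2) and (3) can be sketched numerically as follows. This is only an illustrative sketch: the function name `liu_predictors`, the synthetic data and the choice $V_s=I$ are assumptions made for the example and do not come from the paper, and $D$ is taken to be $p\times p$ so that it conforms with $X_s'V_s^{-1}X_s$.

```python
import numpy as np

def liu_predictors(X_s, y_s, X_r, V_s, l_s, l_r, d):
    """Return (theta_U, theta_D) for theta = l'y, assuming V_rs = V_sr' = 0.

    d may be a scalar (Liu predictor) or a length-p vector (generalized Liu).
    """
    V_inv = np.linalg.inv(V_s)
    S = X_s.T @ V_inv @ X_s                               # S = X_s' V_s^{-1} X_s
    beta_ole = np.linalg.solve(S, X_s.T @ V_inv @ y_s)    # estimator used in (2)
    p = S.shape[0]
    D = np.diag(np.broadcast_to(np.asarray(d, dtype=float), (p,)))
    beta_le = np.linalg.solve(S + np.eye(p), (S + D) @ beta_ole)  # estimator used in (3)
    theta_U = l_s @ y_s + l_r @ X_r @ beta_ole
    theta_D = l_s @ y_s + l_r @ X_r @ beta_le
    return theta_U, theta_D

# Hypothetical synthetic data: N = 13 units, s = 11 sampled, p = 3 regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=13)
X_s, X_r, y_s = X[:11], X[11:], y[:11]

theta_U, theta_D = liu_predictors(
    X_s, y_s, X_r, np.eye(11), np.ones(11), np.ones(2), d=0.5
)
print(theta_U, theta_D)
```

Setting `d=1.0` (the boundary case $D=I$, outside the range $0<d<1$) makes the two predictors coincide, which is a convenient sanity check on the implementation.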
To compare the predictors, we use the notion of prediction mean squared error (PMSE).
Definition 3 Under the finite population model (1), let $\hat\theta$ be a predictor of $\theta$. The prediction mean squared error of $\hat\theta$ is defined as
$$\mathrm{PMSE}(\hat\theta)=E(\hat\theta-\theta)'(\hat\theta-\theta). \tag{4}$$
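Lemma 2 Under the finite population model (1), for any predictor $\hat\theta=l_s'y_s+l_r'X_r\hat\beta$ of $\theta=l'y$, where $\hat\beta$ is an estimator of $\beta$ based on the sample, we have
$$\mathrm{PMSE}(\hat\theta)=l_r'X_r\,\mathrm{MMSE}(\hat\beta)\,X_r'l_r+\sigma^2l_r'V_rl_r,$$
where $\mathrm{MMSE}(\hat\beta)=E(\hat\beta-\beta)(\hat\beta-\beta)'$ is the mean squared error matrix of $\hat\beta$. See [7].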
Lemma 3 Let $M$ be a positive definite matrix, namely $M>0$, and let $\alpha$ be some vector. Then $M-\alpha\alpha'\ge 0$ if and only if $\alpha'M^{-1}\alpha\le 1$. See [10].
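Lemma 3 can be checked numerically on a small case. The sketch below is illustrative only: the helper name `lemma3_sides`, the matrix and the two vectors are assumptions chosen for the example, one on each side of the threshold.

```python
import numpy as np

def lemma3_sides(M, alpha):
    """Return (smallest eigenvalue of M - alpha alpha', alpha' M^{-1} alpha)."""
    min_eig = np.linalg.eigvalsh(M - np.outer(alpha, alpha)).min()
    quad = alpha @ np.linalg.solve(M, alpha)
    return min_eig, quad

M = np.diag([2.0, 1.0])                      # positive definite
for alpha in (np.array([1.0, 0.5]), np.array([1.0, 1.0])):
    min_eig, quad = lemma3_sides(M, alpha)
    # The two booleans agree: M - alpha alpha' >= 0 exactly when the quadratic form is <= 1.
    print(min_eig >= -1e-12, quad <= 1.0 + 1e-12)
```

For the first vector the quadratic form is 0.75 and the difference matrix is positive definite; for the second it is 1.5 and the difference matrix has a negative eigenvalue.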
3 Properties of the predictor based on the generalized Liu estimator
Theorem 1 Under the finite population model (1), let $V>0$ and $V_{rs}=V_{sr}'=0$. Then the predictor $\hat\theta_D$ of $\theta=l'y$ based on the generalized Liu estimator is biased.
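Proof: Write $S=X_s'V_s^{-1}X_s$. Since $E(y_s)=X_s\beta$, $E(y_r)=X_r\beta$ and $E(\hat\beta_{OLE})=\beta$, we have
$$E(\hat\theta_D-\theta)=l_s'X_s\beta+l_r'X_rE(\hat\beta_{LE})-l_s'X_s\beta-l_r'X_r\beta=l_r'X_r\left[(S+I)^{-1}(S+D)-I\right]\beta=l_r'X_r(S+I)^{-1}(D-I)\beta.$$
Let
$$\Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_p)=QSQ',$$
where $\lambda_i>0$, $i=1,\dots,p$, and $Q$ is an orthogonal matrix, so that $(S+I)^{-1}=Q'(\Lambda+I)^{-1}Q$ is positive definite.
Note that $0<d_i<1$, $i=1,2,\dots,p$, so $D-I$ is nonsingular and therefore so is $(S+I)^{-1}(D-I)$. Hence $E(\hat\theta_D-\theta)\neq 0$ except in the degenerate case where $X_r'l_r$ is orthogonal to $(S+I)^{-1}(D-I)\beta$ (for example, $\beta=0$); that is, $\hat\theta_D$ is in general a biased predictor of $\theta=l'y$. The proof is completed.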
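From Lemma 2, we have
$$\mathrm{PMSE}(\hat\theta_D)=l_r'X_r\,\mathrm{MMSE}(\hat\beta_{LE})\,X_r'l_r+\sigma^2l_r'V_rl_r,$$
$$\mathrm{PMSE}(\hat\theta_U)=l_r'X_r\,\mathrm{MMSE}(\hat\beta_{OLE})\,X_r'l_r+\sigma^2l_r'V_rl_r.$$
To compare $\hat\theta_D$ with $\hat\theta_U$ in the PMSE sense, we investigate the difference
$$\Delta=\mathrm{PMSE}(\hat\theta_D)-\mathrm{PMSE}(\hat\theta_U)=l_r'X_r\left(\mathrm{MMSE}(\hat\beta_{LE})-\mathrm{MMSE}(\hat\beta_{OLE})\right)X_r'l_r.$$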
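Note that
$$\begin{aligned}
\mathrm{MMSE}(\hat\beta_{LE})-\mathrm{MMSE}(\hat\beta_{OLE})
&=\sigma^2F_DS^{-1}F_D'-\sigma^2S^{-1}+(F_D-I)\beta\beta'(F_D-I)'\\
&=(S+I)^{-1}\left[\sigma^2(S+D)S^{-1}(S+D)+(D-I)\beta\beta'(D-I)\right](S+I)^{-1}\\
&\quad-\sigma^2(S+I)^{-1}(S+I)S^{-1}(S+I)(S+I)^{-1},
\end{aligned}$$
where $S=X_s'V_s^{-1}X_s$ and $F_D=(S+I)^{-1}(S+D)$.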
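The step from this decomposition to the matrix inequality that follows rests on the expansion $(S+D)S^{-1}(S+D)-(S+I)S^{-1}(S+I)=2(D-I)+DS^{-1}D-S^{-1}$. A quick numerical check is sketched below; the random $S$, $D$, $\beta$ and the value of $\sigma^2$ are arbitrary assumptions used only to exercise the identity.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
A = rng.normal(size=(10, p))
S = A.T @ A                                   # symmetric positive definite (hypothetical)
D = np.diag(rng.uniform(0.1, 0.9, size=p))    # Liu parameters in (0, 1)
I = np.eye(p)
S_inv = np.linalg.inv(S)
beta = rng.normal(size=p)
sigma2 = 2.0

# Expansion used to pass from the MMSE difference to the matrix inequality.
lhs = (S + D) @ S_inv @ (S + D) - (S + I) @ S_inv @ (S + I)
rhs = 2 * (D - I) + D @ S_inv @ D - S_inv
expansion_ok = np.allclose(lhs, rhs)

# MMSE difference computed directly and in "sandwich" form.
F = np.linalg.solve(S + I, S + D)             # F_D = (S + I)^{-1} (S + D)
bias = (F - I) @ beta
diff_direct = sigma2 * F @ S_inv @ F.T - sigma2 * S_inv + np.outer(bias, bias)
W = np.linalg.inv(S + I)
inner = sigma2 * rhs + (D - I) @ np.outer(beta, beta) @ (D - I)
diff_sandwich = W @ inner @ W
sandwich_ok = np.allclose(diff_direct, diff_sandwich)

print(expansion_ok, sandwich_ok)
```

Because $(S+I)^{-1}$ is nonsingular, the sign of the MMSE difference is determined by the inner bracket alone, which is what the subsequent condition expresses.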
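Thus the matrix difference above is negative semidefinite, and hence $\Delta\le 0$, if and only if
$$S^{-1}-DS^{-1}D-2(D-I)\ \ge\ \frac{1}{\sigma^2}(D-I)\beta\beta'(D-I).$$
Applying Lemma 3 with $M=S^{-1}-DS^{-1}D-2(D-I)$, assumed positive definite, and $\alpha=\sigma^{-1}(D-I)\beta$, we have $\Delta\le 0$ if and only if
$$\beta'\left[(D-I)^{-1}\left(S^{-1}-DS^{-1}D-2(D-I)\right)(D-I)^{-1}\right]^{-1}\beta\le\sigma^2.$$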
So, we may state the following theorem:
Theorem 2 The predictor $\hat\theta_D$ based on the generalized Liu estimator is superior to the best linear unbiased predictor $\hat\theta_U$ in the PMSE sense if and only if
$$\beta'\left[(D-I)^{-1}\left(S^{-1}-DS^{-1}D-2(D-I)\right)(D-I)^{-1}\right]^{-1}\beta\le\sigma^2. \tag{5}$$
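When $D=\operatorname{diag}(d,d,\dots,d)$, $0<d<1$, condition (5) reduces to the necessary and sufficient condition for the superiority of the predictor $\hat\theta_d$ based on the Liu estimator over the best linear unbiased predictor $\hat\theta_U$ in the PMSE sense, given as follows:
$$\beta'\left[\frac{1+d}{1-d}S^{-1}+\frac{2}{1-d}I\right]^{-1}\beta\le\sigma^2. \tag{6}$$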
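Condition (6) is straightforward to evaluate when $S$, $\beta$ and $\sigma^2$ are known. The sketch below compares it with a direct check that the MMSE difference is negative semidefinite; the function names and the small diagonal $S$ are hypothetical choices for illustration, and in practice $\beta$ and $\sigma^2$ would have to be replaced by estimates.

```python
import numpy as np

def condition6_lhs(S, beta, d):
    """Left-hand side of (6): beta' [ (1+d)/(1-d) S^{-1} + 2/(1-d) I ]^{-1} beta."""
    p = S.shape[0]
    K = (1 + d) / (1 - d) * np.linalg.inv(S) + 2 / (1 - d) * np.eye(p)
    return beta @ np.linalg.solve(K, beta)

def mmse_difference(S, beta, sigma2, d):
    """MMSE of the Liu estimator minus MMSE of the least squares estimator."""
    p = S.shape[0]
    I = np.eye(p)
    F = np.linalg.solve(S + I, S + d * I)     # F_d = (S + I)^{-1} (S + d I)
    S_inv = np.linalg.inv(S)
    bias = (F - I) @ beta
    return sigma2 * F @ S_inv @ F.T - sigma2 * S_inv + np.outer(bias, bias)

# Hypothetical ill-conditioned example.
S = np.diag([1.0, 0.01])
beta = np.array([1.0, 1.0])
d = 0.5
for sigma2 in (1.0, 0.05):
    holds = condition6_lhs(S, beta, d) <= sigma2
    diff = mmse_difference(S, beta, sigma2, d)
    nsd = np.linalg.eigvalsh(diff).max() <= 1e-10
    print(sigma2, holds, nsd)                 # the two flags agree in each case
```

In this example the condition holds for the larger error variance and fails for the smaller one, reflecting the usual trade-off: the bias introduced by the Liu estimator is worth accepting only when the variance reduction is large enough.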
4 Numerical example
To illustrate our theoretical results, we consider the Portland cement dataset previously considered by Zhong and Yang[9]. We use the same data to illustrate that the proposed predictor is superior to the best linear unbiased predictor under certain conditions. Our computations were performed in Matlab. We assemble our data as follows:
The four columns of the matrix X comprise the data on x1,x2,x3 and x4 respectively.
First, the eigenvalues of $X'X$ are $\lambda_1=44676.21$, $\lambda_2=5965.42$, $\lambda_3=809.95$, $\lambda_4=105.42$ and $\lambda_5=0.00123$, and the condition number is approximately $3.66793\times 10^{7}$, so the design matrix $X$ is quite ill-conditioned.
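As a quick arithmetic check, and assuming the condition number is taken as the ratio of the largest to the smallest eigenvalue of $X'X$, the rounded eigenvalues reported above give a value of the same order of magnitude. The small gap from the reported figure is presumably due to rounding of $\lambda_5$.

```python
# Eigenvalues of X'X as reported in the text (rounded).
eigenvalues = [44676.21, 5965.42, 809.95, 105.42, 0.00123]

# Assumed definition: ratio of the largest to the smallest eigenvalue.
condition_number = max(eigenvalues) / min(eigenvalues)
print(f"{condition_number:.3e}")
```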
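Then, partitioning the data with $s=11$ and $r=2$, we have
$$y=\begin{pmatrix}y_s\\ y_r\end{pmatrix},\quad X=\begin{pmatrix}X_s\\ X_r\end{pmatrix}$$
and $\theta=1_{13}'y$, where $1_{13}'=(1,1,\dots,1)$ is the row vector of 13 ones. In this case, we consider the estimated PMSE values of $\hat\theta_d$ and $\hat\theta_U$, together with the corresponding estimates of $\sigma^2$ and of
$$\beta'\left[\frac{1+d}{1-d}S^{-1}+\frac{2}{1-d}I\right]^{-1}\beta,$$
the left-hand side of condition (6). The estimate of $\sigma^2$ is denoted by $\hat\sigma^2$. The results are shown in Table 1.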
Table 1 The values of the estimated left-hand side of (6), $\hat\sigma^2$, PMSE($\hat\theta_U$) and PMSE($\hat\theta_d$) for $\theta=1240.5$
d
We can see from Table 1 that when the estimated left-hand side of (6) is smaller than $\hat\sigma^2$, which implies that condition (6) is satisfied, we have PMSE($\hat\theta_d$) < PMSE($\hat\theta_U$), which illustrates the conclusion of Theorem 2.
References
[1] C R HENDERSON. Best linear unbiased estimation and prediction under a selection model [J]. Biometrics, 1975, 31(2): 423-447.
[2] C A B PEREIRA, J RODRIGUES. Robust linear prediction in finite populations [J]. Internat Statist Rev, 1983, 51(1): 293-300.
[3] H BOLFARINE, S ZACKS, S N ELIAN. Optimal prediction of the finite population regression coefficient [J]. Sankhya Ser B, 1994, 56(1): 1-10.
[4] S H YU, C Z HE. Optimal prediction in finite populations [J]. Appl Math J Chinese Univ Ser A, 2000, 15(2): 199-205.
[5] M RUEDA, I R S BORREGO. A predictive estimator of finite population mean using nonparametric regression [J]. Comput Stat, 2009, 24(1): 1-14.
[6] A E HOERL, R W KENNARD. Ridge regression: biased estimation for nonorthogonal problems [J]. Technometrics, 1970, 12(1): 55-67.
[7] S G WANG. Adaptive ridge-type predictors in the finite population [J]. Chinese Science Bull, 1990, 35(11): 804-806.
[8] F AKDENIZ, S KACIRANLAR. On the almost unbiased generalized Liu estimator and unbiased estimation of the bias and MSE [J]. Commun Statist Theor Meth, 1995, 24(7): 1789-1797.
[9] Z ZHONG, H YANG. Ridge estimation to the restricted linear model [J]. Commun Statist Theor Meth, 2007, 36: 2099-2115.
[10] R W FAREBROTHER. Further results on the mean square error of ridge regression [J]. J R Stat Soc Ser B, 1976, 38(3): 248-250.