Metrika [SJR: 0.943] [H-I: 25] Hybrid journal (It can contain Open Access articles) ISSN (Print) 0026-1335 - ISSN (Online) 1435-926X Published by Springer-Verlag [2281 journals]
- A von Mises approximation to the small sample distribution of the trimmed mean
- Abstract:
The small sample distribution of the trimmed mean is usually approximated by a Student’s t distribution. But this approximation is valid only when the observations come from a standard normal model and the sample size is not very small. Moreover, until now, there has been only empirical justification for this approximation but no formal proof. Although there are some accurate saddlepoint approximations when the sample size is small and the distribution is not normal, these are very difficult to apply and the elements involved in them are difficult to interpret. In this paper we propose a new approximation based on the von Mises expansion for the tail probability functional of the trimmed mean, which improves the usual Student’s t approximation in the normal case and which can be applied for other models. This new approximation allows, for instance, an objective choice of the trimming fraction in the context of a hypothesis testing problem using the new tool of the p value line.
PubDate: 2016-05-01
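For readers unfamiliar with the statistic in the entry above, the following is a minimal sketch of a symmetrically trimmed mean. It is not taken from the paper: the function name `trimmed_mean`, the parameter `alpha` (trimming fraction per tail) and the rounding rule (drop `floor(alpha * n)` observations from each end) are assumptions made for illustration, and the von Mises approximation itself is not implemented here.

```python
from math import floor
from typing import Sequence


def trimmed_mean(sample: Sequence[float], alpha: float) -> float:
    """Symmetric trimmed mean (illustrative convention, not the paper's code).

    Drops floor(alpha * n) observations from each tail of the sorted
    sample and averages what remains.
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    xs = sorted(sample)
    g = floor(alpha * len(xs))      # number trimmed from each tail
    kept = xs[g:len(xs) - g]        # non-empty because alpha < 0.5
    return sum(kept) / len(kept)


# With alpha = 0.2 and n = 5, one observation is dropped from each tail,
# so the outlier 100 does not affect the result.
print(trimmed_mean([1, 2, 3, 4, 100], 0.2))
```

Setting `alpha = 0` recovers the ordinary sample mean, which makes the effect of the trimming fraction easy to compare on the same data.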
- Fourier-type estimation of the power GARCH model with stable-Paretian innovations
- Abstract:
We consider estimation for general power GARCH models under stable-Paretian innovations. Exploiting the simple structure of the conditional characteristic function of the observations driven by these models we propose minimum distance estimation based on the empirical characteristic function of corresponding residuals. Consistency of the estimators is proved, and the asymptotic distribution of the estimator is studied. Efficiency issues are explored and finite-sample results are presented as well as applications of the proposed procedures to real data from the financial markets. A multivariate extension is also considered.
PubDate: 2016-05-01
- Semiparametric estimation of a zero-inflated Poisson regression model with missing covariates
- Abstract:
Zero-inflated Poisson (ZIP) regression models have been widely used to study the effects of covariates in count data sets that have many zeros. However, often some covariates involved in ZIP regression modeling have missing values. Assuming that the selection probability is known or unknown and estimated via a non-parametric method, we propose the inverse probability weighting (IPW) method to estimate the parameters of the ZIP regression model with covariates missing at random. The asymptotic properties of the proposed estimators are studied in detail under certain regularity conditions. Both theoretical analysis and simulation results show that the semiparametric IPW estimator is more efficient than the true weight IPW estimator. The practical use of the proposed methodology is illustrated with data from a motorcycle survey of traffic regulations conducted in 2007 in Taiwan by the Ministry of Transportation and Communication.
PubDate: 2016-05-01
- Minimum Hellinger distance estimation for bivariate samples and time series with applications to nonlinear regression and copula-based models
- Abstract:
We study minimum Hellinger distance estimation (MHDE) based on kernel density estimators for bivariate time series, such that various commonly used regression models and parametric time series such as nonlinear regressions with conditionally heteroscedastic errors and copula-based Markov processes, where copula densities are used to model the conditional densities, can be treated. It is shown that consistency and asymptotic normality of the MHDE basically follow from the uniform consistency of the density estimate and the validity of the central limit theorem for its integrated version. We also provide explicit sufficient conditions both for the i.i.d. case and the case of strong mixing series. In addition, for the case of i.i.d. data, we briefly discuss the asymptotics under local alternatives and relate the results to maximum likelihood estimation.
PubDate: 2016-05-01
- On the skewness of order statistics in multiple-outlier PHR models
- Abstract:
In this paper, we investigate the skewness of order statistics stemming from multiple-outlier proportional hazard rates samples in the sense of several variability orderings such as the star order, Lorenz order and dispersive order. It is shown that more heterogeneity among the multiple-outlier components leads to a more skewed lifetime of a k-out-of-n system consisting of these components. The results established here generalize the corresponding ones in Kochar and Xu (J Appl Probab 48:271–284, 2011, Ann Oper Res 212:127–138, 2014). Some numerical examples are also provided to illustrate the theoretical results.
PubDate: 2016-04-20
- Quantile inference based on clustered data
- Abstract:
The one-sample sign test is one of the common procedures to develop distribution-free inference for a quantile of a population. A basic requirement of this test is that the observations in a sample must be independent. This assumption is violated in certain settings, such as clustered data, grouped data and longitudinal studies. Failure to account for the dependence structure leads to erroneous statistical inferences. In this study, we have developed statistical inference for a population quantile of order p in either balanced or unbalanced designs by incorporating the dependence structure when the distribution of within-cluster observations is exchangeable. We provide a point estimate, develop a testing procedure and construct confidence intervals for a population quantile of order p. Simulation studies are performed to demonstrate that the confidence intervals achieve their nominal coverage probabilities. We finally apply the proposed procedure to Academic Performance Index data.
PubDate: 2016-04-19
- Statistical inference for critical continuous state and continuous time branching processes with immigration
- Abstract:
We study asymptotic behavior of conditional least squares estimators for critical continuous state and continuous time branching processes with immigration based on discrete time (low frequency) observations.
PubDate: 2016-04-12
- Exceedances of records
- Abstract:
Given a sequence of random variables (rv’s) and a real function \(\psi\), the \(\psi\)-exceedances form a subsequence consisting of those rv’s larger than a function, \(\psi\), of the previous element of the subsequence. We present the basic distribution theory of \(\psi\)-exceedances for a sequence of independent and identically distributed rv’s. We give several examples and we study with more detail the case of exponential parents with \(\psi\) a linear function. The particular case of arithmetic exceedances is useful to describe the behavior of a type I counter when the arrival process of particles follows a non-homogeneous Poisson process. We also mention applications to destructive testing, early alert systems and the departure process of a \(M_t/D/1/1\) queue.
PubDate: 2016-04-12
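The definition of \(\psi\)-exceedances in the entry above can be sketched in a few lines. This is only an illustration of the verbal definition, not the authors' code: the function name `psi_exceedances` is hypothetical, and it is assumed here that the first observation always starts the subsequence.

```python
from typing import Callable, Iterable, List


def psi_exceedances(xs: Iterable[float],
                    psi: Callable[[float], float]) -> List[float]:
    """Extract the psi-exceedances of a sequence.

    An observation joins the subsequence when it is larger than
    psi(previous element of the subsequence). Assumption: the first
    observation is always taken as the first element.
    """
    out: List[float] = []
    for x in xs:
        if not out or x > psi(out[-1]):
            out.append(x)
    return out


data = [1, 3, 2, 5, 4, 9]
# psi(x) = x gives ordinary upper records.
print(psi_exceedances(data, lambda v: v))
# psi(x) = x + 2 is a linear psi with unit slope: each new element
# must exceed the previous one by more than 2.
print(psi_exceedances(data, lambda v: v + 2))
```

Passing `psi` as a callable keeps the linear case studied in the abstract and ordinary records under one interface.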
- A new joint model of recurrent event data with the additive hazards model for the terminal event time
- Abstract:
Recurrent event data are frequently encountered in clinical and observational studies related to biomedical science, econometrics, reliability and demography. In some situations, recurrent events serve as important indicators for evaluating disease progression, health deterioration, or insurance risk. In the statistical literature, noninformative censoring is typically assumed when statistical methods and theories are developed for analyzing recurrent event data. In many applications, however, there may exist a terminal event, such as death, that stops the follow-up, and it is the correlation of this terminal event with the recurrent event process that is of interest. This work considers joint modeling and analysis of recurrent event and terminal event data, with the focus primarily on determining how the terminal event process and the recurrent event process are correlated (i.e. does the frequency of the recurrent event influence the risk of the terminal event). We propose a joint model of the recurrent event process and the terminal event, linked through a common subject-specific latent variable, in which the proportional intensity model is used for modeling the recurrent event process and the additive hazards model is used for modeling the terminal event time.
PubDate: 2016-04-01
- Distribution function estimation via Bernstein polynomial of random degree
- Abstract:
The problem of distribution function (df) estimation arises naturally in many contexts. The empirical and the kernel df estimators are well known. There is another df estimator based on a Bernstein polynomial of degree m. For a Bernstein df estimator, m plays the same role as the bandwidth in a kernel estimator. The asymptotic properties of the Bernstein estimator have been studied so far assuming m is non random, chosen subjectively. We propose algorithms for data driven choice of m. Such an m is a function of the data, i.e. random. We obtain the convergence rates of a Bernstein df estimator, using a random m, for i.i.d., strongly mixing and a broad class of linear processes. The estimator is shown to be consistent for any stationary, ergodic process satisfying some conditions. Using simulations and analysis of real data, the finite sample performance of the different df estimators is compared.
PubDate: 2016-04-01
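As background for the entry above, here is a minimal sketch of a Bernstein df estimator with a fixed degree m. It uses the standard textbook form, \(\hat F_m(x) = \sum_{k=0}^{m} F_n(k/m)\binom{m}{k}x^k(1-x)^{m-k}\) with \(F_n\) the empirical df, and assumes the data are supported on \([0, 1]\). The function names are hypothetical, and the paper's data-driven choice of m is not reproduced here.

```python
from bisect import bisect_right
from math import comb
from typing import List, Sequence


def empirical_df(sorted_data: List[float], x: float) -> float:
    """Empirical df F_n(x): proportion of observations <= x."""
    return bisect_right(sorted_data, x) / len(sorted_data)


def bernstein_df(data: Sequence[float], m: int, x: float) -> float:
    """Bernstein polynomial df estimate of degree m at a point x in [0, 1].

    Assumes the data lie in [0, 1]; m is fixed by the caller, playing
    the role that the bandwidth plays in a kernel estimator.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if len(data) == 0:
        raise ValueError("data must be non-empty")
    s = sorted(data)
    return sum(
        empirical_df(s, k / m) * comb(m, k) * x**k * (1.0 - x) ** (m - k)
        for k in range(m + 1)
    )


sample = [0.2, 0.5, 0.8]
for degree in (1, 4, 16):
    print(degree, bernstein_df(sample, degree, 0.5))
```

Small m gives a very smooth estimate; larger m tracks the empirical df more closely.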
- On local divergences between two probability measures
- Abstract:
A broad class of local divergences between two probability measures or between the respective probability distributions is proposed in this paper. The introduced local divergences are based on the classic Csiszár \(\phi\)-divergence and they provide a pseudo-distance between two distributions on a specific area of their common domain. The range of values of the introduced class of local divergences is derived and explicit expressions of the proposed local divergences are also derived when the underlying distributions are members of the exponential family of distributions or they are described by multivariate normal models. An application is presented to illustrate the behavior of local divergences.
PubDate: 2016-04-01
- A note on the strong consistency of M-estimates in linear models
- Abstract:
We improve a known result on the strong consistency of M-estimates of the regression parameters in a linear model for independent and identically distributed random errors under some mild conditions.
PubDate: 2016-04-01
- Generalized waiting time distributions associated with runs
- Abstract:
Let \(\left\{ X_{t},t\ge 1\right\}\) be a sequence of random variables taking one of two possible values, “1” (success) or “0” (failure). Define an independent sequence of random variables \(\left\{ D_{i},i\ge 1\right\}\). The random variable \(D_{i}\) is associated with the success when it occupies the ith place in a run of successes. We define the weight of a success run as the sum of the D values corresponding to the successes in the run. Define the following two random variables: \(N_{k}\) is the number of trials until the weight of a single success run exceeds or equals k, and \(N_{r,k}\) is the number of trials until the weight of each of r success runs equals or exceeds k in \(\left\{ X_{t},t\ge 1\right\}\). Distributional properties of the waiting time random variables \(N_{k}\) and \(N_{r,k}\) are studied and illustrative examples are presented.
PubDate: 2016-04-01
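The waiting time \(N_{k}\) in the entry above can be computed for a given realisation as sketched below. This is an illustration only: the function name `waiting_time_weighted_run` is hypothetical, and the weights are supplied as a function `weight(i)` of the place i within a run, reused across runs, which is a simplifying assumption rather than the paper's model.

```python
from typing import Callable, Iterable, Optional


def waiting_time_weighted_run(xs: Iterable[int],
                              weight: Callable[[int], float],
                              k: float) -> Optional[int]:
    """Number of trials until a single success run has weight >= k.

    xs is a 0/1 sequence; weight(i) is the value attached to the
    success in the i-th place (i = 1, 2, ...) of the current run.
    Returns None if no run reaches weight k within xs.
    """
    run_weight = 0.0
    place = 0
    for t, x in enumerate(xs, start=1):
        if x == 1:
            place += 1
            run_weight += weight(place)
            if run_weight >= k:
                return t
        else:
            # A failure ends the current run and resets its weight.
            run_weight = 0.0
            place = 0
    return None


trials = [1, 0, 1, 1, 1, 0]
# Unit weights: the usual waiting time for a success run of length 3.
print(waiting_time_weighted_run(trials, lambda i: 1.0, 3))
# Weight i for the i-th place: a run of length 2 already has weight 3.
print(waiting_time_weighted_run(trials, lambda i: float(i), 3))
```

With unit weights the function reduces to the classical waiting time for a success run of length k, which gives a quick sanity check on any simulation built on top of it.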
- EWMA control charts for detecting changes in the mean of a long-memory process
- Abstract:
In this paper EWMA control charts are introduced for detecting changes in the mean of a long-memory process. Besides the modified EWMA scheme the EWMA residual chart is also considered. The control design of the charts is calculated for an ARFIMA(p,d,q) process. In order to assess the introduced charts, the average run length is used as a performance criterion. Using an extensive simulation study, the control charts are compared with each other. The target process is assumed to be an ARFIMA(1,d,1) process.
PubDate: 2016-04-01
- On cumulative residual (past) inaccuracy for truncated random variables
- Abstract:
To overcome the drawbacks of Shannon’s entropy, the concepts of cumulative residual and past entropy have been proposed in the information theoretic literature. Furthermore, the Shannon entropy has been generalized in a number of different ways by many researchers. One important extension is the Kerridge inaccuracy measure. In the present communication we study the cumulative residual and past inaccuracy measures, which are extensions of the corresponding cumulative entropies. Several properties, including monotonicity and bounds, are obtained for left, right and doubly truncated random variables.
PubDate: 2016-04-01
- Multivariate Poisson distributions associated with Boolean models
- Abstract:
We consider a d-dimensional Boolean model \(\varXi = (\varXi _1+X_1)\cup (\varXi _2+X_2)\cup \cdots\) generated by a Poisson point process \(\{X_i, i\ge 1\}\) with intensity measure \(\varLambda\) and a sequence \(\{\varXi _i, i\ge 1\}\) of independent copies of some random compact set \(\varXi _0\). Given compact sets \(K_1,\ldots ,K_{\ell }\), we show that the discrete random vector \((N(K_1),\ldots ,N(K_\ell ))\), where \(N(K_j)\) equals the number of shifted sets \(\varXi _i+X_i\) hitting \(K_j\), obeys an \(\ell\)-variate Poisson distribution with \(2^{\ell }-1\) parameters. We obtain explicit formulae for all these parameters which can be estimated consistently from an observation of the union set \(\varXi\) in some unboundedly expanding window \(W_n\) (as \(n \rightarrow \infty\)) provided that the Boolean model is stationary. Some of these results can be extended to unions of Poisson k-cylinders for \(1\le k < d\) and more general set-valued functionals of independently marked Poisson processes.
PubDate: 2016-03-22
- On the records of multivariate random sequences
- Abstract:
Two types of records in multivariate sequences are considered in this paper. According to the first definition, a multivariate observation is accepted as a record if it is not dominated in at least one of the coordinates of the previous record, and the first observation is a record. Some basic straightforward results concerning the distributions of record times and records according to this definition are given. The development of distribution theory for these types of record and also providing examples with available analytical results still involves challenging unsolved problems. Second, we consider records of bivariate sequences according to conditionally N-ordering, introduced in Bairamov (J Multivar Anal 97:797–809, 2006). The joint distributions of record times and distributions of record values are derived. Some examples with particular underlying distributions, demonstrating the availability of the obtained formulae, are provided.
PubDate: 2016-02-03
- Exponential probability inequality for \(m\)-END random variables and its applications
- Abstract:
The concept of \(m\)-extended negatively dependent (\(m\)-END, in short) random variables is introduced and the Kolmogorov exponential inequality for \(m\)-END random variables is established. As applications of the Kolmogorov exponential inequality, we further investigate the complete convergence for arrays of rowwise \(m\)-END random variables and the complete consistency for the estimator of nonparametric regression models based on \(m\)-END errors. Our results generalize and improve some known ones for independent random variables and dependent random variables.
PubDate: 2016-02-01
- Information bounds for nonparametric estimators of L-functionals and survival functionals under censored data
- Abstract:
In the present paper we derive lower asymptotic information bounds of Cramér-Rao type for estimators of nonparametric statistical functionals. The results are based on dense differentiability and dense regularity concepts which lead to weak assumptions. As explicit examples L-estimators are treated. In addition a new rapid method for the treatment of survival functionals under randomly right censored data is presented. For instance, for the famous Kaplan-Meier and Nelson-Aalen estimators, our information bound is just the lower bound obtained earlier in the literature.
PubDate: 2016-02-01
- Erratum to: Testing structural changes in panel data with small fixed panel size and bootstrap
PubDate: 2016-02-01