Metrika Hybrid journal (It can contain Open Access articles) ISSN (Print) 0026-1335 - ISSN (Online) 1435-926X Published by Springer-Verlag [2209 journals] [SJR: 0.839] [H-I: 22]
- Robust spline-based variable selection in varying coefficient model
- Abstract: Abstract
The varying coefficient model is widely used as an extension of the linear regression model. Many procedures have been developed for estimating the model, and efficient variable selection procedures for the varying coefficient model have recently been proposed as well. However, those variable selection approaches are mainly built on the least-squares (LS) type method. Although the LS method is a successful and standard choice for fitting the varying coefficient model and for variable selection, it may suffer when the errors follow a heavy-tailed distribution or in the presence of outliers. To overcome this issue, we start by developing a novel robust estimator, termed the rank-based spline estimator, which combines the ideas of rank inference and polynomial splines. Furthermore, we propose a robust variable selection method that incorporates the smoothly clipped absolute deviation penalty into the rank-based spline loss function. Under mild conditions, we theoretically show that the proposed rank-based spline estimator is highly efficient across a wide spectrum of distributions. Its asymptotic relative efficiency with respect to the LS-based method is closely related to that of the signed-rank Wilcoxon test with respect to the t test. Moreover, the proposed variable selection method can identify the true model consistently, and the resulting estimator can be as efficient as the oracle estimator. Simulation studies show that our procedure performs better than the LS-based method when the errors deviate from normality.
PubDate: 2014-05-14
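To make the contrast between a least-squares fit and a rank-based fit concrete, here is a minimal sketch on a toy straight-line problem. It is not the rank-based spline estimator of the paper: the Wilcoxon-type loss (the sum of absolute pairwise residual differences), the toy data, the function names, and the crude grid search are all assumptions chosen for illustration. The loss does not depend on the intercept, so only the slope is searched.

```python
# Illustrative sketch only: the loss form, data, and grid search are assumptions,
# not the method of the paper.

def rank_loss(slope, xs, ys):
    """Sum of |e_i - e_j| over all pairs, where e_i = y_i - slope * x_i."""
    res = [y - slope * x for x, y in zip(xs, ys)]
    n = len(res)
    return sum(abs(res[i] - res[j]) for i in range(n) for j in range(i + 1, n))

def ls_slope(xs, ys):
    """Ordinary least-squares slope for a line with intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def rank_slope(xs, ys, lo=0.0, hi=10.0, steps=1000):
    """Crude grid search for the slope that minimizes the rank loss."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda b: rank_loss(b, xs, ys))

# Toy data: y = 2x exactly, except for one gross outlier in the last point.
xs = list(range(10))
ys = [2.0 * x for x in xs]
ys[-1] += 100.0

slope_ls = ls_slope(xs, ys)
slope_rank = rank_slope(xs, ys)
print("LS slope:", slope_ls, "rank-based slope:", slope_rank)
```

On this toy data the single outlier pulls the LS slope well away from 2, while the rank-based slope stays at 2, which is the kind of behaviour the abstract attributes to heavy-tailed errors and outliers.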
- Linearity of regression for overlapping order statistics
- Abstract: Abstract
We consider a problem of characterization of continuous distributions for which linearity of regression of overlapping order statistics, $\mathbb{E}(X_{i:m}\mid X_{j:n})=aX_{j:n}+b$, $m\le n$, holds. Due to a new representation of the conditional expectation $\mathbb{E}(X_{i:m}\mid X_{j:n})$ in terms of the conditional expectations $\mathbb{E}(X_{l:n}\mid X_{j:n})$, $l=i,\ldots ,n-m+i$, we are able to use the already known approach based on the Rao-Shanbhag version of the Cauchy integrated functional equation. However, this is possible only if $j\le i$ or $j\ge n-m+i$. In the remaining cases the problem is essentially still open.
PubDate: 2014-05-10
- A robust two-stage procedure in Bayes sequential estimation of a particular exponential family
- Abstract: Abstract
The problem of Bayes sequential estimation of the unknown parameter in a particular exponential family of distributions is considered under a linear exponential loss function for the estimation error and a fixed cost for each observation. Instead of fully sequential sampling, a two-stage sampling technique is introduced in this paper to solve the problem. The proposed two-stage procedure is robust in the sense that it does not depend on the parameters of the conjugate prior. It is shown that the two-stage procedure is asymptotically pointwise optimal and asymptotically optimal for a large class of conjugate priors. A simulation study is conducted to compare the performance of the two-stage procedure with that of the purely sequential procedure.
PubDate: 2014-05-09
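For readers unfamiliar with the linear exponential (LINEX) loss mentioned in this abstract, its standard textbook form and the resulting Bayes estimate are sketched below. The symbols $a$, $b$, $\Delta$ and $\hat\theta_B$ are notation assumed here, and the paper may use a different parameterisation.

```latex
% LINEX loss for the estimation error \Delta = \hat\theta - \theta,
% with shape a \neq 0 and scale b > 0 (notation assumed for illustration).
L(\Delta) = b\left(e^{a\Delta} - a\Delta - 1\right)

% Minimizing the posterior expected loss over \hat\theta gives the Bayes estimate
\hat\theta_B = -\frac{1}{a}\,\ln \mathbb{E}\left(e^{-a\theta}\mid x\right)
```

The sign of $a$ controls the asymmetry: for $a>0$ overestimation is penalized roughly exponentially and underestimation roughly linearly.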
- Optimal bounds on expectations of order statistics and spacings from nonparametric families of distributions generated by convex transform order
- Abstract: Abstract
Assume that $X_1,\ldots , X_n$ are i.i.d. random variables with a common distribution function $F$ which precedes a fixed distribution function $W$ in the convex transform order. In particular, if $W$ is either the uniform or the exponential distribution function, then $F$ has an increasing density or an increasing failure rate, respectively. We present sharp upper bounds on the expectations of single order statistics and spacings based on $X_1,\ldots , X_n$, expressed in terms of the population mean and standard deviation, for the family of all parent distributions preceding $W$ in the convex transform order. We also characterize the distributions which attain the bounds, and specify the general results for the distributions with increasing density function.
PubDate: 2014-05-04
- Minimum distance lack-of-fit tests under long memory errors
- Abstract: Abstract
This paper discusses some tests of lack-of-fit of a parametric regression model when the errors form a long memory moving average process with the long memory parameter $0<d<1/2$, and when the design is non-random and uniform on $[0,1]$. These tests are based on certain minimized distances between a nonparametric regression function estimator and the parametric model being fitted. The paper investigates the asymptotic null distribution of the proposed test statistics and of the corresponding minimum distance estimators under minimal conditions on the model being fitted. The limiting distributions of these statistics are Gaussian for $0<d<1/4$ and non-Gaussian for $1/4<d<1/2$. We also discuss the consistency of these tests against a fixed alternative. A simulation study is included to assess the finite sample behavior of the proposed test.
PubDate: 2014-05-03
- $L$-statistics from multivariate unified skew-elliptical distributions
- Abstract: Abstract
We study here the distributions of order statistics and linear combinations of order statistics from a multivariate unified skew-elliptical distribution. We show that these distributions can be expressed as mixtures of unified skew-elliptical distributions, and then use these mixture forms to study some distributional properties and moments.
PubDate: 2014-05-01
- Convergence and performance of the peeling wavelet denoising algorithm
- Abstract: Abstract
This note is devoted to an analysis of the so-called peeling algorithm in wavelet denoising. Assuming that the wavelet coefficients of the useful signal are modeled by generalized Gaussian random variables and its noisy part by independent Gaussian variables, we compute a critical thresholding constant for the algorithm, which depends on the shape parameter of the generalized Gaussian distribution. We also quantify the optimal number of steps which have to be performed, and analyze the convergence of the algorithm. Several implementations are tested against classical wavelet denoising procedures on benchmark and simulated biological signals.
PubDate: 2014-05-01
- Random weighting estimation of stable exponent
- Abstract: Abstract
This paper presents a new random weighting method for estimating the stable exponent. Assume that $X_1, X_2, \ldots ,X_n$ is a sequence of independent and identically distributed random variables with $\alpha$-stable distribution $G$, where $\alpha \in (0,2]$ is the stable exponent. Denote the empirical distribution function of $G$ by $G_n$ and the random weighting estimation of $G_n$ by $H_n$. An empirical distribution function $\widetilde{F}_n$ with U-statistic structure is defined based on the sum-preserving property of stable random variables. By minimizing the Cramér-von Mises distance between $H_n$ and $\widetilde{F}_n$, the random weighting estimation of $\alpha$ is constructed in the sense of the minimum distance. The strong consistency and asymptotic normality of the random weighting estimation are also rigorously proved. Experimental results demonstrate that the proposed random weighting method can effectively estimate the stable exponent, resulting in higher estimation accuracy than the Zolotarev, Press, Fan and maximum likelihood methods.
PubDate: 2014-05-01
- Non-positive upper bounds on expectations of small order statistics from DDA and DFRA populations
- Abstract: Abstract
Rychlik [Appl Math (Warsaw) 29:15–32, 2002] presented positive sharp upper bounds on the expectations of order statistics with sufficiently large ranks, based on i.i.d. samples from the decreasing density and failure rate populations (DDA and DFRA, for short). They were expressed in terms of the population mean and standard deviation. Here we provide respective non-positive upper tight evaluations for expected small order statistics centered about the population mean, measured in various scale units.
PubDate: 2014-05-01
- On Riesz distribution
- Abstract: Abstract
The Riesz distribution for real normed division algebras is derived in this work. Then two versions of these distributions are proposed and some of their properties are studied.
PubDate: 2014-05-01
- Follow-up experiments for two-level fractional factorial designs via double semifoldover
- Abstract: Abstract
The addition of another fraction to an initial experiment is often necessary to resolve ambiguities involving aliasing of factorial effects. One of the most widely used techniques for the selection of a follow-up experiment is foldover. However, semifoldover (i.e., adding half of a foldover fraction) frequently permits estimation of as many effects of interest as provided by a foldover. Thus, as an alternative to foldover, this article investigates the construction and theoretical properties of follow-up experiments obtained via the addition of two $n/2$-run semifoldover fractions. The strategy (termed double semifolding) provides a means of estimating more effects than can be achieved with a foldover design. Through the use of indicator functions, general properties of double semifoldover designs will be developed. Optimal double semifoldover plans, based on several established design criteria, will be discussed and tabulated for practical use.
PubDate: 2014-05-01
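The foldover and semifoldover constructions described in this abstract can be sketched in a few lines. Everything specific below is an assumption made for illustration: the initial $2^{4-1}$ fraction with generator D = ABC, the choice of which factors to fold on, and the choice of which factor to subset on. None of these are the optimal plans tabulated in the paper.

```python
from itertools import product

def initial_fraction():
    """Hypothetical initial design: 8-run 2^(4-1) fraction with D = ABC."""
    return [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

def foldover(design, fold_cols):
    """Reverse the signs of the chosen factor columns in every run."""
    return [tuple(-v if k in fold_cols else v for k, v in enumerate(run))
            for run in design]

def semifoldover(design, fold_cols, subset_col, level=1):
    """Keep only the foldover runs in which factor subset_col is at the given level."""
    return [run for run in foldover(design, fold_cols) if run[subset_col] == level]

base = initial_fraction()

# Full foldover on factor A (column 0): n extra runs.
full = foldover(base, fold_cols={0})

# Two n/2-run semifoldover fractions (arbitrary illustrative choices):
# fold on A and keep runs with B = +1; fold on B and keep runs with A = +1.
half_1 = semifoldover(base, fold_cols={0}, subset_col=1, level=1)
half_2 = semifoldover(base, fold_cols={1}, subset_col=0, level=1)

double_semifold = base + half_1 + half_2
print(len(base), len(full), len(half_1), len(half_2), len(double_semifold))
```

Both follow-up strategies here add $n = 8$ runs to the initial fraction. The difference is that the double semifoldover draws its runs from two different foldover plans rather than one.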
- Optimal crossover designs in a model with self and mixed carryover effects with correlated errors
- Abstract: Abstract
We determine optimal crossover designs for the estimation of direct treatment effects in a model with mixed and self carryover effects. The model also assumes that the errors within each experimental unit are correlated following a stationary first-order autoregressive process. The paper considers situations where the number of periods for each experimental unit is at least four and the number of treatments is greater than or equal to the number of periods.
PubDate: 2014-04-30
- Data transformations and goodness-of-fit tests for type-II right censored samples
- Abstract: Abstract
We suggest several goodness-of-fit (GOF) methods which are appropriate with Type-II right censored data. Our strategy is to transform the original observations from a censored sample into an approximately i.i.d. sample of normal variates and then perform a standard GOF test for normality on the transformed observations. A simulation study with several well known parametric distributions under testing reveals the sampling properties of the methods. We also provide theoretical analysis of the proposed method.
PubDate: 2014-04-23
- Focused vector information criterion model selection and model averaging regression with missing response
- Abstract: Abstract
In this paper, a focused vector information criterion for model selection and model averaging is considered for the linear model with missing response. Based on the focused information criterion of Hjort and Claeskens (J Am Stat Assoc 98:879–945, 2003) and the imputation idea, a frequentist model averaging estimator for a focused vector of a linear model is proposed, and the estimator is shown to be root-n consistent and asymptotically normal. In addition, the proposed focused vector information criterion is designed for a focused multidimensional parameter, which differs slightly from the conventional focused information criterion for a one-dimensional focused parameter. A model-averaging-based confidence interval estimation method and an estimator of the mean of the response are also proposed. A simulation study is conducted to investigate the performance of the proposed estimator with finite sample sizes, and a real data example is presented to illustrate its application in practice.
PubDate: 2014-04-01
- A study on reliability of coherent systems equipped with a cold standby component
- Abstract: Abstract
In this paper, we investigate the effect of a single cold standby component on the performance of a coherent system. In particular, we focus on coherent systems which may fail at the time of the first component failure in the system. We obtain signature based expressions for the survival function and mean time to failure of the coherent systems satisfying the abovementioned property.
PubDate: 2014-04-01
- Dependence properties of bivariate distributions with proportional (reversed) hazards marginals
- Abstract: Abstract
This paper considers two classes of bivariate distributions having proportional (reversed) hazard rates models as their marginals. Various dependence properties of the proposed models are studied through their copulas.
PubDate: 2014-04-01
- Strong consistency of least squares estimates in multiple regression models with random regressors
- Abstract: Abstract
The strong consistency of the least squares estimator in multiple regression models is established assuming the randomness of the regressors and errors with infinite variance. Only moderately restrictive conditions are imposed on the stochastic model matrix, and the errors are random variables having a moment of order $r,\,1 \leqslant r \leqslant 2$. In our treatment, we use Etemadi’s strong law of large numbers and a sharp almost sure convergence result for randomly weighted sums of random elements. Both techniques permit us to extend the results of some previous papers.
PubDate: 2014-04-01
- Design and analysis of shortest two-sided confidence intervals for a probability under prior information
- Abstract: Abstract
Two-sided confidence intervals for a probability $p$ under a prescribed confidence level $\gamma$ are an elementary tool of statistical data analysis. A confidence interval has two basic quality characteristics: i) exactness, i.e., whether the actual coverage probability equals or exceeds the prescribed level $\gamma$; ii) inferential precision, measured by the length of the confidence interval. The interval provided by Clopper and Pearson (Biometrika 26:404–413, 1934) is the only exact interval actually used in statistical data analysis. Various authors have suggested shorter, i.e., more precise exact intervals. The present paper makes two contributions. i) We provide a general design scheme for minimum volume confidence regions under prior knowledge on the target parameter. ii) We apply the scheme to the problem of confidence intervals for a probability $p$ where prior knowledge is expressed in a flexible way by a beta distribution on a subset of the unit interval.
PubDate: 2014-04-01
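As a point of reference for the exact interval named in this abstract, the following sketch computes the classical Clopper-Pearson limits by inverting the binomial tail probabilities with bisection. It illustrates only the baseline interval, not the shorter intervals or the prior-based design scheme of the paper. The function names and the equal split of $1-\gamma$ between the two tails are assumptions of this sketch.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_decreasing(f, target, tol=1e-12):
    """Find p in [0, 1] with f(p) = target by bisection, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, gamma=0.95):
    """Two-sided Clopper-Pearson interval for p, given x successes in n trials."""
    alpha = 1 - gamma
    # Lower limit solves P(X >= x | p) = alpha/2, i.e. cdf(x - 1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

print(clopper_pearson(0, 10))
print(clopper_pearson(5, 10))
```

For $x = 0$ the upper limit has the closed form $1 - (\alpha/2)^{1/n}$, which gives a quick check on the bisection.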
- Construction of nearly orthogonal Latin hypercube designs
- Abstract: Abstract
The Latin hypercube design (LHD) is a popular choice of experimental design when computer simulation is used to study a physical process. In this paper, we propose some methods for constructing nearly orthogonal Latin hypercube designs (NOLHDs) with 2, 4, 8, 12, 16, 20 and 24 factors having flexible run sizes. These designs can be very useful when orthogonal Latin hypercube designs (OLHDs) of the needed sizes do not exist.
PubDate: 2014-04-01
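To illustrate what "nearly orthogonal" means for a Latin hypercube design, the sketch below builds random Latin hypercubes and keeps the one with the smallest maximum absolute pairwise column correlation. The random search is a stand-in chosen for brevity and is not the construction proposed in the paper. The function names, run size, factor count and number of tries are assumptions.

```python
import random

def random_lhd(n_runs, n_factors, seed=0):
    """Random Latin hypercube: each column is a permutation of 1..n_runs."""
    rng = random.Random(seed)
    cols = []
    for _ in range(n_factors):
        col = list(range(1, n_runs + 1))
        rng.shuffle(col)
        cols.append(col)
    return cols  # stored column by column

def correlation(u, v):
    """Pearson correlation between two columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def max_abs_correlation(cols):
    """Largest |correlation| over all pairs of columns (0 means orthogonal)."""
    k = len(cols)
    return max(abs(correlation(cols[i], cols[j]))
               for i in range(k) for j in range(i + 1, k))

def best_random_lhd(n_runs, n_factors, tries=200):
    """Pick the least-correlated design among `tries` random Latin hypercubes."""
    candidates = (random_lhd(n_runs, n_factors, seed=s) for s in range(tries))
    return min(candidates, key=max_abs_correlation)

first = random_lhd(12, 4, seed=0)
best = best_random_lhd(12, 4)
print(max_abs_correlation(first), max_abs_correlation(best))
```

A design whose maximum absolute correlation is exactly zero would be an orthogonal Latin hypercube. A small but nonzero value corresponds to the nearly orthogonal case discussed in the abstract.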
- Rank tests in heteroscedastic linear model with nuisance parameters
- Abstract: Abstract
In the linear regression model with heteroscedastic errors, we propose nonparametric tests for regression under nuisance heteroscedasticity, and tests for heteroscedasticity under nuisance regression. Both types of tests are based on suitable ancillary statistics for the nuisance parameters; hence they avoid estimating those parameters, in contrast to tests proposed in the literature. A simulation study, as well as an application of the tests to real data, illustrates their good performance.
PubDate: 2014-04-01