Journal Volume: 64      No.: 2     Year: 2010
S.No Title Abstract Download
1 Ambiguities in the Basics of Probability Theory and Implications
Author: Jagdish N. Srivastava      Pages: 125-127
The purpose of this very short paper is to discuss a fundamental ambiguity in probability theory, and an implication of the same (in the Foundations of Physics). Details of this and other implications will be discussed elsewhere. The paper does not present any new mathematical result; rather, it emphasizes a certain feature in the basics of probability theory. Keywords: Probability theory, Wave function.
2 Role of Weights in Descriptive and Analytical Inferences from Survey Data: An Overview
Author: J.N.K. Rao, M. Hidiroglou, W. Yung and M. Kovacevic      Pages: 129-135
Statistical agencies generally collect data from samples drawn from well-defined finite populations, using complex sampling procedures that may include stratification, clustering, multi-stage selection and unequal probabilities of selection. Sample design weights, defined from the sampling procedures, are often adjusted to account for non-responding units and to calibrate to known population totals of auxiliary variables. Once adjusted, these "final" weights are included on the survey datasets. There has been some discussion of the necessity of using these weights when estimating descriptive statistics and when performing analyses of data from these surveys. In this paper, we discuss the role of weights in descriptive and analytical inference. Keywords: Calibration, Design weights, Re-sampling methods, Multi-level models, Survey data analysis.
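As a toy illustration of the weight adjustments described above, the sketch below applies ratio calibration to hypothetical design weights so that the weighted total of an auxiliary variable matches its known population total. All numbers and function names are illustrative assumptions, not from the paper.

```python
# Sketch: ratio calibration of design weights to a known population total.
# Data and totals below are hypothetical.

def calibrate_ratio(design_weights, aux_values, known_total):
    """Scale design weights by a common factor so the weighted total of
    the auxiliary variable matches its known population total."""
    weighted_aux = sum(w * x for w, x in zip(design_weights, aux_values))
    g = known_total / weighted_aux          # common calibration factor
    return [w * g for w in design_weights]

def weighted_mean(weights, values):
    """Hajek-type weighted mean using the (calibrated) weights."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

# Hypothetical sample: design weights w, auxiliary x, study variable y.
w = [10.0, 10.0, 20.0, 20.0]
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 6.0, 7.0, 8.0]

w_cal = calibrate_ratio(w, x, known_total=200.0)
# After calibration, the weighted total of x equals the known total.
```

A single multiplicative factor is the simplest calibration; real calibration (e.g. GREG weighting) adjusts weights against several auxiliary totals at once.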
3 Sequential Cramer-Rao and Bhattacharyya Bounds: Work of G.R. Seth and Afterwards
Author: J.K. Ghosh and Sumitra Purkayastha      Pages: 137-144
A sequential version of the Cramér-Rao inequality was obtained by Wolfowitz (1947). The Bhattacharyya inequality (1946, 1947) can be seen as a refinement of the Cramér-Rao inequality. We discuss at length its sequential version as obtained by Seth (1949). We also discuss results of Seth and others, notably Ghosh (1987), on the impossibility of attainment of equality in the sequential Cramér-Rao inequality, where attainment will mean, here and hereafter, attainment for all values of the underlying parameter θ except where it is stated otherwise. We also discuss briefly why sequential estimation remains important, notwithstanding the above non-existence results. Keywords: Bhattacharyya inequality, Cramér-Rao inequality, Sequential estimation.
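For reference, the fixed-sample Cramér-Rao bound and Wolfowitz's (1947) sequential version discussed above can be sketched as follows, for an unbiased estimator of g(θ) with per-observation Fisher information I(θ) and stopping time N:

```latex
% Fixed-sample Cramér-Rao bound for an unbiased estimator \hat{g} of g(\theta):
\operatorname{Var}_\theta(\hat{g}) \;\ge\; \frac{[g'(\theta)]^2}{n\, I(\theta)}
% Wolfowitz's (1947) sequential version replaces the fixed sample size n
% by the expected stopping time E_\theta(N):
\operatorname{Var}_\theta(\hat{g}) \;\ge\; \frac{[g'(\theta)]^2}{E_\theta(N)\, I(\theta)}
```

The non-attainment results of Seth and Ghosh mentioned above concern equality in the second inequality holding for all θ.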
4 Smooth Density Estimation for Length-Biased Data
Author: Yogendra P. Chaubey, Pranab K. Sen and Jun Li      Pages: 145-155
Biased data frequently arise in applications concerning wildlife and human populations, as indicated in the comprehensive article by Patil and Rao (1978). Such data follow densities that are proportional to the original population density. Here we provide a non-parametric estimator of the density that is based on smoothing Cox's (1969) estimator using Poisson weights. The new method, which is appropriate for nonnegative data, is contrasted with some estimators in the literature based on nonparametric kernel smoothing. Based on simulation studies, it is shown that the new estimator fares better in terms of the Mean Integrated Squared Error (MISE) compared to kernel based estimators. The asymptotic consistency and normality of the new estimator are also established under standard regularity conditions. Keywords: Length-biased density, Weighted distribution, Nonparametric density estimation.
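The core construction can be sketched in a few lines: weight each observation by 1/x (Cox's estimator for length-biased data) and smooth the resulting empirical survival function with Poisson weights. This is a minimal sketch under assumed data and an assumed smoothing parameter `lam`; the paper's estimator (and its density version) is more refined.

```python
import math

def cox_survival(data):
    """Cox's (1969) length-bias-corrected empirical survival function:
    observation x_i receives weight proportional to 1/x_i."""
    inv = [1.0 / x for x in data]
    total = sum(inv)
    def S(t):
        return sum(w for x, w in zip(data, inv) if x > t) / total
    return S

def poisson_smooth(S, lam, kmax=200):
    """Smooth a survival function with Poisson weights:
    S_tilde(t) = sum_k S(k/lam) * P(K = k), with K ~ Poisson(lam * t)."""
    def S_tilde(t):
        mu = lam * t
        term = math.exp(-mu)          # Poisson pmf at k = 0
        out = 0.0
        for k in range(kmax):
            out += S(k / lam) * term
            term *= mu / (k + 1)      # recursive pmf update
        return out
    return S_tilde

# Hypothetical length-biased observations.
data = [0.5, 1.0, 1.5, 2.0, 3.0]
S_hat = cox_survival(data)
S_smooth = poisson_smooth(S_hat, lam=20.0)
```

Larger `lam` means less smoothing; the Poisson weights keep all mass on the nonnegative half-line, which is the advantage over symmetric kernels for this kind of data.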
5 Random Design in Regression Models
Author: Moti L. Tiku and Aysen D. Akkaya      Pages: 157-170
In regression models the design variable has traditionally been assumed to be non-stochastic. In most real life situations, however, the design variable is stochastic and, like the response error, has a non-normal distribution. The modified maximum likelihood method is utilized to estimate the unknown parameters in such situations. The resulting estimators are shown to be efficient and robust. A real life example is given. Keywords: Random design, Regression, Non-normality, Least squares, Maximum likelihood, Modified maximum likelihood, Efficiency, Outliers, Inliers, Robustness.
6 A Finite Population Bayes Procedure for Censored Categorical Abundance Data
Author: Mark D. Holland, Glen Meeden and Brian R. Gray      Pages: 171-175
We propose a Bayes procedure for estimating categorical abundance using data that are observed with error from a random sample from a finite population. The procedure is designed to estimate the proportion of sites in a finite population that belong to each abundance category. Royle and Link (2005) proposed a multinomial mixture model to analyze data of this nature. Holland and Gray (2010) demonstrated that category means would exhibit bias when probabilities of correct category classifications vary among sampling units and this heterogeneity is not modeled. Those authors proposed a modification to the multinomial mixture model that allows correct classification probabilities to vary by sampling unit according to a single normal distribution on a common logit scale. Our proposal allows both correct and incorrect classification probabilities to vary by site and does not require strong assumptions about the nature of the heterogeneity in classification probabilities. We analyze submerged aquatic vegetation data collected by the Long Term Resource Monitoring Program and compare our results to those of Holland and Gray (2010). We also provide simulation results to demonstrate the performance of our proposal and associated credible intervals under several prior distributions. Keywords: Categorical abundance, Multinomial mixture, Bayes procedure.
7 Inferences in Longitudinal Mixed Models for Survey Data
Author: Brajendra Sutradhar, R. Prabhakar Rao and V.N. Pandit      Pages: 177-189
In a sample survey based longitudinal set-up, first, a suitable sample of individuals is chosen from a finite or survey population by using an appropriate sampling technique such as two-stage cluster sampling. Second, a response of interest along with a set of multi-dimensional covariates is collected from each individual of the sample over a small period of time. These repeated responses exhibit longitudinal correlations. In addition to the influence of time-dependent covariates, the responses of an individual may further be influenced by an individual random effect. Since the so-called generalized quasi-likelihood (GQL) approach has been shown to be an efficient estimation approach for both regression effects and the random effects variance in an infinite population based longitudinal mixed model (see, for example, Sutradhar et al. 2008), in this paper we demonstrate how to develop sample survey based GQL estimating equations for the estimation of the desired finite population parameters in both linear dynamic and binary dynamic mixed model set-ups. We also illustrate the sampling design weights based GQL estimation methodology by re-analyzing data from the Survey of Labour and Income Dynamics (SLID) of Statistics Canada. Keywords: Consistency, Dynamic dependence parameters, Efficiency, Random effects, Regression effects, Sampling design weights, Variance components.
8 Bayesian Predictive Inference for Benchmarking Crop Production for Iowa Counties
Author: Balgobin Nandram and Ma. Criselda S. Toto      Pages: 191-207
For a long time, satellite and survey data have been used to estimate crop and livestock production at the county level. Typically prediction is required for counties (small areas), and parametric models have been discussed extensively. The main goal in small area estimation is to use models to "borrow strength" from the ensemble because the direct estimates of small area parameters are generally unreliable. But such models are not completely satisfactory. We address two issues concerning these models. First, the combined estimates from all small areas do not usually match the value of the single estimate of the large area, and benchmarking is desirable. Benchmarking is done by applying a constraint that will ensure that the "total" of the small areas matches the "grand total". We use a Bayesian nested error regression model to develop a method to benchmark the finite population means of small areas. Second, it is the practice to assume that the sampling variances are homogeneous, but this may not be the case. Thus, in addition to benchmarking, we also show how to study heterogeneous sampling variances. We apply our method to estimate the number of acres of corn and soybean under cultivation for twelve counties in Iowa. Keywords: Heterogeneous variances, Monte Carlo methods, Nested-error regression model, Posterior propriety, Small area estimation.
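The benchmarking constraint itself is simple: rescale the small-area estimates so they add up to the large-area total. The sketch below shows ratio benchmarking on hypothetical county totals; the paper's Bayesian benchmarking builds this constraint into the posterior rather than applying it after the fact.

```python
def benchmark_ratio(estimates, grand_total):
    """Scale small-area estimates by a common factor so that their
    sum matches the benchmark 'grand total'."""
    g = grand_total / sum(estimates)
    return [e * g for e in estimates]

# Hypothetical model-based county totals and a known state-level total.
est = [90.0, 110.0, 210.0]
bench = benchmark_ratio(est, grand_total=400.0)
# The benchmarked totals sum to 400 while preserving relative sizes.
```

Ratio benchmarking preserves the relative ordering of the areas; other benchmarking schemes (e.g. additive or variance-weighted) distribute the discrepancy differently.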
9 Smooth Estimation of Survival and Density Functions for Stationary Associated Sequences: Some Recent Developments
Author: Yogendra P. Chaubey and Isha Dewan      Pages: 261-272
Consider a sequence of stationary non-negative associated random variables with common marginal density f(x). Here we present a review of recent developments for estimating the density f and the corresponding survival function by smoothing the empirical survival function studied in Bagai and Prakasa Rao (1991). These are contrasted with other estimators available for non-negative i.i.d. data. Keywords: Associated sequence, Asymmetric kernel estimator, Strong consistency, Survival function.
10 Variance Estimation of a Generalized Regression Predictor
Author: Raghunath Arnab, D.K. Shangodoyin and Sarjinder Singh      Pages: 273-288
The generalized regression predictor (GREG) is used for the estimation of a finite population total when the study variable is well related to the auxiliary variable. Särndal (1982) proposed a few estimators for the variance of the GREG. In this paper, we have derived the lower bounds of the variances of estimators of the variance of the GREG belonging to certain classes of estimators under a superpopulation model. The optimal variance estimators attaining the lower bound cannot be used in practice since they involve unknown parameters. Hence, some alternative variance estimators are proposed. Simulation studies reveal that the proposed alternative estimators are more efficient than the existing alternatives proposed by Särndal (1982). Keywords: Generalized regression predictor, Superpopulation model, Variance estimation.
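For readers unfamiliar with the GREG predictor, a minimal sketch follows: the Horvitz-Thompson total of y plus a regression adjustment for the gap between the known auxiliary total and its Horvitz-Thompson estimate. The through-the-origin slope used here is one simple choice; the data are hypothetical.

```python
def greg_total(y, x, pi, X_total):
    """GREG predictor of the population total of y:
    Y_greg = Y_HT + B_hat * (X_total - X_HT),
    with a design-weighted through-the-origin slope B_hat."""
    d = [1.0 / p for p in pi]                      # design weights
    Y_ht = sum(di * yi for di, yi in zip(d, y))    # Horvitz-Thompson totals
    X_ht = sum(di * xi for di, xi in zip(d, x))
    B = sum(di * xi * yi for di, xi, yi in zip(d, x, y)) / \
        sum(di * xi * xi for di, xi in zip(d, x))
    return Y_ht + B * (X_total - X_ht)

# Hypothetical sample where y = 2x exactly: the GREG predictor then
# reproduces 2 * X_total regardless of the sampled units.
y = [2.0, 4.0, 6.0]
x = [1.0, 2.0, 3.0]
pi = [0.1, 0.2, 0.3]
est = greg_total(y, x, pi, X_total=100.0)
```

Estimating the variance of this predictor, and the variance of those variance estimators, is the subject of the paper.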
11 Modeling Unstructured Heterogeneity along with Spatially Correlated Errors in Field Trials
Author: M. Singh, Y.P. Chaubey, A. Sarker and D. Sen      Pages: 313-321
In this paper we consider the analysis of two experimental data sets for evaluating lentil genotypes. One of these data sets comes from an incomplete block design and the other from a complete block design. The incomplete blocks contribute to the reduction of experimental error, and spatially correlated plot-errors can be modeled using an autoregressive scheme that may lead to further improvement in the assessment of the genotypes. Such an approach has been applied in several other studies to model linear trends and spatially correlated errors. However, the assumption of a constant error variance restricts the scope of the analysis in many agricultural field trials, and in other situations in general, where heterogeneity of error variances is a reality. In this study, we approach the problem by first fitting a model with constant error variance and generating the residuals. Using the squared residuals, we apply the K-means clustering technique to group the experimental units with similar squared residuals. Next, we allow the error variances to vary with the group of the experimental units, which does not require any spatial restrictions to model the error variances. The number of heterogeneous error variances and the experimental units belonging to the heterogeneous clusters are obtained using AIC values, followed by a group-merging scheme based on insignificant change in the residual maximum log-likelihood values. The final models with heterogeneous variances were used to evaluate the precision of the genotype mean comparisons. We found a substantial improvement in the efficiency of the pair-wise comparisons over the other ways of analysis. We recommend the application of this procedure in any general situation permitting unstructured heterogeneity. Keywords: Heterogeneous error variances, Spatially correlated errors, Variogram, Clustering, Field trials.
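The grouping step described above (clustering experimental units by their squared residuals) can be sketched with a plain one-dimensional K-means. This is only the clustering ingredient, under hypothetical squared residuals; the paper's full procedure adds AIC-based selection, group merging and REML fitting.

```python
def kmeans_1d(values, k, iters=100):
    """Plain 1-D K-means (k >= 2), used here to group squared
    residuals into candidate variance clusters."""
    vs = sorted(values)
    # Seed centers at evenly spaced order statistics.
    centers = [vs[i * (len(vs) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            j = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[j].append(v)
        new = [sum(g) / len(g) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:            # converged
            break
        centers = new
    labels = [min(range(k), key=lambda c: abs(v - centers[c]))
              for v in values]
    return labels, centers

# Hypothetical squared residuals: two clearly separated variance groups.
sq_resid = [0.1, 0.12, 0.11, 5.0, 5.2, 4.9]
labels, centers = kmeans_1d(sq_resid, 2)
```

Each cluster would then be assigned its own error variance in the refitted model.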
12 Editorial Board
Author: ISAS      Pages: 2
13 Hindi Supplement
Author: ISAS      Pages: 323-330
14 Office Bearers
Author: ISAS      Pages: 1
15 Design, Implementation and Analytical Methods for a Countywide West Nile Virus Seroprevalence Survey
Author: Christopher Kippes and Joseph Sedransk      Pages: 209-218
During 2002 there were 221 confirmed or probable cases of West Nile Virus (WNV) in Cuyahoga County (located in Northeast Ohio), accounting for 71% of all Ohio cases. In December 2002, the public health community of Cuyahoga County conducted a household-based seroprevalence survey designed to estimate focal and county-wide WNV infection rates. In this article the authors provide a detailed description of the field operations used to conduct a countywide serologic survey for WNV and describe the methodology used to obtain and analyze a probability-based sample of households. Field operations were based on an incident command structure (ICS), resulting in the recruitment of over 1,200 eligible participants. Although ICS has not been routinely incorporated into traditional public health investigations, it was successfully implemented in this survey. Additionally, the sampling design used in this survey may be helpful in situations where the characteristic of interest has a small, variable probability of occurrence and it is desired to find a large number of individuals with this characteristic, to permit one to relate local rates to local conditions.
16 Multivariate Directed Inference with Modified Hotelling's T-Squared
Author: John J. Carson Jr. and Arjun K. Gupta      Pages: 219-228
This paper presents an approach to the problem of multivariate directed inference, a generalization of one-sided testing, in the setting of vector and matrix valued elliptically contoured (MEC) random variables. The modified T2 statistic, a modification of Hotelling's T2, is introduced. It gives a sensitive test of positivity in one or more components of a location vector, which is nonparametric over the MEC family. The modified T2 statistic uses the positive part of the sample mean vector or of the difference between a sample mean vector and a reference vector. Other hypotheses, including order restrictions, may be tested by suitably transforming the data. The test is derived from the generalized likelihood ratio test and by the union-intersection principle. Principal properties and the null and power distributions are given. Keywords: Multivariate analysis, Hotelling's T2, One-sided testing, Elliptically contoured distributions, Union-intersection test, Generalized likelihood ratio test.
17 Using the Logistic pdf Model to Mitigate Autocorrelation in Growth Curve Analysis
Author: James H. Matis, Muhammed Jassem Al-Muhammed and Wopke van der Werf      Pages: 229-236
The well-known Verhulst-Pearl model in ecology, y′(t) = (λ − θ·y(t))·y(t), where y(t) denotes current population size, has a solution which may be written in the form of a logistic cumulative distribution function (cdf). This function is widely used to describe population growth curves. However, population growth data are prone to serial correlation, which complicates subsequent statistical inferences. The serial correlation in the data can be mitigated by fitting the first differences of the population data to a model for the rate of population growth. This function is the solution to the alternative mechanistic model y′(t) = (λ − θ·Y(t))·y(t), where Y(t) is the integral of y(s) from 0 to t, and it has the mathematical form of a logistic probability density function (pdf). A biologically meaningful parameterization of the logistic pdf model is provided to facilitate initial estimates for parameters in nonlinear curve fitting. We illustrate the procedure, demonstrating the problem of serial correlation in population data and the effectiveness of the suggested solution, by fitting two classic data sets. Keywords: Verhulst-Pearl model, Cumulative size dependency, Aphid population size model.
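The cdf/pdf relationship the abstract relies on can be demonstrated numerically: generate cumulative counts from a logistic cdf and take first differences, which approximate the scaled logistic pdf that is fitted to growth-rate data. Parameter values below (carrying capacity K, location mu, scale s) are illustrative only.

```python
import math

def logistic_cdf(t, K, mu, s):
    """Cumulative logistic growth curve: population size at time t."""
    return K / (1.0 + math.exp(-(t - mu) / s))

def logistic_pdf(t, K, mu, s):
    """Rate of growth: derivative of the cdf form, i.e. a logistic
    pdf scaled by the carrying capacity K."""
    e = math.exp(-(t - mu) / s)
    return K * e / (s * (1.0 + e) ** 2)

# Hypothetical parameters; first differences of cumulative counts
# approximate the pdf form fitted in the paper's approach.
K, mu, s = 100.0, 10.0, 2.0
ts = [float(t) for t in range(21)]
y = [logistic_cdf(t, K, mu, s) for t in ts]
diffs = [y[i + 1] - y[i] for i in range(len(y) - 1)]
```

The pdf peaks at t = mu with height K/(4s), which gives interpretable starting values for nonlinear fitting of the differenced data.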
18 Other Publications
Author: ISAS      Pages: 1
19 Preface
Author: ISAS      Pages: 2
20 Cover
Author: ISAS      Pages: 2
21 Linear and Nonlinear Approximations of the Ratio of the Standard Normal Density and Distribution Functions for the Estimation of the Skew Normal Shape Parameter
Author: Subir Ghosh and Debarishi Dey      Pages: 237-242
We introduce a linear approximation and a nonlinear approximation of the ratio of the standard normal density and distribution functions in the presence of an unknown constant representing the shape parameter of the skew normal distribution. The purpose of these approximations is to estimate the skew normal shape parameter. We present a new estimation method for the shape parameter based on these approximations. The simulation results demonstrate that the approximations strongly resemble their true values in the regions of interest and that the estimated biases of the shape parameter are small. Keywords: Likelihood, Linear approximation, Nonlinear approximation, Skew normal, Standard normal.
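The ratio in question, φ(x)/Φ(x), is easy to compute and to approximate linearly over a bounded region, as the sketch below shows with an ordinary least-squares line over [-1, 1]. The fitting region and the least-squares form are assumptions for illustration; the paper's approximations may be constructed differently.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ratio(x):
    """The ratio phi(x)/Phi(x) that is being approximated."""
    return phi(x) / Phi(x)

def fit_line(xs, ys):
    """Ordinary least-squares line a + b*x over the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Fit a linear approximation of the ratio over an assumed region [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [ratio(x) for x in xs]
a, b = fit_line(xs, ys)
```

The ratio is strictly decreasing, so the fitted slope is negative; the quality of such an approximation depends on the region of interest.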
22 A Simple Method for Bayesian Robust Estimation
Author: Guan Xing and J. Sunil Rao      Pages: 243-253
We introduce a new Bayesian robust estimation approach to deal with contaminated data. The formulation is based on latent indicator variables which are used to down-weight potential outliers. The posterior distributions (and functionals) of the parameters of interest and the indicator variables are derived using a Gibbs sampler. A diagnostic plot from the posterior distribution of the latent variables provides visual evidence of the relative weights attached to each observation. This approach is simple and rather general in its applicability. We show examples from linear and generalized linear regression, as well as multivariate estimation. Keywords: Bayesian, Robust estimation, Gibbs sampler.
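The latent-indicator idea can be shown with a toy Gibbs sampler for a robust mean: each observation carries an indicator for membership in an inflated-variance (outlier) component, so flagged points get less influence on the mean. Everything here (the two-component normal model, fixed sigma2, k and p, and the data) is an illustrative assumption in the spirit of the paper, not its exact model.

```python
import math
import random

def gibbs_robust_mean(y, sigma2=1.0, k=10.0, p=0.1, iters=2000, seed=1):
    """Toy Gibbs sampler: y_i ~ N(mu, sigma2) with prob 1-p, or
    N(mu, k*sigma2) with prob p. Latent z_i = 1 flags an outlier.
    Returns the posterior mean of mu and the outlier probabilities."""
    rng = random.Random(seed)
    n = len(y)
    mu = sum(y) / n
    z = [0] * n
    mu_draws, z_sums = [], [0.0] * n
    for it in range(iters):
        # Sample each indicator given mu (posterior odds of the two
        # components at the current residual).
        for i in range(n):
            r2 = (y[i] - mu) ** 2
            w0 = (1 - p) * math.exp(-0.5 * r2 / sigma2) / math.sqrt(sigma2)
            w1 = p * math.exp(-0.5 * r2 / (k * sigma2)) / math.sqrt(k * sigma2)
            z[i] = 1 if rng.random() < w1 / (w0 + w1) else 0
        # Sample mu given the indicators (flat prior): precision-weighted.
        prec = [1.0 / (k * sigma2) if zi else 1.0 / sigma2 for zi in z]
        m = sum(pi * yi for pi, yi in zip(prec, y)) / sum(prec)
        mu = rng.gauss(m, 1.0 / math.sqrt(sum(prec)))
        if it >= iters // 2:          # keep draws after burn-in
            mu_draws.append(mu)
            for i in range(n):
                z_sums[i] += z[i]
    post_mu = sum(mu_draws) / len(mu_draws)
    post_z = [s / len(mu_draws) for s in z_sums]
    return post_mu, post_z

# Hypothetical data: nine points near 0 and one gross outlier.
data = [-0.3, 0.1, -0.1, 0.2, 0.0, 0.3, -0.2, 0.1, 0.05, 10.0]
post_mu, post_z = gibbs_robust_mean(data)
```

Plotting `post_z` against the observation index is exactly the kind of diagnostic plot the abstract describes: the outlier's posterior indicator probability sits near 1.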
23 Variance Estimation for the Regression Estimator of the Mean in Stratified Sampling
Author: Sat Gupta and Javid Shabbir      Pages: 255-260
In this paper, we propose a class of estimators for the variance of the separate regression estimator of the mean in stratified sampling and derive its properties under a large sample approximation. The proposed class of estimators performs better than the traditional regression estimator and the Wu (1985) estimator. Mean square errors of the different estimators are also compared numerically using three data sets from the literature. Keywords: Separate regression estimator, Stratification, Bias, Mean square error, Efficiency.
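For context, the separate regression estimator whose variance is being estimated combines a per-stratum regression adjustment with the stratum weights. A minimal sketch with hypothetical strata:

```python
def separate_regression_mean(strata, X_bars, W):
    """Separate regression estimator of the population mean:
    ybar_lr = sum_h W_h * (ybar_h + b_h * (Xbar_h - xbar_h)),
    with a per-stratum least-squares slope b_h."""
    est = 0.0
    for (y, x), Xbar, Wh in zip(strata, X_bars, W):
        n = len(y)
        ybar, xbar = sum(y) / n, sum(x) / n
        b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
            sum((xi - xbar) ** 2 for xi in x)
        est += Wh * (ybar + b * (Xbar - xbar))
    return est

# Hypothetical strata with y = 3 + 2x exactly, stratum means of x
# known, and stratum weights W summing to 1.
strata = [([5.0, 7.0, 9.0], [1.0, 2.0, 3.0]),
          ([13.0, 15.0], [5.0, 6.0])]
est_mean = separate_regression_mean(strata, X_bars=[2.5, 5.5], W=[0.6, 0.4])
```

When the within-stratum relationship is exactly linear, as in this toy example, the estimator reproduces the true mean; the paper studies how best to estimate its variance.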
24 On Some Aspects of Regression in Social Network Models
Author: Bikas K. Sinha      Pages: 289-293
This paper deals with some probabilistic aspects of social networks, viewed as directed graphs or digraphs. The vertices are connected by one-way or two-way edges. A simple model for the random generation of such edges between pairs of vertices is discussed. The concepts of out-degrees and in-degrees of vertices are mentioned. The total of the out-degree and in-degree of a vertex may be regarded as the "total number of moves" experienced by that vertex. Distributional aspects of these vertex-oriented random variables are discussed with special reference to the nature of regression. Keywords: Regression, Digraph, Directed graph, Out-degree, In-degree, Density.
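A simple random-edge model of the kind described can be simulated directly: for each pair of vertices, generate no edge, a one-way edge in a random direction, or a two-way edge, then tally out-degrees, in-degrees and their total. The probabilities `p_one` and `p_two` below are illustrative assumptions.

```python
import random

def random_digraph(n, p_one=0.2, p_two=0.1, seed=7):
    """For each unordered pair of vertices, independently generate
    no edge, a one-way edge (random direction), or a two-way edge;
    return the out-degree and in-degree of every vertex."""
    rng = random.Random(seed)
    out_deg = [0] * n
    in_deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            u = rng.random()
            if u < p_two:                       # two-way edge
                out_deg[i] += 1; in_deg[i] += 1
                out_deg[j] += 1; in_deg[j] += 1
            elif u < p_two + p_one:             # one-way, random direction
                if rng.random() < 0.5:
                    out_deg[i] += 1; in_deg[j] += 1
                else:
                    out_deg[j] += 1; in_deg[i] += 1
    return out_deg, in_deg

out_deg, in_deg = random_digraph(30)
# "Total number of moves" of a vertex: out-degree plus in-degree.
total_moves = [o + i for o, i in zip(out_deg, in_deg)]
```

A useful invariant for checking such simulations: every directed edge contributes one out-degree and one in-degree, so the two degree totals always agree.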
25 Dialectical Estimation
Author: Prem Narain      Pages: 295-301
A novel approach of dialectical estimation is discussed, in which averaging two somewhat contradictory estimates from the same individual reduces error by more than the reduction in sampling error expected when two estimates from the same individual are obtained on the basis of an internal probability distribution. Such procedures are useful in judgment sampling, where a single judge is available for evaluation and his subjective assessment is affected by systematic error in addition to random error. Hegelian dialectics is the basis of such a strategy, and it can lead to a series of dialectical averages that tend to the true value of the characteristic under investigation. Keywords: Hegelian dialectics, Dialectical estimation, Judgment sampling, Internal probability distribution, Date-estimation experiment, Mental tool.
26 Linear Integer Programming Approach to Construct Distance Balanced Sampling Plans
Author: B.N. Mandal, Rajender Prasad and V.K. Gupta      Pages: 303-312
Distance balanced sampling plans (DBSP) are a class of sampling plans in which the second order inclusion probabilities are a non-decreasing function of the distance between the population units. DBSP were introduced by Mandal et al. (2009) as a generalization of balanced sampling plans excluding adjacent units. In this article, a general w-point DBSP (w = 1, 2, ..., [N/2], where N is the population size and [x] denotes the largest integer contained in x) is introduced and a method of construction of w-point DBSP using linear integer programming is proposed. The method is general in nature, and two-point, three-point, ..., [N/2]-point and many other DBSPs, simple random sampling without replacement, balanced sampling plans excluding contiguous units (Hedayat et al., 1988) and balanced sampling plans excluding adjacent units (Stufken, 1993) fall out as particular cases. A list of [N/2]-point DBSP for sample size three is obtained for population sizes N ≤ 100, where N is odd. Keywords: Balanced sampling plans, Distance balanced sampling plans, Distance balanced incomplete block designs, Linear integer programming.