Journal Volume: 66      No.: 1     Year: 2012
S.No Title Abstract
1 Semiparametric Fay-Herriot Model using Penalized Splines
Author: C. Giusti, S. Marchetti, M. Pratesi and N. Salvati      Pages: 1-14
In this paper we propose a semiparametric Fay-Herriot area-level model based on P-splines, which can handle situations where the functional form of the relationship between the variable of interest and the covariates is unknown. This is often the case when the data are supposed to be affected by spatial proximity effects. In these cases P-spline bivariate smoothing can easily introduce spatial effects in the area-level model. Through this spatial effect we can obtain estimates for out-of-sample areas and also for those areas where auxiliary information is unavailable. We focus here on the small area mean estimator and on analytic and bootstrap-based mean squared error estimators. The proposed estimators of the small area means and mean squared errors are contrasted with the traditional ones by means of two simulation studies. We finally present results of the application of our semiparametric model to estimate the mean of the Acid Neutralizing Capacity (ANC) and Calcium (CA) concentration in streams for each 8-digit Hydrologic Unit Code (HUC) within the Mid-Atlantic region of the US. ANC and CA concentration represent two of the key indicators to keep under control for environmental protection and preservation of natural resources. These results present evidence that the proposed estimators can be used to obtain accurate estimates in those areas where direct estimates are unreliable or even unavailable. Keywords : Small area methods, Semiparametric models, Bootstrap methods, Environmental data.
2 Small Area Estimation under Additivity Constraints to Published Direct Survey Estimates
Author: Daniel Elazar      Pages: 15-30
Users of small area estimates (SAEs) produced by national statistical offices often require that published SAEs are coherent with survey estimates or benchmark totals published at national or state levels. In some countries, this requirement is mandated by legislation. Ensuring the coherence of SAEs with published statistics not only helps reassure users about the quality of the SAEs, it can also help correct for model misspecification bias. In previous small area applications carried out at the Australian Bureau of Statistics (ABS), coherence with broader level published estimates was achieved after the small area estimation process by using a technique such as iterative proportional fitting. However, the disadvantages of this approach were firstly, the lack of integration with the small area estimation process itself, and secondly, the fact that the mean square error estimates did not take proper account of the constraints imposed. Keywords : Small area estimation, Logistic binomial, Random effects, Benchmarks, Constrained regression, Lagrange multipliers.
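The abstract above names iterative proportional fitting as the post-hoc technique previously used to force small area estimates to agree with published totals. As a minimal sketch only (not the ABS implementation; the function name `ipf` and the two-way layout of areas by categories are assumptions made for illustration), the idea can be written as alternating row and column rescaling:

```python
def ipf(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale a two-way table of positive seed estimates until its
    row and column sums match the supplied benchmark totals.

    Hypothetical helper for illustration; assumes the row targets and
    column targets add up to the same grand total.
    """
    cells = [[float(v) for v in row] for row in table]
    for _ in range(max_iter):
        # Row step: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(cells[i])
            if total > 0:
                cells[i] = [v * target / total for v in cells[i]]
        # Column step: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in cells)
            if total > 0:
                for row in cells:
                    row[j] *= target / total
        # Columns now match exactly; stop once the rows also match.
        gap = max(abs(sum(cells[i]) - t) for i, t in enumerate(row_targets))
        if gap < tol:
            break
    return cells


# Small area estimates (rows = areas, columns = categories) adjusted
# to made-up published margins.
adjusted = ipf([[10, 20], [30, 40]], row_targets=[40, 60], col_targets=[35, 65])
```

Because the adjustment happens after estimation, nothing in this sketch feeds back into the model or its mean squared error estimates, which is the disadvantage the abstract points out.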
3 Allocating a Limited Budget to Small Areas
Author: Nicholas T. Longford      Pages: 31-41
We consider the general problem of allocating funds from a fixed budget to the districts of a country according to the values of a district-level indicator. The obvious approach might seem to be to efficiently estimate the relevant district-level quantities, and then to apply the allocation scheme that would be optimal if these quantities were equal to the estimates. We show that such a two-stage strategy is suboptimal. By a simulation approach, we find allocation schemes that are superior to this strategy. We offer no single universal solution, but motivate the results by intuition. We discuss the implications of our finding on the separation of the remits of a statistical agency and its client. Keywords : Composite estimation, Optimal allocation, Shrinkage, Small-area estimation.
4 Use of Spatial Information in Small Area Models for Unemployment Rate Estimation at Sub-provincial Areas in Italy
Author: Michele D'Alò, Loredana Di Consiglio, Stefano Falorsi, M. Giovanna Ranalli and Fabrizio Solari      Pages: 43-53
The goal of this paper is to analyze the possibility of improving the performance of estimation at the sub-regional level from the ISTAT Labour Force Survey. In particular, we refer to estimation of unemployment rates for small domains cutting across survey strata, i.e. Local Labour Market Areas, defined as aggregations of municipalities. Currently, such quantities are estimated by means of an EBLUP based on a linear mixed model with spatially correlated area effects and covariates given by the area level unemployment rate at the previous census and sex by age classes. In this work we explore the use of alternative models to incorporate spatial information at different levels. In particular, we investigate the use of different distance measures in the correlation structure among small areas. In addition, additive models are employed to include spatial information at the municipality level using low-rank thin plate splines. Finally, small area estimators based on logistic (mixed) models are explored to account more properly for the binary nature of the response variable. Spatial information is included in this type of model too. The performance of the aforementioned methods is studied via simulation experiments on 2001 Census data. Keywords : Labour force survey, Mixed effects models, Generalized additive models, Thin plate splines, Distance measures.
5 Hierarchical Bayes Small Area Estimates of Adult Literacy using Unmatched Sampling and Linking Models
Author: Leyla Mohadjer, J.N.K. Rao, Benmei Liu, Tom Krenzke and Wendy Van de Kerckhove      Pages: 55-63
Funded by the National Center for Education Statistics, the National Assessment of Adult Literacy (NAAL) was designed to measure the English literacy skills of adults in the U.S. based on an assessment containing a series of literacy tasks completed by sampled adults. Sufficiently precise direct estimates have been produced for the nation and major domains of interest, using the NAAL data. However, policymakers and researchers/business leaders often need literacy information for states and counties but these areas do not have large enough samples to produce reliable estimates. Therefore, small area estimation techniques are used to produce model-based indirect estimates of literacy levels for all states and counties in the nation. This paper describes the Hierarchical Bayes estimation techniques used to produce both county and state estimates, and credible intervals, based on a single area-level linking model. Keywords : Indirect state and county estimates, Credible intervals, Variance smoothing, Predictor variable selection.
6 County Level Estimation using Data from the U.S. National Resources Inventory
Author: Pushpal K. Mukhopadhyay, Tapabrata Maiti and Wayne A. Fuller      Pages: 65-74
We use a transformed Fay-Herriot model to estimate wind erosion at the county level in Iowa. A soil erodibility index is available from administrative records for each county in Iowa and is used to form the predictor. The response variable is the soil loss due to wind as recorded in the 2002 National Resources Inventory. We propose bias-corrected and calibrated small area predictors such that the weighted sum of the county predictors matches the state level direct estimate. The standard errors are estimated by using a parametric double-bootstrap method. The small area predictions have estimated coefficients of variation that average about three fifths of those of the direct estimates. Keywords : Small area estimation, Wind erosion, Calibration, Double-Bootstrap, Soil erodibility index.
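The calibration constraint in the abstract above (the weighted sum of the county predictors must equal the state level direct estimate) can be illustrated with a simple ratio adjustment. This is only a sketch of the constraint under stated assumptions: the paper's own calibrated predictors may be constructed differently, and the function name, weights and numbers below are hypothetical.

```python
def ratio_benchmark(predictions, weights, state_estimate):
    """Scale county predictions by a common factor so that their
    weighted sum equals the state level direct estimate.

    Hypothetical helper: a plain ratio adjustment, not necessarily the
    calibration used in the paper.
    """
    weighted_sum = sum(w * p for w, p in zip(weights, predictions))
    if weighted_sum == 0:
        raise ValueError("weighted sum of predictions is zero; cannot rescale")
    factor = state_estimate / weighted_sum
    return [factor * p for p in predictions]


# Three made-up county predictions with made-up county weights.
counties = ratio_benchmark([2.0, 4.0, 6.0], [0.5, 0.3, 0.2], state_estimate=3.74)
```

A common multiplicative factor preserves the ranking of the counties and changes each prediction by the same relative amount.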
7 Two Area-Level Time Models for Estimating Small Area Poverty Indicators
Author: M.D. Esteban, D. Morales, A. Perez and L. Santamaria      Pages: 75-89
This paper deals with small area estimation of poverty indicators. Small area estimators of these quantities are derived from time-dependent area-level linear mixed models. As appropriate auxiliary variables are not always available in the survey data on living conditions, the proposed models using only aggregated data are a good alternative to the unit-level models. The mean squared errors are estimated by explicit formulas. Two simulation experiments designed to analyze the behavior of the introduced estimators are carried out. An application to real data from the Spanish Living Conditions Survey is also given. Keywords : Area-level models, Small area estimation, Time dependency, Poverty indicators.
8 A Modeling Approach for Uncertainty Assessment of Register-based Small Area Statistics
Author: L.C. Zhang and J. Fosen      Pages: 91-104
Statistical registers have great potential when it comes to producing statistics at detailed spatial-demographic levels. However, population totals based on statistical registers are subject to random variations that exist in the target population as well as errors that are associated with the registration (or measurement) process. While the former account for heterogeneity across the areas (or domains), i.e. genuine "signals" of interest, the latter are merely "noise" in measurement. We propose a model-based sensitivity analysis approach, which allows us to distinguish between the different sources of randomness in the data, by which means the strength of the signals can be assessed against the noise. The data from the Norwegian Employer/Employee register are used to demonstrate the existence of measurement noise in administrative data sources, and to illustrate the proposed approach. We believe that both the conceptualization of the random nature of the register data and the sensitivity analysis approach can be useful for assessing detailed statistics produced from statistical registers on various subjects. Keywords : Modeling, Register-based statistics, Measurement errors, Sensitivity analysis.
9 Small Area Estimation in Practice: An Application to Agricultural Business Survey Data
Author: Nikos Tzavidis, Ray Chambers, Nicola Salvati and Hukum Chandra      Pages: 213-228
This paper describes an application of small area estimation (SAE) to agricultural business survey data. Both well known small area estimators, such as the empirical best linear unbiased predictor (EBLUP), and more recently proposed small area estimators, for example, the M-quantile, the robust EBLUP and the Model Based Direct estimators are considered. Mean squared error estimation is discussed. Using a real agricultural business survey dataset, we place emphasis on model diagnostics for specifying the small area working model, on diagnostic measures for validating the reliability of direct and indirect (model-based) small area estimators and on providing practical guidelines to the prospective user of small area estimation techniques. Keywords : Diagnostics, Direct estimator, Model-based estimation, Outlier robustness, MSE estimation.
10 Back Titles
Author: ISAS      Pages: 4
11 Hindi Supplement
Author: ISAS      Pages: 229-237
12 Other Publications
Author: ISAS      Pages: 1
13 Preface
Author: ISAS      Pages: 1
14 Fast EB Method for Estimating Complex Poverty Indicators in Large Populations
Author: Caterina Ferretti and Isabel Molina      Pages: 105-120
This paper studies small area estimation of computationally complex poverty indicators; more concretely, we study fuzzy monetary and fuzzy supplementary poverty indicators. These two indicators do not need a poverty line to be set because they are based on the degree of poverty of each individual relative to the population to which the individual belongs. Moreover, the latter takes into account the non-monetary and multidimensional nature of poverty. For this, a faster version of the empirical best/Bayes (EB) method of Molina and Rao (2010) is proposed. This new method allows feasible estimation of computationally complex indicators in large populations, and can still considerably reduce the computation time when the original EB method is feasible. In simulations, the proposed fast EB method is compared with the original EB method when estimating the mentioned indicators along with the poverty incidence in small areas. Results show negligible loss of efficiency of the fast EB method as compared to the original one, while allowing estimation of complex indicators that require sorting all population elements. The method is applied to the estimation of poverty indicators in the region of Tuscany, both at province and municipality levels, using data from the Italian Survey on Income and Living Conditions. Keywords : Empirical best estimator, Fuzzy poverty measures, Small area estimation.
15 Inferences on Small Area Proportions
Author: Shijie Chen and P. Lahiri      Pages: 121-124
Design-based methods are generally inefficient for making inferences about small area proportions for rare events. In this paper, we discuss an alternative hierarchical model and the associated hierarchical Bayes methodology. Sufficient conditions for propriety of the posterior distributions of relevant parameters are presented. Keywords : Credible interval, MCMC, Rare event.
16 Small Area Poverty Estimation by Model Calibration
Author: Risto Lehtonen and Ari Veijanen      Pages: 125-133
Calibration techniques using auxiliary data offer efficient tools for design-based estimation of population totals and means. In linear or model-free calibration, the weights are calibrated to reproduce the known population totals of the auxiliary variables. A key property of model calibration is that the weights are calibrated to the population total of the predictions derived via a specified model. We introduce model calibration methods for estimation of poverty rate for domains and small areas and present some new semi-direct and semi-indirect calibration estimators. They benefit from spatial correlations of variables in a hierarchy of regions or spatial neighbourhoods. Our study variable is binary and we use logistic mixed models under unequal probability sampling. The properties (design bias and accuracy) of the estimators are compared with generalized regression estimators and Horvitz-Thompson type estimators by using simulation experiments with unit-level register data of Statistics Finland. Keywords : Small area estimation, Poverty rate, Spatial statistics, Mixed models.
17 Small Area Methods for Agricultural Data: A Two-Part Geoadditive Model to Estimate the Agrarian Region Level Means of the Grapevines Production in Tuscany
Author: C. Bocci, A. Petrucci and E. Rocco      Pages: 135-144
In applications involving agricultural data, it is common to encounter semicontinuous variables that have a portion of values equal to zero and a continuous, often skewed, distribution among the remaining values. Moreover, these variables often show a spatial pattern. We develop a two-part geoadditive small area model that can deal with these issues. In particular, we are interested in predicting the mean of a target variable with these characteristics for a collection of subsets of the population. Direct estimation using only the survey data is inappropriate as it yields estimates with unacceptable levels of precision. A study of the Tuscan Agrarian Region (Italy) level means of the grapevines production illustrates this method. Keywords : Generalized linear mixed model, Penalized splines, Semicontinuous data, Spatial dynamics, Zero-inflated data.
18 On the Influence of Sampling Design on Small Area Estimates
Author: Ralf T. Münnich and J. Pablo Burgard      Pages: 145-156
Recent advances in small area statistics applications have raised the question of the influence of sampling designs on model-based estimates. On the one hand, weighting was introduced in the modelling (cf. You and Rao 2002). On the other hand, Gelman (2007) argues that sampling designs with highly variable design weights should be avoided in order to support statistical modelling and especially Bayesian modelling. Keywords : Complex sampling designs, Monte Carlo study, Design effects, Gelman factor, Small area estimation.
19 Assessment of Zeroes in Survey-Estimated Tables via Small-Area Confidence Bounds
Author: Eric V. Slud      Pages: 157-169
Motivated by the problem of "quality filtering" of estimated counts in U.S. American Community Survey tables, this paper studies methods for placing confidence bounds on zero estimates within demographically cross-classified tables which are estimated from complex surveys. While Coefficients of Variation are generally used in screening the quality of estimated counts, they do not make sense for assessing validity of zero counts. The problem of assessment is formulated here in terms of (upper) confidence bounds for unknown proportions. After summarizing published methods of constructing confidence intervals for proportions based on survey data, we study methods of creating confidence bounds from small-area models including synthetic, logistic, and variance-stabilized (arcsin square root transformed) linear models. The relations between these models and the confidence bounds they generate will be illustrated on demographic (Age-Race-Sex) American Community Survey tables from 2009 data for large (population at least 65,000) U.S. Counties. Keywords : arcsin square-root transformation, Confidence bounds, Effective sample size, Fay-Herriot model, Quality filtering, Synthetic model.
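To see why an upper confidence bound is still informative when the estimated count is zero, consider the textbook exact binomial bound: if zero events are seen in n independent trials, the largest proportion p consistent with that outcome at level alpha solves (1 - p)^n = alpha. The sketch below is not one of the paper's small-area bounds; the function name is hypothetical, and plugging an effective sample size into the binomial formula is an assumption made purely for illustration.

```python
def zero_count_upper_bound(n_eff, alpha=0.05):
    """One-sided exact binomial upper bound for a proportion when the
    observed count is zero: solves (1 - p) ** n_eff == alpha for p.

    Hypothetical helper; using an effective sample size in place of the
    nominal sample size is a heuristic assumption, not the paper's method.
    """
    if n_eff <= 0:
        raise ValueError("effective sample size must be positive")
    return 1.0 - alpha ** (1.0 / n_eff)


# With an effective sample size of 100, a zero count is compatible with
# a true proportion of roughly 3 percent at the 95 percent level.
bound = zero_count_upper_bound(100)
```

The bound shrinks roughly like 3/n for alpha = 0.05, so a zero estimate from a small effective sample says little about the true proportion.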
20 Small Area Estimation for Policy Development: A Case Study of Child Undernutrition in Ghana
Author: Fiifi Amoako Johnson, Hukum Chandra, James J. Brown and Sabu S. Padmadas      Pages: 171-186
The demand for Small (local-level) Area Statistics has increased tremendously, particularly in countries where a decentralised approach to governance and service provision has been adopted. Most of these countries lack local-level statistics to aid policy decisions and planning. Sample surveys such as the Demographic and Health Survey provide a wide range of invaluable data at the national and regional level but cannot be used directly to produce reliable district-level estimates due to small sample sizes. This paper illustrates the application of Small Area Estimation (SAE) techniques to derive model-based district-level estimates of child undernutrition in Ghana linking data from the 2003 Ghana Demographic and Health Survey (GDHS) and the 2000 Ghana Population and Housing Census (GPHC). The diagnostic measures show that the model-based estimates are robust when compared to the direct survey estimates. The model-based estimates reveal considerable heterogeneity in the prevalence of undernutrition, with children living in the Northern part of the country being most disadvantaged. The estimates clearly highlight the districts where targeted child health interventions need to be strengthened. In countries where small area statistics are non-existent, SAE techniques could be crucial for designing effective policies and strengthening local-level governance. Keywords : Small area estimation, Child undernutrition, Ghana, Demographic and Health Survey, Population and Housing Census, Policy, Stunting, Underweight.
21 Labour Force Status Estimates under a Bivariate Random Components Model
Author: Ayoub Saei and Alan Taylor      Pages: 187-201
Models for multi-category response data, such as labour force status, based on a generalized linear model specification typically allow the regression coefficients to vary with the response category. However, they can be extended to random components by allowing the area random effects to depend on the response category. In this paper, we describe a multinomial linear mixed model with a bivariate random component for estimating totals of the inactive, unemployed and employed people at Local Authority District (LAD) level. The random effects are assumed to follow a bivariate normal distribution. The model parameters including variance components and correlation coefficient are estimated by maximum and residual maximum likelihood methods. The estimated parameters and predicted values of the LAD (area) random effects are then used in calculating the empirical best linear unbiased-type estimates. The mean squared error estimates are obtained by using an analytical approximation approach. The application is to the UK LFS data in Molina et al. (2007) and estimates are compared with the results in that paper. A simulation study demonstrates good performance of the proposed model. Keywords : Bivariate, Category-specific, LFS, Maximum likelihood, Multinomial, Random component, REML.
22 Practical Guidelines for Design and Analysis of Sample Surveys for Small Area Estimation
Author: Stephen Haslett      Pages: 203-212
This paper provides practical guidelines for the design and analysis of sample surveys that are to be used for small area estimation using regression type methods. It is based on the author's experience using small area estimation in a range of studies including small area modeling of employment and unemployment, small area estimation of poverty and small domain estimation of ethnicity, in a range of countries including USA, UK, Bangladesh, Philippines, Nepal, Cambodia, and New Zealand, and for feasibility studies in Bhutan and Timor-Leste. The importance of recognising at design stage that one of the uses of the survey data will or may be small area estimation, and identifying all the parameters that will require estimation (including variance components) are discussed, as are issues of clarity of aim, data availability, and model choice at the analysis stage. Keywords : Data cleaning, Feasibility assessment, Modelling, Multiple goals, Poverty mapping, Variance components.