Estimating the number of people eligible for health service use

https://doi.org/10.1016/S0149-7189(02)00002-2

Abstract

In this journal Dewit and Rush [Evaluat Prog Plan 19 (1996) 41–64] suggested that truncated Poisson (tP) estimators could be used to estimate the size of a potential clientele of a locally operating health service. The estimates may be used for evaluative purposes, e.g. unmet needs assessment and health service performance evaluation. Here, we illustrate how tP-estimators can be applied. We used two estimators [Chao, Biometrics 43 (1987) 783–791; Zelterman, J Stat Plan Infer 18 (1988) 225–237] to estimate the total number of potential clients making use of a facility for homeless people. Data-collection was carried out within a single week. Only the frequency of visits by homeless people to the facility was used in the analysis. A number of assumptions must hold for the estimates to be valid. We show how some of the assumptions can be checked.

Introduction

Utrecht is a Dutch city with 230,000 inhabitants. In this city, a converted city bus is operated where free breakfasts, late suppers and blankets are on offer for homeless people. This facility, called the Tussenbus, is open when other day and night facilities for homeless people are closed. On average, the Tussenbus receives 50–60 visitors in early mornings and 30–40 visitors at night. However, what is not known is the total number of potential clients. Consequently, it is not known how successful the Tussenbus is in reaching homeless people. The objective of this study is to estimate the total number of potential clients and to compare this to the number actually reached. The City Council of Utrecht required this information for their decision to continue or discontinue financing the Tussenbus.

We present this study as a case because it illustrates in a concise way what can be achieved with a statistical technique called truncated Poisson (tP) modeling. Like other capture–recapture (CRC) techniques (Dewit & Rush, 1996), tP-models can be used to estimate the unknown size of a hidden population such as homeless people, criminals, prostitutes, and drug addicts. This can be done in the absence of a sample-frame, or when a community-based survey would be too costly because the population of interest is relatively small compared to the larger population (e.g. the number of HIV-infected people within the general population). CRC and tP-models are, of course, not restricted to homeless people, but can also be useful in fields like criminology (e.g. estimating the number of criminals in a community) and epidemiology (e.g. estimating the number of HIV-infected people). Information on the size of a population can be useful for a number of reasons, e.g. unmet needs assessment, allocation of resources, agenda setting, and health service performance evaluation. However, most CRC-techniques require two or more samples and come with a series of assumptions that are difficult to meet, whereas tP-models are computationally easy, use only one sample, and are based on relatively few assumptions, most of which can be checked or handled data-analytically. These features of the tP-model will be illustrated and explained in this paper.

There is a whole array of CRC-techniques to estimate the unknown size of a population; see Pollock (1991) and Seber (1982, 1986, 1992) for general reviews, Wickens (1993) for a review with applications to drug-using populations, and Hook and Regal (1995) for epidemiological applications. Usually, CRC-techniques require two or more partially overlapping, preferably independent registers of the population of interest. Other estimators only need data from a single register (for a review, see Wilson & Collins, 1992). These estimators are commonly known as tP-models. We will employ one tP-estimator which was proposed by Anne Chao in 1987 and another derived by Daniel Zelterman in 1988. Since these estimators make use of only a single register, they are particularly useful for estimating the size of the clientele of a single service provider.

To our knowledge, tP-estimators have not been used to estimate the size of a homeless population (Darcy and Jones, 1975; Fisher et al., 1994; Koegel et al., 1996; Shaw et al., 1996), but their undemanding data requirements and their relatively relaxed assumptions warrant a much stronger interest. In Section 2, we will describe how the data were collected and how the analysis was carried out. Results will be given and discussed with respect to the underlying assumptions of the tP-model. This paper will be concluded with some notes on lessons that were learned in this and related studies.

Method

During the data-collection week, the number of visits made by each visitor was tallied. To ensure that every visit made by each individual was counted, all visitors were approached. They were approached individually and at a convenient moment and then requested to give their date of birth and first two letters of their surname. In addition, the sex of each visitor was recorded. From the tally, a frequency distribution was obtained of the number of people who were 1,2,…,K time visitors, with the
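The tallying step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (date of birth, first two letters of the surname, sex) follows the description in the text, but the function name `visit_frequencies` and all example records are hypothetical and are not taken from the study.

```python
from collections import Counter


def visit_frequencies(visit_log):
    """Turn a log of visits into a frequency-of-frequencies table.

    visit_log: iterable of (date_of_birth, surname_prefix, sex) tuples,
    one tuple per recorded visit. The tuple serves as a visitor identifier.
    Returns a Counter mapping k -> number of people seen exactly k times.
    """
    visits_per_person = Counter(visit_log)        # identifier -> number of visits
    return Counter(visits_per_person.values())    # k -> number of k-time visitors


# Hypothetical records, invented purely for illustration.
log = [
    ("1960-03-14", "JA", "M"),
    ("1960-03-14", "JA", "M"),
    ("1972-11-02", "DE", "F"),
    ("1958-07-21", "VA", "M"),
    ("1958-07-21", "VA", "M"),
    ("1958-07-21", "VA", "M"),
]

freq = visit_frequencies(log)
n_observed = sum(freq.values())  # number of different visitors observed
```

Using the combined identifier as a dictionary key treats two visits as the same person only when all three fields match, which is an assumption of this sketch rather than a documented matching rule.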

Results

In the week in which data-collection was carried out we tallied 162 different visitors to the Tussenbus. Of these, 63 were seen once, 20 twice and the remainder were seen three times or more. Under Chao's model, we estimated the total number of the clientele as 261 within a 95% confidence interval ranging from 213 to 356. Under Zelterman's model we obtained est(N)=345 within a 95% CI of 278–455. The confidence intervals show overlap and therefore we conclude that both estimators are in
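The point estimates reported above can be reproduced from the three counts given in the text (162 observed visitors, 63 seen once, 20 seen twice). The sketch below assumes the commonly cited forms of the two estimators: Chao's estimate as n + f1²/(2·f2), and Zelterman's as n / (1 − exp(−2·f2/f1)). The function names are ours, and the confidence intervals are not reproduced because the variance formulas are not given in this excerpt.

```python
import math


def chao_estimate(n, f1, f2):
    """Chao-type estimate: observed count plus estimated unseen count f1^2 / (2 * f2)."""
    return n + f1 ** 2 / (2 * f2)


def zelterman_estimate(n, f1, f2):
    """Zelterman-type estimate: n divided by the estimated probability of being seen.

    The Poisson rate is estimated from the first two frequencies as 2 * f2 / f1.
    """
    rate = 2 * f2 / f1
    return n / (1 - math.exp(-rate))


# Counts reported in the text.
n, f1, f2 = 162, 63, 20

chao = chao_estimate(n, f1, f2)            # about 261.2
zelterman = zelterman_estimate(n, f1, f2)  # about 344.7

print(round(chao), round(zelterman))  # prints "261 345"
```

Both estimators use only the numbers of one-time and two-time visitors, which is why the counts for three or more visits are not needed here.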

References (18)

  • D.J. Dewit et al. Assessing the need for substance abuse services: A critical review of needs assessment models. Evaluation and Program Planning (1996)
  • D. Zelterman. Robust estimation in truncated discrete distributions with application to capture–recapture experiments. Journal of Statistical Planning and Inference (1988)
  • Bustami, R., Van der Heijden, P., Van Houwelingen, H., Engbersen G. (2001). Point and interval estimation of the...
  • A. Chao. Estimating the population size for capture–recapture data with unequal catchability. Biometrics (1987)
  • A. Chao. Estimating population size for sparse data in capture–recapture experiments. Biometrics (1989)
  • L. Darcy et al. The size of the homeless men population of Sydney. Australian Journal of Social Issues (1975)
  • N. Fisher et al. Estimating the numbers of homeless and homeless mentally ill people in north east Westminster by using capture–recapture analysis. British Medical Journal (1994)
  • M. Frischer et al. A comparison of different methods for estimating the prevalence of problematic drug misuse in Great Britain. Addiction (2001)
  • E.B. Hook et al. Capture–recapture methods in epidemiology: Methods and limitations. Epidemiologic Reviews (1995)
There are more references available in the full text version of this article.

