Elsevier

Health & Place

Volume 29, September 2014, Pages 79-83
A call for caution and transparency in the calculation of land use mix: Measurement bias in the estimation of associations between land use mix and physical activity

https://doi.org/10.1016/j.healthplace.2014.06.002

Abstract

There is evidence that land use mix based on the Shannon (1948) entropy formula may be misspecified in some studies. The aim of this study was to quantify the bias arising from this misspecification. Spatial coordinates were obtained from Statistics Canada for 9348 unique point locations. Polygon-based network buffers of 500 m were drawn around each coordinate (ArcGIS 10.1). Land use mix was calculated for each buffer using the true and misspecified land use mix formulas. Linear regression models were used to estimate the associations between a simulated dataset of daily steps and the true and misspecified measures. Misspecification of the land use mix formula resulted in a systematic underestimation of the true association by 26.4% (95% CI 25.8–27.0%). To minimize measurement bias in future studies, researchers are encouraged to use a constant definition of N in the denominator of the Shannon entropy formula.

Introduction

In the last decade there has been an increase in the number of studies conducted on the associations between neighbourhood designs and physical activity (Ding and Gebel, 2012, Feng et al., 2010). Recent reviews have highlighted inconsistencies across studies, with variability in demonstrated effects (Feng et al., 2010, McCormack and Shiell, 2011, Ferdinand et al., 2012). While the important contributory factors that constitute walkability are conceptually well-defined, inconsistencies in their associations with physical activity may be partly attributable to differences in walkability measurement and computation of indices (Hess et al., 2001, Brownson et al., 2009). One example of this is in the current method of calculating land use mix – a component of walkability.

Land use mix is a measure of the diversity of land uses contained in a neighbourhood (Leslie et al., 2007, Manaugh and Kreider, 2013). While many studies suggest that higher land use mix is associated with higher levels of physical activity, others suggest null effects (Feng et al., 2010, McCormack and Shiell, 2011, Grasser et al., 2013). In the neighbourhoods and health literature, land use mix is most commonly calculated using a variation of an entropy formula introduced in 1948 by Claude E. Shannon as part of his work on the mathematical theory of communication (Shannon, 1948). It is defined as $-\left(\sum_k p_k \ln p_k\right) / \ln N$, where $p_k$ is the proportion of land area within a predefined geographical zone devoted to land use $k$ and N is the total number of land use categories (Leslie et al., 2007, Manaugh and Kreider, 2013). The resulting values range from 0 to 1, where 0 represents complete homogeneity and 1 represents complete heterogeneity in land uses within a neighbourhood.
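The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `land_use_mix`, its parameters, and the example area shares are hypothetical, and the choice of four land use categories is only for demonstration.

```python
import math


def land_use_mix(proportions, n_categories):
    """Entropy-based land use mix: -(sum_k p_k ln p_k) / ln N.

    `proportions` holds the share of land area in each land use;
    `n_categories` is N, the total number of land uses of interest.
    """
    if n_categories < 2:
        raise ValueError("n_categories must be at least 2")
    # Land uses with zero area are skipped: p ln p tends to 0 as p -> 0.
    entropy = sum(p * math.log(1.0 / p) for p in proportions if p > 0)
    return entropy / math.log(n_categories)


# Hypothetical buffers with four land uses of interest (N = 4).
single_use = land_use_mix([1.0, 0.0, 0.0, 0.0], n_categories=4)      # homogeneous
even_split = land_use_mix([0.25, 0.25, 0.25, 0.25], n_categories=4)  # fully mixed
two_uses = land_use_mix([0.5, 0.5, 0.0, 0.0], n_categories=4)        # in between

print(single_use, two_uses, even_split)
```

The homogeneous buffer sits at the bottom of the range and the evenly mixed buffer at the top, which matches the 0 to 1 interpretation given in the text.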

When calculating land use mix via the Shannon entropy formula, the value of N should remain constant. Misspecification of the entropy formula arises when N is defined as the number of land uses that fall into each neighbourhood buffer (i.e., variable for each neighbourhood). This is problematic as it results in an overestimation of land use mix in some neighbourhoods and does not allow for meaningful comparisons of land use mix within a study. Take, for example, two hypothetical neighbourhoods, for simplicity defined here by polygonal buffers around two home addresses (Fig. 1). Neighbourhood A comprises two types of land uses (i.e., residential and commercial) while Neighbourhood B comprises three types of land uses (i.e., residential, commercial and governmental). Assuming that there are three land uses of interest in total, Neighbourhood A should have an entropy score less than 1, and Neighbourhood B should have an entropy score equal to 1 (i.e., the greatest diversity in land uses possible given three land uses of interest). However, when N is defined as the number of land uses in each buffer (i.e., 2 for Neighbourhood A; 3 for Neighbourhood B), the resulting entropy scores for Neighbourhoods A and B are both 1, an overestimation of the land use mix in Neighbourhood A. It is only when N is constant and equivalent to the total number of land uses of interest (i.e., 3) that meaningful comparisons of land use mix can be made across neighbourhoods within a study. In this example, use of a constant N results in entropy scores of 0.63 and 1 for Neighbourhoods A and B, respectively, a more accurate reflection of the diversity of land uses in each of the neighbourhoods.
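The scores in this example can be checked by hand. The working below assumes that the land uses present in each buffer occupy equal shares of its area (one half each in Neighbourhood A, one third each in Neighbourhood B); that assumption is not stated in the passage itself, but it is consistent with the reported values of 1 and 0.63.

```latex
% Assumption: equal area shares within each buffer.
\begin{align*}
\text{Variable } N:\quad
  \mathrm{LUM}_A &= \frac{-\left(\tfrac{1}{2}\ln\tfrac{1}{2} + \tfrac{1}{2}\ln\tfrac{1}{2}\right)}{\ln 2}
                  = \frac{\ln 2}{\ln 2} = 1, &
  \mathrm{LUM}_B &= \frac{-3\cdot\tfrac{1}{3}\ln\tfrac{1}{3}}{\ln 3}
                  = \frac{\ln 3}{\ln 3} = 1 \\[4pt]
\text{Constant } N = 3:\quad
  \mathrm{LUM}_A &= \frac{\ln 2}{\ln 3} \approx 0.63, &
  \mathrm{LUM}_B &= \frac{\ln 3}{\ln 3} = 1
\end{align*}
```

The numerator is the same under both definitions; only the denominator changes, so the overestimation is confined to buffers that contain fewer than the full set of land uses.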

Some previous studies may have used a constant definition of N (Frank et al., 2004, Hajna et al., 2013, Frank et al., 2007, Coffee et al., 2013). However, N has not been explicitly defined in some studies, and there is evidence that a variable definition of N may have been used (Frank et al., 2005, Frank et al., 2006), so the possibility of exposure misclassification arising from the incorrect calculation of entropy cannot be ignored. The objective of this study was to quantify the amount of bias arising from using a variable definition of N in the Shannon entropy formula and to argue that careful consideration of how the entropy score is calculated is required in future studies.
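The kind of comparison this objective describes can be sketched with a toy simulation. Everything below is an assumption made for illustration and not the authors' data or code: the area shares are drawn from a uniform Dirichlet distribution, the number of land uses per buffer follows the shares reported in the Results (rescaled to sum to 1), four land uses of interest are assumed in total, and `TRUE_SLOPE`, the intercept, and the noise level are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)

N_TOTAL = 4        # assumed total number of land uses of interest
N_BUFFERS = 5000   # hypothetical number of simulated buffers

# Share of buffers containing 0, 1, 2, 3 or 4 land uses (from the Results),
# rescaled so the probabilities sum to exactly 1.
count_probs = np.array([0.111, 0.259, 0.242, 0.218, 0.171])
count_probs = count_probs / count_probs.sum()
counts = rng.choice(np.arange(5), size=N_BUFFERS, p=count_probs)

lum_constant = np.zeros(N_BUFFERS)
lum_variable = np.zeros(N_BUFFERS)
for i, k in enumerate(counts):
    if k < 2:
        continue  # zero or one land use: entropy is 0 under both definitions
    shares = rng.dirichlet(np.ones(k))          # hypothetical area shares
    entropy = -np.sum(shares * np.log(shares))
    lum_constant[i] = entropy / np.log(N_TOTAL)  # N fixed across buffers
    lum_variable[i] = entropy / np.log(k)        # N varies with the buffer

# Simulated daily steps that depend linearly on the constant-N measure.
TRUE_SLOPE = 2000.0  # hypothetical steps per unit of land use mix
steps = 5000.0 + TRUE_SLOPE * lum_constant + rng.normal(0.0, 500.0, N_BUFFERS)

# Simple linear regression slopes (np.polyfit returns [slope, intercept]).
slope_constant = np.polyfit(lum_constant, steps, 1)[0]
slope_variable = np.polyfit(lum_variable, steps, 1)[0]

print(f"slope with constant N: {slope_constant:.0f}")
print(f"slope with variable N: {slope_variable:.0f}")
```

In this toy setup the slope estimated from the variable-N measure tends to come out smaller than the slope from the constant-N measure. The size of the gap depends entirely on the assumed distributions, so it should not be read as a reproduction of the 26.4% figure reported in the abstract.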

Data

Anonymized spatial coordinates were obtained from Statistics Canada for 9348 unique point locations from across Canada. The point locations corresponded to the postal code addresses of individuals who participated in Cycle 1 (2007–2009) of the Canadian Health Measures Survey. The Canadian Health Measures Survey is a survey conducted on a nationally representative sample of Canadians aged 6–79 years. Individuals living on reserves, in institutions and full-time members of the Canadian Forces were not

Results

The average value for LUMconstant was 0.21 with a standard deviation (SD) of 0.22. This was comparable to the average value for LUMvariable (0.28, SD=0.28). Ninety-one per cent (91.1%) of the neighbourhoods were located in urban centres. Zero, one, two, three and four of the land uses of interest were contained in 11.1%, 25.9%, 24.2%, 21.8% and 17.1% of the neighbourhood buffers, respectively.

The mean difference between LUMvariable and LUMconstant (0.07, 95% CI −0.11 to 0.07) represented a

Discussion

While many studies assess land use mix via the Shannon entropy formula (Hess et al., 2001, Leslie et al., 2007, Coffee et al., 2013, Cervero, 1997, Duncan et al., 2010, Muller-Riemenschneider et al., 2013), few provide a clear definition of the denominator that is used in the calculation of this score. Given that these studies serve as guides for other researchers and there is evidence that the entropy formula may have been previously misspecified (Frank et al., 2005), it is important to

Acknowledgements

The authors thank Paul A. Peters for deriving the spatial location data and Patrick Bélisle for his assistance with programming. SH is supported by a Canadian Institutes of Health Research (CIHR) Doctoral Research Award. KD is supported through the Fonds de Recherche Santé Québec-Société Québécoise d’Hypertension Artérielle (SQHA)-Jacques de Champlain Clinician Scientist Award. NR was supported by a Fonds de Recherche Santé Québec (FRSQ) Career Award (Chercheurs Boursier Health and Society).

References (32)

  • H.E. Christian et al. How important is the land use mix measure in understanding walking behaviour? Results from the RESIDE study. Int. J. Behav. Nutr. Phys. Act. (2011)
  • R.C. Colley et al. Physical activity of Canadian adults: accelerometer results from the 2007 to 2009 Canadian Health Measures Survey. Health Rep. (2011)
  • M.J. Duncan et al. Relationships of land use mix with walking for transport: do land uses and geographical scale matter? J. Urban Health (2010)
  • T. Dwyer et al. Association of change in daily step count over five years with insulin sensitivity and adiposity: population based cohort study. Br. Med. J. (2011)
  • A.O. Ferdinand et al. The relationship between built environments and physical activity: a systematic review. Am. J. Public Health (2012)
  • L. Frank et al. Many pathways from land use to health: associations between neighborhood walkability and active transportation, body mass index, and air quality. J. Am. Plan. Assoc. (2006)