Published in: Journal of Urban Health 1/2024

2 January 2024 | Original Article

Designing and Evaluating a Hierarchical Framework for Matching Food Outlets across Multi-sourced Geospatial Datasets: a Case Study of San Diego County

Authors: Yanjia Cao, Jiue-An Yang, Atsushi Nara, Marta M. Jankowska

Abstract

Research on the retail food environment (RFE) depends on the availability and accuracy of data. Discrepancies among RFE datasets, however, may introduce imprecision when measuring associations with health outcomes. We present a two-tier hierarchical point of interest (POI) matching framework to compare and triangulate food outlets across multiple geospatial data sources. Two matching parameters were used: the geodesic distance between businesses and the similarity of business names according to Levenshtein distance (LD) and Double Metaphone (DM). Sensitivity analysis was conducted to determine the thresholds of the matching parameters. Tier 1 matching used more restrictive parameters to generate high-confidence matched POIs, whereas Tier 2 used relaxed matching parameters and applied a weighted multi-attribute model to the records left unmatched by Tier 1. Our case study in San Diego County, California used government, commercial, and crowdsourced data and returned 20.2% matched records from Tier 1 and 18.6% from Tier 2. Manual validation showed a 100% matching rate for Tier 1 and up to 30.6% for Tier 2. Matched and unmatched records from Tier 1 were further analyzed for spatial patterns and categorical differences. By conflating datasets, the framework generated high-confidence food POIs and identified food POIs that are unique to specific data sources. Triangulating RFE data can reduce uncertain and invalid POI listings when the food environment is represented using multiple data sources. Studies investigating associations between the food environment and health outcomes may benefit from the improved quality of RFE data.
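As a rough illustration of the two-tier idea described in the abstract, the following Python sketch pairs a distance check with a name-similarity check. It is not the authors' implementation. All thresholds, weights, and function names are hypothetical, since the abstract does not report them. The sketch also substitutes a spherical great-circle (haversine) distance for a true geodesic distance and omits Double Metaphone, which is not available in the Python standard library.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km.
    Assumption: used here as a stand-in for a true geodesic distance."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def levenshtein(a, b):
    """Levenshtein edit distance (insert, delete, substitute) by dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def name_similarity(a, b):
    """Edit distance normalised to [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

# Hypothetical values for illustration only; the paper derives its own
# thresholds through sensitivity analysis.
TIER1 = {"max_dist_m": 50.0, "min_name_sim": 0.9}   # strict
TIER2 = {"max_dist_m": 200.0, "min_name_sim": 0.6}  # relaxed
W_NAME, W_DIST = 0.7, 0.3                           # hypothetical weights
TIER2_MIN_SCORE = 0.6                               # hypothetical cut-off

def match_tier(poi_a, poi_b):
    """Return 1 for a strict match, 2 for a relaxed weighted match, else None."""
    d = haversine_m(poi_a["lat"], poi_a["lon"], poi_b["lat"], poi_b["lon"])
    s = name_similarity(poi_a["name"], poi_b["name"])
    if d <= TIER1["max_dist_m"] and s >= TIER1["min_name_sim"]:
        return 1
    if d <= TIER2["max_dist_m"] and s >= TIER2["min_name_sim"]:
        # Weighted score over name similarity and a linearly decaying distance term.
        score = W_NAME * s + W_DIST * (1.0 - d / TIER2["max_dist_m"])
        if score >= TIER2_MIN_SCORE:
            return 2
    return None
```

In this sketch, a pair of records is tested against the strict Tier 1 rule first and falls through to the weighted Tier 2 score only if it fails, mirroring the ordering described above. A production version would likely use an ellipsoidal geodesic routine and add a phonetic comparison of names.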
References
12. Smith LG, Widener MJ, Liu B, et al. Comparing household and individual measures of access through a food environment lens: what household food opportunities are missed when measuring access to food retail at the individual level? Ann Am Assoc Geogr. 2021;0(0):1–21. https://doi.org/10.1080/24694452.2021.1930513
28. Lamprianidis G, Skoutas D, Papatheodorou G, Pfoser D. Extraction, integration and analysis of crowdsourced points of interest from multiple web sources. GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems. 2014;2014-Novem(November):16–23. https://doi.org/10.1145/2676440.2676445
42. Levenshtein VI. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady. 1965;10(8):707–10.
43. Philips L. The double metaphone search algorithm. C/C++ Users J. 2000;18(6):38–43.
Metadata
Title
Designing and Evaluating a Hierarchical Framework for Matching Food Outlets across Multi-sourced Geospatial Datasets: a Case Study of San Diego County
Authors
Yanjia Cao
Jiue-An Yang
Atsushi Nara
Marta M. Jankowska
Publication date
2 January 2024
Publisher
Springer US
Published in
Journal of Urban Health / Issue 1/2024
Print ISSN: 1099-3460
Electronic ISSN: 1468-2869
DOI
https://doi.org/10.1007/s11524-023-00817-9