From Big Noise to Big Data: Toward the Verification of Large Data sets for Understanding Regional Retail Flows

Robin Lovelace, Mark Birkin, Philip Cross, and Martin Clarke (2016). From Big Noise to Big Data: Toward the Verification of Large Data sets for Understanding Regional Retail Flows. Geographical Analysis. https://doi.org/10.1111/gean.12081
Authors

Robin Lovelace

Mark Birkin

Philip Cross

Martin Clarke

Published

January 1, 2016

DOI

10.1111/gean.12081

Abstract

There has been much excitement among quantitative geographers about newly available data sets, characterized by high volume, velocity, and variety. This phenomenon is often labeled as “Big Data” and has contributed to methodological and empirical advances, particularly in the areas of visualization and analysis of social networks. However, a fourth v, veracity (or lack thereof), has been conspicuously lacking from the literature. This article sets out to test the potential for verifying large data sets. It does this by cross-comparing three unrelated estimates of retail flows (human movements from home locations to shopping centers) derived from the following geo-coded sources: (1) a major mobile telephone service provider; (2) a commercial consumer survey; and (3) geotagged Twitter messages. Three spatial interaction models also provided estimates of flow: constrained and unconstrained versions of the “gravity model” and the recently developed “radiation model.” We found positive relationships between all data-based and theoretical sources of estimated retail flows. Based on the analysis, the mobile telephone data fitted the modeled flows and consumer survey data closely, while flows obtained directly from the Twitter data diverged from other sources. The research highlights the importance of verification in flow data derived from new sources and demonstrates methods for achieving this.

Type: Journal Article
Venue: Geographical Analysis
Year: 2016

Citation

Robin Lovelace, Mark Birkin, Philip Cross, and Martin Clarke (2016). From Big Noise to Big Data: Toward the Verification of Large Data sets for Understanding Regional Retail Flows. Geographical Analysis. https://doi.org/10.1111/gean.12081

BibTeX

@article{lovelace_big_2016,
  title = {From {{Big Noise}} to {{Big Data}}: {{Toward}} the {{Verification}} of {{Large Data}} Sets for {{Understanding Regional Retail Flows}}},
  shorttitle = {From {{Big Noise}} to {{Big Data}}},
  author = {Lovelace, Robin and Birkin, Mark and Cross, Philip and Clarke, Martin},
  year = {2016},
  month = jan,
  journal = {Geographical Analysis},
  volume = {48},
  number = {1},
  pages = {59--81},
  issn = {1538-4632},
  doi = {10.1111/gean.12081},
  urldate = {2016-03-15},
  abstract = {There has been much excitement among quantitative geographers about newly available data sets, characterized by high volume, velocity, and variety. This phenomenon is often labeled as ``Big Data'' and has contributed to methodological and empirical advances, particularly in the areas of visualization and analysis of social networks. However, a fourth v{\textemdash}veracity (or lack thereof){\textemdash}has been conspicuously lacking from the literature. This article sets out to test the potential for verifying large data sets. It does this by cross-comparing three unrelated estimates of retail flows{\textemdash}human movements from home locations to shopping centers{\textemdash}derived from the following geo-coded sources: (1) a major mobile telephone service provider; (2) a commercial consumer survey; and (3) geotagged Twitter messages. Three spatial interaction models also provided estimates of flow: constrained and unconstrained versions of the ``gravity model'' and the recently developed ``radiation model.'' We found positive relationships between all data-based and theoretical sources of estimated retail flows. Based on the analysis, the mobile telephone data fitted the modeled flows and consumer survey data closely, while flows obtained directly from the Twitter data diverged from other sources. The research highlights the importance of verification in flow data derived from new sources and demonstrates methods for achieving this.},
  copyright = {{\textcopyright} 2015 The Authors. Geographical Analysis published by Wiley Periodicals, Inc. on behalf of The Ohio State University. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.},
  langid = {english},
  keywords = {geographical analysis},
}

Notes