Volunteered geographic information in Wikipedia
by Hardy, Darren, Ph.D., UNIVERSITY OF CALIFORNIA, SANTA BARBARA, 2010, 259 pages; 3439644

Abstract:

Volunteered geographic information (VGI) refers to the geographic subset of online user-generated content. Through geobrowsers and online mapping services, which use geovisualization and Web technologies to share and produce VGI, a global digital commons of geographic information has emerged. A notable example is Wikipedia, an online collaborative encyclopedia where anyone can edit articles, including those about places. Wikipedia’s editorial transparency and integration with online mapping services make it well suited for studying VGI production.

My dissertation contributes empirical evidence and quantitative methods to an emerging area of study: the efficacy and use of VGI. I focus on spatial behavior in VGI production, which is largely unknown. In particular, the capacity of a ubiquitous Internet to reduce communication costs has raised questions of whether geographic distance matters in information and economic production. My research tests whether proximity matters in VGI production. That is, do VGI contributors write about nearby places? Moreover, what are suitable research methods to study a large-scale VGI system with millions of contributors, like Wikipedia?

For my study, I collect a corpus of 32 million contributions to 1 million geotagged Wikipedia articles over 7 years (2001–2008). I use data mining and IP geolocation methods on the corpus to select a sample dataset that includes 7.3 million contributions by 2.8 million anonymous contributors to 0.4 million geotagged articles in 21 languages. To measure the proximity effect between articles and contributors, I develop a “signature distance” metric, which is a weighted average of distances between author and article. To model spatial interaction behaviors of contributors, I use a probabilistic invariant gravity model with an exponential distance decay function. My primary findings indicate that anonymous contributors write about nearby places, and that the influence of proximity decays exponentially and varies categorically.
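
The "signature distance" and the exponential distance decay described above can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme or distance computation, so this example assumes that each contributor's weight is their number of edits to the article and that distances are great-circle distances on a spherical Earth (haversine formula). The names `haversine_km`, `signature_distance`, `exponential_decay`, and the parameter `beta` are hypothetical and are not taken from the dissertation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def signature_distance(article, contributors):
    """Weighted average distance between an article and its contributors.

    article: (lat, lon) of the geotagged article.
    contributors: iterable of (lat, lon, weight); the weight is assumed here
    to be the contributor's edit count (the abstract does not specify it).
    """
    contributors = list(contributors)
    total_weight = sum(w for _, _, w in contributors)
    if total_weight <= 0:
        raise ValueError("total contributor weight must be positive")
    weighted_sum = sum(
        w * haversine_km(article[0], article[1], lat, lon)
        for lat, lon, w in contributors
    )
    return weighted_sum / total_weight


def exponential_decay(distance_km, beta):
    """Exponential distance decay term exp(-beta * d); beta is a hypothetical rate."""
    return math.exp(-beta * distance_km)
```

Under these assumptions, an article edited only by contributors located at its own coordinates has a signature distance of zero, and the value grows as edits come from farther away. A gravity model would combine a decay term like `exponential_decay` with origin and destination terms; that step is not shown because the abstract does not describe it.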

Keywords. Distance decay; Geotagging; User-generated content; Volunteered geographic information; Wikipedia.

 
Adviser: James Frew
School: UNIVERSITY OF CALIFORNIA, SANTA BARBARA
Source: DAI/B 72-03, p. , Feb 2011
Source Type: Dissertation
Subjects: Geographic information science and geodesy; Environmental studies; Information technology
Publication Number: 3439644
Access the complete dissertation:

This is an open access dissertation.
  Use the link below to access the full text PDF of this graduate work:
  http://gradworks.umi.com/3439644.pdf
