Dave Helmers and collaborators are using purely open source tools (GDAL and NumPy) in an NSF-funded study of future land use change across the coterminous United States. Their research has generated a wealth of data and maps of potential future land use patterns.
Understanding how land use change will shape future land cover is important for a variety of management, planning, and scientific endeavors. An NSF-funded project team, led by Steve Polasky from the University of Minnesota, and including Andrew Plantinga from Oregon State University, Dave Lewis from the University of Puget Sound, Erik Nelson from Bowdoin College, and Josh Lawler from the University of Washington, is applying econometric models to predict changes in land use, looking at how forest, urban, cropland, grassland, and rangeland areas may change over time. Results of the project will be used as inputs to models that simulate future biodiversity and carbon sequestration.
This project uses the 2001 National Land Cover Dataset (NLCD) as its baseline of current land cover and applies an econometric model, developed at Oregon State University by Andrew Plantinga et al., to simulate land cover 25 and 50 years into the future. The model takes in a range of probabilities of transitions between classes, which are tuned according to 5 different scenarios, such as paying people to reforest (afforestation subsidies) or getting rid of agricultural subsidies.
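As a rough illustration of what "probabilities of transitions between classes" means in practice, the sketch below draws a new class for every pixel from a transition matrix. It is not the project's econometric model: the class list, the `BASELINE` probabilities, and the `simulate_step` function are hypothetical and chosen only for the example.

```python
import numpy as np

# Hypothetical class order; the real project works with NLCD 2001 classes.
CLASSES = ["forest", "urban", "cropland", "grassland", "rangeland"]

# Hypothetical probabilities: row = current class, column = next class.
# Each row sums to 1. A scenario would supply its own matrix.
BASELINE = np.array([
    [0.95, 0.02, 0.02, 0.01, 0.00],
    [0.00, 1.00, 0.00, 0.00, 0.00],
    [0.03, 0.03, 0.90, 0.03, 0.01],
    [0.02, 0.02, 0.04, 0.90, 0.02],
    [0.01, 0.01, 0.02, 0.02, 0.94],
])


def simulate_step(landcover, transitions, rng):
    """Draw the next class for every pixel from its row of `transitions`."""
    cumulative = np.cumsum(transitions, axis=1)
    cumulative[:, -1] = 1.0  # guard against floating-point round-off
    draws = rng.random(landcover.shape)  # uniform values in [0, 1)
    # The new class index is the number of cumulative thresholds
    # that the pixel's random draw reaches or exceeds.
    return (draws[..., None] >= cumulative[landcover]).sum(axis=-1)


rng = np.random.default_rng(42)
landcover = rng.integers(0, len(CLASSES), size=(200, 200))  # stand-in map
future = simulate_step(landcover, BASELINE, rng)
```

Repeating the draw with different random seeds yields many alternative maps for one scenario, which is one way a set of maps could be compared for robustness.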
With a source data resolution of 100 m and a study area covering the whole coterminous USA, the size of the input datasets and the processing time are major challenges, and efficient, scalable tools are needed to perform the simulations in a reasonable timeframe. For example, the state of Wisconsin alone occupies an area represented by 310 million pixels.
After trying different options, the processing team led by Dave Helmers chose open source geodata processing software: GDAL (http://gdal.org) and NumPy (http://numpy.scipy.org) in Python. Processing a single scenario for the entire U.S. now takes roughly 4 days and generates 1.2 TB of data, including 1,000 maps for each scenario, which are compared to assess how robust the predictions are. Support for n-dimensional array processing, coupled with good documentation and other users' experiences, allowed Dave to conduct the simulations in a relatively short timeframe. He concluded that, although experience with programming and command-line processing is certainly essential, these tools are easily mature enough to be valuable for data-intensive analyses of large spatial datasets.
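The story does not describe the team's code, but one common way to keep memory use bounded with rasters of this size is to process them in blocks of rows. The sketch below shows that pattern with NumPy alone; `iter_row_blocks`, `class_counts_blockwise`, and the in-memory `raster` are hypothetical stand-ins. With GDAL, the `read_block` callable could wrap a raster band's `ReadAsArray(xoff, yoff, win_xsize, win_ysize)` call.

```python
import numpy as np


def iter_row_blocks(n_rows, block_rows):
    """Yield (start_row, row_count) windows that together cover n_rows."""
    for start in range(0, n_rows, block_rows):
        yield start, min(block_rows, n_rows - start)


def class_counts_blockwise(read_block, n_rows, n_classes, block_rows=256):
    """Tally pixels per class without holding the full raster in memory."""
    counts = np.zeros(n_classes, dtype=np.int64)
    for start, rows in iter_row_blocks(n_rows, block_rows):
        block = read_block(start, rows)  # 2-D array of class codes
        counts += np.bincount(block.ravel(), minlength=n_classes)
    return counts


# Stand-in for a raster on disk (hypothetical data, 5 classes).
rng = np.random.default_rng(0)
raster = rng.integers(0, 5, size=(1000, 800))

counts = class_counts_blockwise(
    lambda start, rows: raster[start:start + rows],
    n_rows=raster.shape[0],
    n_classes=5,
)
```

Only one block is held in memory at a time, so the same loop works whether the raster has a million pixels or several hundred million.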
Story by Maxim Dubinin