Harvard Forest Data Archive


Regional Distribution and Abundance of Eastern Hemlock in Eastern North America 2010


  • hf191-01: raster map of modeled hemlock abundance


  • Lead: Matt Fitzpatrick, Aaron Ellison, Evan Preisser
  • Investigators: Joseph Elkinton, Adam Porter
  • Contact: Information Manager
  • Start date: 2010
  • End date: 2010
  • Status: completed
  • Location: Eastern North America
  • Latitude: +33.6 to +48.3
  • Longitude: -93.7 to -59.7
  • Elevation: 0 to 2037 meters
  • Taxa: Adelges tsugae (hemlock woolly adelgid), Tsuga canadensis (eastern hemlock)
  • Release date: 2011
  • Revisions:
  • EML file: knb-lter-hfr.191.6
  • DOI: digital object identifier
  • EDI: data package
  • DataONE: data package
  • Related links:
  • Study type: modeling
  • Research topic: ecological informatics and modelling; invasive plants, pests and pathogens; regional studies
  • LTER core area: disturbance, populations
  • Keywords: abundance, distribution, hemlock, hemlock woolly adelgid, region
  • Abstract:

    We developed comprehensive maps of the distribution and abundance of hemlock in order to map the host distribution, and model the spread, of the hemlock woolly adelgid. Multiple statistical models were used to map the distribution of hemlock. Hemlock occurrence data were taken from the FIA database, and multiple environmental predictors were gathered from various databases as described in the methods. The raster map depicts the predicted abundance of hemlock (m² basal area per hectare) across its range in eastern North America.

  • Methods:

    The following description of methods is quoted from: Fitzpatrick MC, Preisser EL, Porter A, Elkinton J, Ellison AE (in press) Modeling range dynamics in heterogeneous landscapes: Invasion of the hemlock woolly adelgid in eastern North America. Ecological Applications.

    We used state-of-the-art species distribution modeling techniques to generate a map of hemlock distribution and abundance. We used nine statistical algorithms available within the BIOMOD framework in the statistical language R to relate occurrence of hemlock to 26 environmental predictor variables. Occurrence data comprised 16,084 points from the USDA Forest Inventory and Analysis (FIA) database. Environmental predictors included 23 bioclimatic variables describing minimum, maximum, and seasonality in temperature, precipitation and water balance, two topographic variables (slope and compound topography index) from the USGS HYDRO1k dataset, and an index of net primary productivity. All variables were manipulated in ArcGIS 9.3 such that they were spatially congruent, had a common resolution of 1 km, and were projected using an equidistant conic projection to preserve distance characteristics between locations. Within BIOMOD, the occurrence data were randomly divided into ten different calibration (70%) and evaluation (30%) datasets, models were developed and evaluated for each of the ten datasets, and the outcomes of the ten runs were averaged, thus ensuring the final evaluation was somewhat independent of a particular realization of a random split of the occurrence data. Finally, we combined the averaged model from each of the nine algorithms into a single consensus prediction using techniques from ensemble forecasting. The true skill statistic was used to inform a weighted average of individual model contribution to the ensemble. This consensus model was projected across eastern North America, including Canada where we did not have data on hemlock occurrence. The result was a continuous map of the probability of hemlock occurrence at all locations within the map.
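
    As an illustration only (this is not part of the quoted methods and is not the BIOMOD code), the random 70/30 split, the true skill statistic (TSS = sensitivity + specificity - 1), and a score-weighted consensus could be sketched in plain Python as follows. The function names are hypothetical, and weighting each model in direct proportion to its TSS is an assumption; the source says only that TSS "was used to inform" the weights.

```python
import random


def split_calibration_evaluation(records, calibration_fraction=0.7, seed=None):
    """Randomly split records into calibration and evaluation subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_calibration = round(len(shuffled) * calibration_fraction)
    return shuffled[:n_calibration], shuffled[n_calibration:]


def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for 0/1 presence-absence lists.

    Assumes at least one observed presence and one observed absence.
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def weighted_consensus(model_probabilities, model_scores):
    """Per-cell average of model probabilities, weighted by each model's score.

    Assumption: weights are directly proportional to the scores (e.g. TSS).
    """
    total = sum(model_scores)
    n_cells = len(model_probabilities[0])
    return [
        sum(score * probs[i] for score, probs in zip(model_scores, model_probabilities)) / total
        for i in range(n_cells)
    ]
```

    In this sketch, repeating the split with ten different seeds and averaging the resulting predictions would mirror the ten calibration/evaluation runs described above.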

    Hemlock is patchily distributed across the landscape and does not occur in every suitable location. We thus removed a portion of cells by sub-sampling the probability of occurrence map. The probability of a cell remaining in the map was equal to the predicted probability of hemlock occurring in that cell. This approach produced a realistic, heterogeneous map in which most but not all of the low probability sites were removed and where many of the high probability sites remained.
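
    The sub-sampling rule described above (a cell is kept with probability equal to its predicted probability of occurrence) amounts to one Bernoulli draw per cell. A minimal sketch, assuming the map is flattened to a list of probabilities between 0 and 1; the function name and the `seed` parameter are hypothetical and not taken from the original analysis:

```python
import random


def subsample_cells(probabilities, seed=None):
    """Return a keep/remove mask with one Bernoulli draw per cell.

    A cell with predicted probability p is kept with probability p, so
    low-probability cells are mostly removed and high-probability cells
    mostly remain.
    """
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1): p = 0 is never kept, p = 1 always is.
    return [rng.random() < p for p in probabilities]
```

    Because each draw is random, two runs with different seeds give different maps; fixing the seed makes a particular realization reproducible.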

    To produce a map of hemlock abundance we used the randomForest algorithm in R 2.9.1 to relate observed hemlock abundance (basal area, m² ha⁻¹) from the FIA database to the same 26 predictor variables used to model probability of hemlock occurrence. We used the resulting model to predict hemlock abundance across eastern North America, but only in cells remaining in the sub-sampled probability of occurrence map. To account for the fact that most cells were not 100% forested, we multiplied the map of hemlock abundance by a corresponding remotely sensed estimate of percent forest cover. The result was a map of hemlock abundance adjusted for forest cover that corresponds well with its known distribution and abundance.
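
    The forest-cover adjustment is a cell-by-cell multiplication. A minimal sketch under stated assumptions: percent forest cover is expressed on a 0 to 100 scale, and cells removed by the sub-sampling step are assigned an abundance of 0 (the source does not say how removed cells are encoded in the raster). The function and argument names are hypothetical:

```python
def adjust_for_forest_cover(abundance, percent_forest, kept):
    """Scale predicted basal area (m2/ha) by the forested fraction of each cell.

    abundance:      predicted basal area per cell
    percent_forest: remotely sensed forest cover per cell, 0-100 (assumed scale)
    kept:           mask of cells remaining after sub-sampling
    """
    adjusted = []
    for basal_area, percent, is_kept in zip(abundance, percent_forest, kept):
        if is_kept:
            adjusted.append(basal_area * percent / 100.0)
        else:
            adjusted.append(0.0)  # assumption: removed cells carry no hemlock
    return adjusted
```

    For example, a kept cell with a predicted 20 m² ha⁻¹ and 50% forest cover would be mapped at 10 m² ha⁻¹.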

  • Use:

    This dataset is released to the public under Creative Commons CC0 1.0 (No Rights Reserved). Please keep the dataset creators informed of any plans to use the dataset. Consultation with the original investigators is strongly encouraged. Publications and data products that make use of the dataset should include proper acknowledgement.

  • Citation:

    Fitzpatrick M, Ellison A, Preisser E. 2011. Regional Distribution and Abundance of Eastern Hemlock in Eastern North America 2010. Harvard Forest Data Archive: HF191 (v.6).

Detailed Metadata

hf191-01: raster map of modeled hemlock abundance

  • Compression: none
  • Format: tiff
  • Type: tiff