# Load libraries in this specific order to avoid conflicts
library(terra) # for analyzing rasterized products
library(data.table) # fast data manipulation
library(viridis) # plotting colors
library(ggplot2) # plotting
library(patchwork) # plotting layoutALS Processing Tutorial - Part II
This tutorial requires pre-processed ALS raster products. The data are available here. Ensure you have:
products/dtms_8km2_12scans.tifproducts/dsms_8km2_12scans.tifproducts/chms_8km2_12scans.tif
1 Introduction
This is a Notebook for a tutorial on ALS processing in R (Part II). In this part we analyze forest structural changes with ALS-derived rasters. The idea is for each group to experiment with different datasets.
1.1 Setup and Loading Packages
2 Sample Data: ALS Point Clouds from G-LiHT, NEON, 3DEP at Harvard Forest
As before, we will focus on Harvard Forest, but now the full 8 km², and all of them pre-processed into raster files. Note that we used a maximally robust CHM algorithm. That means it is calibrated to not be influenced by pulse density differences or scan angle differences, among other things.
There are 12 scans available:
- 1 scan from 3DEP in winter 2013/2014
- 3 scans from GLiHT (2012, 2017, 2021)
- 8 scans from NEON
2.1 Loading Pre-processed Rasters
# read in rasters
dtms = rast("products/dtms_8km2_12scans.tif")
dsms = rast("products/dsms_8km2_12scans.tif")
chms = rast("products/chms_8km2_12scans.tif")
# plot DTMs
terra::plot(dtms, col = viridis(100, option = "turbo"), main = "Digital Terrain Models")# plot CHMs
terra::plot(chms, col = viridis(100), main = "Canopy Height Models")We can also zoom in to examine specific areas:
# zoom in
ext_full = ext(chms)
ext_reduced = ext(731200, 731500, 4713900, 4714200)
chms_reduced = crop(chms, ext_reduced)
# select only a few NEON scans
chms_NEON = subset(chms_reduced, c("NEON_2012", "NEON_2017", "NEON_2022", "NEON_2024"))
# plot these scans
terra::plot(chms_NEON, col = viridis(100), main = "NEON Scans (2012, 2017, 2022, 2024)")2.2 Summary Statistics
We can calculate summary statistics for the DTMs:
# calculate summary statistics
# mean elevation from each scan
means_dtms = global(dtms, "mean", na.rm = TRUE)
print(means_dtms) mean
3DEP_2013.5 361.9564
GLiHT_2012 361.8035
GLiHT_2017 361.4712
GLiHT_2021 361.4877
NEON_2012 362.1649
NEON_2014 361.9043
NEON_2016 362.0542
NEON_2017 361.7781
NEON_2018 361.7946
NEON_2019 361.7137
NEON_2022 361.8275
NEON_2024 361.9200
We can also calculate summary statistics for the CHMs:
# mean canopy height from each scan
means_chms = global(chms, "mean", na.rm = TRUE)
print(means_chms) mean
3DEP_2013.5 16.15074
GLiHT_2012 17.54354
GLiHT_2017 18.31373
GLiHT_2021 17.89633
NEON_2012 17.69256
NEON_2014 17.08554
NEON_2016 17.74070
NEON_2017 17.68280
NEON_2018 18.20092
NEON_2019 18.32963
NEON_2022 18.36571
NEON_2024 18.11678
Look at the mean canopy heights across different scans and years:
Questions to consider: - What is the typical mean canopy height? Does it match your pre-analysis estimate? - Is there variation between different sensors (NEON vs GLiHT vs 3DEP)? - Do the values change over time, or are they relatively stable? - The winter scan (3DEP) - how does it compare to summer scans?
Write down your answers: - Mean canopy height: _____ meters - Range of variation: _____ to _____ meters - Maximum tree height observed: _____ meters (you can calculate this with global(chms, "max", na.rm = TRUE))
2.3 Temporal Comparison
We notice that there is some variation. Luckily (and this is rare), we have acquisitions from different institutions over similar time periods, so we can compare changes in time.
# calculate changes in time both for GLiHT and NEON canopy height estimates from 2012 to 2017
means_chms_dt = data.table(flight = rownames(means_chms), hmean = round(means_chms$mean, 2))
print(means_chms_dt[flight %like% "2017" | flight %like% "2012"]) flight hmean
<char> <num>
1: GLiHT_2012 17.54
2: GLiHT_2017 18.31
3: NEON_2012 17.69
4: NEON_2017 17.68
deltaheight_GLiHT = as.numeric(means_chms_dt[flight == "GLiHT_2017", "hmean"] -
means_chms_dt[flight == "GLiHT_2012", "hmean"])
deltaheight_NEON = as.numeric(means_chms_dt[flight == "NEON_2017", "hmean"] -
means_chms_dt[flight == "NEON_2012", "hmean"])
cat("GLiHT height change 2012-2017:", deltaheight_GLiHT, "m\n")GLiHT height change 2012-2017: 0.77 m
cat("NEON height change 2012-2017:", deltaheight_NEON, "m\n")NEON height change 2012-2017: -0.01 m
Something is a bit odd. For example, if we look at GLiHT scans, then height has increased from 2012 to 2017, but using NEON, it has remained stable. Let’s investigate further by examining the digital terrain models.
# calculate changes in time both for GLiHT and NEON from 2012 to 2017, now for the digital terrain models
means_dtms_dt = data.table(flight = rownames(means_dtms), hmean = round(means_dtms$mean, 2))
print(means_dtms_dt[flight %like% "2017" | flight %like% "2012"]) flight hmean
<char> <num>
1: GLiHT_2012 361.80
2: GLiHT_2017 361.47
3: NEON_2012 362.16
4: NEON_2017 361.78
deltaheight_GLiHT_dtm = as.numeric(means_dtms_dt[flight == "GLiHT_2017", "hmean"] -
means_dtms_dt[flight == "GLiHT_2012", "hmean"])
deltaheight_NEON_dtm = as.numeric(means_dtms_dt[flight == "NEON_2017", "hmean"] -
means_dtms_dt[flight == "NEON_2012", "hmean"])
cat("GLiHT DTM change 2012-2017:", deltaheight_GLiHT_dtm, "m\n")GLiHT DTM change 2012-2017: -0.33 m
cat("NEON DTM change 2012-2017:", deltaheight_NEON_dtm, "m\n")NEON DTM change 2012-2017: -0.38 m
Both show a similar change in topography, so that does not explain the canopy height changes. But we can also see that, in practice, we have a change of ~30 cm in absolute elevation values in time (which may be an artefact), but also a ~30 cm difference for the scans in the same years. This is larger than the often claimed accuracy of 10 cm for ALS data.
Note that we could also directly calculate the difference between rasters:
# spatial plotting of height differences between two scans
hdiff_2012_2017 = chms$NEON_2017 - chms$NEON_2012
terra::plot(hdiff_2012_2017, col = viridis(100, option = "turbo"),
main = "Height difference (m): NEON 2017 - NEON 2012")3 Guiding Question: How Does Forest Structure Change at Harvard Forest?
Motivation:
- Relevance for disturbance monitoring
- Relevance for carbon monitoring
Research Questions:
- What is the typical annual height growth rate of young trees at Harvard Forest, in meters per year?
- Is Harvard Forest growing in height, balanced, or is it losing height?
- What percentage of Harvard Forest experiences disturbance?
- What are the typical noise levels between ALS acquisitions? (compare GLiHT and NEON)
4 Approaches
4.1 Regrowth Analysis
We define young forests and assess height growth.
# height growth with NEON
# define a threshold for what's considered disturbed
threshold_growing = 2
# define areas that were below threshold previously and are above it 5 years later
# and assign the 2012 canopy height to them
neon_2012regrowing = ifel(chms$NEON_2012 <= threshold_growing &
chms$NEON_2017 >= threshold_growing,
chms$NEON_2012, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$NEON_2017 - neon_2012regrowing
hdiff_regrowing_peryear = hdiff_regrowing / 5
# calculate statistics
cat("Mean annual growth:", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth: 0.9349838 m/year
cat("Max annual growth:", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth: 5.800006 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = NEON_2017), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency")Look at your calculated growth rates:
Questions: - What is the mean annual growth rate you calculated? - What is the maximum growth rate observed? - Does the distribution look realistic? - Compare with your pre-analysis estimate - how close were you?
Ecological context: - Temperate deciduous trees typically grow 0.3-0.8 m/year when young (< 5m tall) - Growth rates > 1.5 m/year are suspicious and may indicate measurement artifacts - Think about: what percentage of pixels show “reasonable” vs “unreasonable” growth?
Record your values: - Mean growth rate (2m threshold, 5 years): _____ m/year - Does this seem realistic? Yes / No / Uncertain
Now what if we take a different threshold?
# height growth with NEON
# define a threshold for what's considered disturbed
threshold_growing = 5
# define areas that were below threshold previously and are above it 5 years later
neon_2012regrowing = ifel(chms$NEON_2012 <= threshold_growing &
chms$NEON_2017 >= threshold_growing,
chms$NEON_2012, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$NEON_2017 - neon_2012regrowing
hdiff_regrowing_peryear = hdiff_regrowing / 5
# calculate statistics
cat("Mean annual growth (5m threshold):", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth (5m threshold): 1.129185 m/year
cat("Max annual growth (5m threshold):", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth (5m threshold): 5.800006 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = NEON_2017), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency")What if we calculate it with GLiHT?
# height growth with GLiHT
# define a threshold for what's considered disturbed
threshold_growing = 5
# define areas that were below threshold previously and are above it 5 years later
gliht_2012regrowing = ifel(chms$GLiHT_2012 <= threshold_growing &
chms$GLiHT_2017 >= threshold_growing,
chms$GLiHT_2012, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$GLiHT_2017 - gliht_2012regrowing
hdiff_regrowing_peryear = hdiff_regrowing / 5
# calculate statistics
cat("Mean annual growth (GLiHT):", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth (GLiHT): 1.492952 m/year
cat("Max annual growth (GLiHT):", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth (GLiHT): 6.152002 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = GLiHT_2017), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency")What if we calculate it with GLiHT, and over a longer time period?
# height growth with GLiHT over multiple time periods
threshold_growing = 5
# define areas that were below threshold in 2012
gliht_2012regrowing = ifel(chms$GLiHT_2012 <= threshold_growing &
chms$GLiHT_2017 >= threshold_growing,
chms$GLiHT_2012, NA)
# now calculate the regrowth with both 2017 and 2021 scans
hdiff_regrowing = subset(chms, c("GLiHT_2017", "GLiHT_2021")) - gliht_2012regrowing
hdiff_regrowing_peryear = hdiff_regrowing / c(5, 9)
# calculate statistics
cat("Mean annual growth 2012-2017:", global(hdiff_regrowing_peryear$GLiHT_2017, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth 2012-2017: 1.492952 m/year
cat("Mean annual growth 2012-2021:", global(hdiff_regrowing_peryear$GLiHT_2021, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth 2012-2021: 0.8637589 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = GLiHT_2017), alpha = 0.5, fill = "red", bins = 30) +
geom_histogram(aes(x = GLiHT_2021), alpha = 0.5, fill = "blue", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency") +
labs(subtitle = "Red = 2012-2017 (5 years), Blue = 2012-2021 (9 years)")4.2 One-Year Changes
What about analyzing one-year changes?
# height growth with NEON, but only for one year
threshold_growing = 2
# define areas that were below threshold previously and are above it 1 year later
neon_2016regrowing = ifel(chms$NEON_2016 <= threshold_growing &
chms$NEON_2017 >= threshold_growing,
chms$NEON_2016, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$NEON_2017 - neon_2016regrowing
hdiff_regrowing_peryear = hdiff_regrowing
# calculate statistics
cat("Mean annual growth 2016-2017:", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth 2016-2017: 4.31011 m/year
cat("Max annual growth 2016-2017:", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth 2016-2017: 28.56 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = NEON_2017), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency")Now what about the following year?
# height growth with NEON
threshold_growing = 2
# define areas that were below threshold previously and are above it 1 year later
neon_2017regrowing = ifel(chms$NEON_2017 <= threshold_growing &
chms$NEON_2018 >= threshold_growing,
chms$NEON_2017, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$NEON_2018 - neon_2017regrowing
hdiff_regrowing_peryear = hdiff_regrowing
# calculate statistics
cat("Mean annual growth 2017-2018:", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth 2017-2018: 6.049758 m/year
cat("Max annual growth 2017-2018:", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth 2017-2018: 32.54001 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = NEON_2018), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency")Is this true regrowth?
How could we better describe this?
Consider:
- Are we measuring actual tree growth or just detection artifacts?
- Could lateral crown expansion be confounding our measurements?
- What role do measurement uncertainties play in these estimates?
4.3 Addressing Lateral Ingrowth
One idea: find areas without lateral ingrowth by removing buffer areas around low vegetation.
# slightly adjusted procedure
# set neon_2017regrowing to 1 or NA to create a mask
threshold_growing = 5
neon_2017regrowing = ifel(chms$NEON_2017 <= threshold_growing &
chms$NEON_2018 >= threshold_growing,
1, NA)
terra::plot(neon_2017regrowing, col = "black", main = "Areas with regrowth (before filtering)")# now remove buffer areas around low areas, by using a focal filter
# that sets everything to NA at the borders
neon_2017regrowing = focal(neon_2017regrowing, w = 3, "mean", na.rm = FALSE) # 3 pixel window size
terra::plot(neon_2017regrowing, col = "black", main = "Areas with regrowth (after filtering)")# fill in height
neon_2017regrowing = ifel(!is.na(neon_2017regrowing), chms$NEON_2017, NA)
# now calculate the regrowth with both layers
hdiff_regrowing = chms$NEON_2018 - neon_2017regrowing
hdiff_regrowing_peryear = hdiff_regrowing
# calculate statistics
cat("Mean annual growth (filtered):", global(hdiff_regrowing_peryear, "mean", na.rm = TRUE)$mean, "m/year\n")Mean annual growth (filtered): 3.812533 m/year
cat("Max annual growth (filtered):", global(hdiff_regrowing_peryear, "max", na.rm = TRUE)$max, "m/year\n")Max annual growth (filtered): 22.51999 m/year
# plot a histogram
ggplot(data = as.data.table(hdiff_regrowing_peryear)) +
geom_histogram(aes(x = NEON_2018), alpha = 0.5, fill = "red", bins = 30) +
theme_classic() +
xlab("Annual height growth (m / year)") +
ylab("Frequency") +
labs(subtitle = "After removing edge effects")What are we missing? What could be the issue?
Think about:
- Edge detection artifacts
- Sensor noise and uncertainties
- Seasonal effects (leaf-on vs leaf-off)
- Co-registration errors between scans
- Natural variability in growth rates
4.4 Disturbance Analysis
To be developed in class discussion
4.5 Tree-Based Changes
To be developed in class discussion
5 Summary and Discussion
Growth Rate Estimates:
- Threshold selection significantly affects results
- One-year vs multi-year comparisons show different patterns
- GLiHT and NEON data produce different estimates
- Edge effects and lateral ingrowth confound measurements
Measurement Uncertainties:
- DTM differences between scans: ~30 cm
- Larger than commonly cited 10 cm accuracy
- Systematic biases between different sensors
- Temporal artifacts in elevation measurements
Sources of Uncertainty:
- Sensor differences: GLiHT vs NEON vs 3DEP
- Temporal mismatches: Different acquisition dates and times
- Phenological effects: Leaf-on vs leaf-off conditions
- Co-registration errors: Alignment between scans
- Threshold sensitivity: Growth detection depends heavily on cutoff values
Confounding Factors:
- Lateral crown expansion vs vertical growth
- Edge effects at transition zones
- Detection vs actual growth
- Noise vs signal in one-year changes
For Growth Analysis:
- Use multiple time intervals to validate patterns
- Compare results from different sensors
- Account for edge effects and lateral expansion
- Consider measurement uncertainties in interpretation
- Validate with field data when possible
For Disturbance Detection:
- Define clear thresholds based on ecology and error
- Account for sensor noise levels
- Use spatial context (focal operations)
- Combine multiple temporal snapshots
- Ground-truth selected areas
Consider these questions for your analysis:
- What is a realistic annual growth rate for this forest type?
- Compare with field studies
- Consider forest age and species composition
- Account for measurement uncertainties
- How can we separate growth from measurement artifacts?
- Use multiple temporal baselines
- Compare sensors in the same year
- Apply spatial filters
- What level of change is detectable given sensor noise?
- Estimate minimum detectable change
- Consider signal-to-noise ratios
- Validate against independent data
6 Exercises and Extensions
- Compare different time intervals:
- Calculate growth rates for 1, 5, and 10-year periods
- How do estimates change with time interval?
- Test threshold sensitivity:
- Try thresholds from 1-10 m
- Plot how results vary
- What threshold seems most reasonable?
- Sensor comparison:
- Compare GLiHT, NEON, and 3DEP for the same time period
- Quantify systematic differences
- Which is most reliable?
- Spatial analysis:
- Map areas of growth vs decline
- Identify disturbance hotspots
- Relate to topography or other factors
- Validation:
- Compare with published growth rates for similar forests
- Identify field plots for ground-truthing
- Assess realism of your estimates
7 Conclusion
This tutorial demonstrates the challenges and opportunities of using multi-temporal ALS data to monitor forest structural changes. While ALS provides unprecedented spatial coverage and detail, careful attention to measurement uncertainties, sensor differences, and methodological artifacts is essential for robust conclusions.
Key Takeaways:
- ALS can detect forest height changes, but with significant uncertainties
- Sensor choice, threshold selection, and temporal baseline all strongly affect results
- Edge effects and lateral expansion confound vertical growth estimates
- Multiple lines of evidence and validation are essential
- Critical thinking about data quality is as important as analytical technique
Future Directions:
- Integration with other data sources (optical imagery, field plots)
- Machine learning for change detection
- Improved co-registration methods
- Individual tree tracking across time
- Uncertainty quantification frameworks
This tutorial was designed to encourage critical thinking about ALS-based change detection. The emphasis is on understanding limitations and uncertainties, not just applying methods.