Intro

This document will track the steps taken to quantify the herbivore insect diversity within the WisAsp common garden. The data have been validated and cleaned to some degree before this step.

In the .html versions of this document, you can click on the boxes marked “code” to display the code that created each non-text section.

R Setup

This section contains code for the initial script setup such as loading packages and importing the data. Therefore, this section can be skipped in most cases.

All code is executed from the project root directory.

R markdown setup

This section includes the code chunks that set the knitr and global options, as well as loading libraries.

Rscript setup

This section, and the final section of this document sets up code chunks meant for the standalone .R script, as well as user-defined functions.

# source("scripts/utility-functions.R")
# cmdArgs <- commandArgs(trailingOnly = TRUE) #must use --args [options] on cmdln
# handled.args <- as.data.frame(rbind(
#   "-h" = c("","    ","display this help message"),
#   "-t" = c("thr","    ","rare insect cut-off level of thr")
# ))
# names(handled.args) <- c("option","","help")

# Help function for command line
# if ("-h" %in% cmdArgs) {
#   names(handled.args) <- NULL # remove names
#   cat("WisAsp community analysis r script\n\t-Clay Morrow\n\n") # print script info
#   cat("Arguments:")
#   print(handled.args)# print the argument data frame
#   cat("\n")
#   stop.quitely()
#   #"Execution halted" will always be displayed after -h
#     ## I am unsure why.
# }
# NOTE: Although this document is located in the 'scripts/' directory,
  ## the paths are all written relative to the root directory. All code must
  ## be run from the root directory. 
# load the project data
r.objects <- load("data/r-objects.RData")

The R objects that have been loaded are:

# print the object names
r.objects
##  [1] "insect.surveys"       "metadata.insects"     "trait.data"          
##  [4] "weather.data"         "grouped.insects.data" "full.data"           
##  [7] "common.insects"       "rare.insects"         "all.insects"         
## [10] "filtered.data"

Entire Popuplation Diversity Metrics

Additionally, we should calculate diversity indices for the entire WisAsp population for each year:

Richness

summ.fun <- function(x){c("sum" = sum(x,na.rm = TRUE))}

counts.by.event <- summaryBy(list(c(all.insects),c("survey.event")),
                             data = (insect.surveys %>% 
                                       filter()), 
                             FUN = summ.fun,
                             keep.names = TRUE)
# name the rows according to the ID
rownames(counts.by.event) <- counts.by.event$survey.event
# if (!.workstation$is.RStudio) {
#   cat("Species richness by survey event:\n")
# }

# 
(richness.table <- rbind(
  "all.insects" = rowSums(counts.by.event[,all.insects]>0),
  "common.insects" = rowSums(counts.by.event[,common.insects]>0)
))
##                jun16 aug16 jun17 aug17
## all.insects       77    78    34    61
## common.insects    18    18    18    18

When all species (including rare insects) are tallied, richness is substantially lower in Jun 2017 (and 2017 in general). This may be caused by a number of things:

  1. A large plot of trees directly NE of the plot was removed at the end of 2016 that may have reduced colonization events the subsequent year

  2. There were more survey days w/high temp. in 2016 than in 2017. Bud break also occurred later in 2017. The colder temperatures and later phenology of 2017 may not have allowed insect populations to grow as fast (or early) as the previous year.

This also helps justify the use of only our common insects in analysis. The common insects not present in each survey (if any) are listed below:

# Determing which, of the common insects, are missing from each survey
missing.common <- present.common <- list() # create empty list
for(i in levels(counts.by.event$survey.event)){ #loop through the survey events
  missing.common[[i]] <- names(counts.by.event[,common.insects]
                        )[(counts.by.event %>% filter(survey.event == i)
                           )[,common.insects] <= 0]
  present.common[[i]] <- names(counts.by.event[,common.insects]
                        )[(counts.by.event %>% filter(survey.event == i)
                           )[,common.insects] > 0]
}
missing.common
## $jun16
## character(0)
## 
## $aug16
## character(0)
## 
## $jun17
## character(0)
## 
## $aug17
## character(0)
table <- insect.surveys %>% group_by(survey.event) %>% summarize(sum(Tenthredinidae.sp2, na.rm = T))
names(table)[2] <- "Tenthridinidae.sp2"

table2 <- insect.surveys %>% group_by(survey.event) %>% summarize(sum(Green.Sawfly, na.rm = T))
names(table2)[2] <- "Green.Sawfly"

table3 <- insect.surveys %>% group_by(survey.event) %>% summarize(sum(Sawflies, na.rm = T))
names(table3)[2] <- "Sawflies"

(tab <- cbind(table,table2[,2],table3[,2]))

There was an explosion of sawflies in 2017 and I wonder/suspect that Tenthridinids were misidentified in the field as the Green sawfly (Nematus spp.). I am unsure what to do about this. Perhaps I should use the Sawflies column instead of the individual species (Green.Sawfly would also be used as it was counted seperately). I am going to leave this for now. This is one of the reasons we chose to use the grouped “Sawflies” column instead of the individual sawfly species.

Shannon Index

Here we will calculate the Shannon diversity index and species evenness for each survey event

# using the vegan package
shan.index <- diversity(x = insect.surveys[,common.insects],MARGIN = 2,index = "shannon")

data <- insect.surveys %>% filter(Min.per.Tree > 0)
S <- NULL
p_i <- list()
H <- NULL
E_H <- NULL
for(t in colnames(richness.table)) {
  # richness
  S[t] <- (richness.table["common.insects", t])
  ## species density (for p_i)
  sp.dens <- colSums(data[which(data$survey.event == t),
                          common.insects] / data$Min.per.Tree,
                      na.rm = TRUE)
  # proportion species i of total
  p_i[[t]] <- c(sp.dens / sum(sp.dens))
  # Shannon Index
  H[t] <- -sum(p_i[[t]] * log(p_i[[t]]))
  # Evenness
  E_H[t] <- H[t] / log(S[[t]])
}

Here is the total population shannon index and shannon evenness:

# if (!.workstation$is.RStudio) {
#   cat("Diversity indeces:\n")
# }

rbind("Shannon Index" = H,
      "Evenness" = E_H)
##                   jun16     aug16     jun17     aug17
## Shannon Index 1.8033569 0.9082747 1.9294974 0.5362879
## Evenness      0.6239187 0.3142415 0.6675603 0.1855429
# shan.index

Shannon Index by Tree

data <- full.data %>% filter(Min.per.Tree > 0) 
# %>%
#   select(Unique.ID,SerialNo,Block,Genet,
#          Min.per.Tree,survey.event,Date,
#          common.insects)

data[,common.insects] <- data[,common.insects]/data[,"Min.per.Tree"]

shannon.data <- data
#S: richness
shannon.data$richness <- 
  rowSums(data[,common.insects] > 0,na.rm = TRUE)
#p_i: proportion
shannon.data[,paste0(common.insects,".prop")] <-
  shannon.data[,common.insects]/rowSums(shannon.data[,common.insects])
#H: shannon index
shannon.data$shannon.index <- -rowSums(shannon.data[,paste0(common.insects,".prop")]*
  log(shannon.data[,paste0(common.insects,".prop")]),na.rm = TRUE)
#E: Evenness
shannon.data$evenness <- 
  shannon.data$shannon.index/log(shannon.data$richness)

# plot(shannon.data$shannon.index)

Anova:

library(car)
shannon.model <- lm(data = shannon.data, 
                    shannon.index ~ 
                      CT +
                      PG +
                      Vol + 
                      Genet + 
                      Block + 
                      Row + 
                      # Position +
                      survey.event + 
                      # Date +
                      (Date %in% survey.event) # + 
                      # (Date %in% Block) + 
                      # (Position %in% Block) 
                    )

round(type.2 <- Anova(shannon.model, type = 2),digits = 3)
(r.squared <- summary(shannon.model)[c("r.squared","adj.r.squared")])
## $r.squared
## [1] 0.2205152
## 
## $adj.r.squared
## [1] 0.1489568

We see that, although we do not capture much of the variation (\(R^2\) = 0.2205), Genet and Volume have a significant effect on the shannon index.

evenness.model <- update(shannon.model, evenness ~ .)
           
round(type.2 <- Anova(evenness.model, type = 2),digits = 3)
(r.squared <- summary(evenness.model)[c("r.squared","adj.r.squared")])
## $r.squared
## [1] 0.3239101
## 
## $adj.r.squared
## [1] 0.2580439

and for our evenness model, we can explain more of the variation (\(R^2\) = 0.3239) and we can see that Genet and CTs have an effect on species evenness.

Plots

require(ggplot2)
# reorder genet factor levels
genet.shannon <- shannon.data %>% group_by(Genet) %>% 
  summarize("mean.H" = mean(shannon.index,na.rm = TRUE),
            "mean.E" = mean(evenness,na.rm = TRUE),
            )

## Plot shannon index by genet
ggplot(data = shannon.data, 
       aes(x = factor(Genet, 
                      levels = genet.shannon$Genet[
                        order(genet.shannon$mean.H)]),
           y = shannon.index)
       ) + 
  stat_summary(fun.y = mean,
               fun.ymin = function(x){mean(x)-sd(x)},
               fun.ymax = function(x){mean(x)+sd(x)},
               geom = "pointrange",col = "darkgrey") +
  stat_summary(fun.y = mean, geom = "point", 
               col = "black",pch = "=",aes(fill = "mean")) +
  # stat_summary(fun.y = median, geom = "point", 
  #               col = "white",pch = "-", aes(fill = "mean")) +
  labs(x = "Genet", y = "ln(Diversity)", 
       title = "Variation in Shannon Index", subtitle =  "by Genet") + 
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank(),
        legend.title = element_blank())

# Plot shannon index by volume
ggplot(shannon.data,aes(x = Vol, y = shannon.index))+
  geom_point() +
  geom_smooth(method = "glm") + 
  labs(x = "Volume", y = "ln(Diversity)", 
       title = "Shannon Index by Volume")

# ggplot(shannon.data,aes(x = BA.all, y = shannon.index))+
#   geom_point() +
#   geom_smooth(method = "glm") + 
#   labs(x = "Basal Area", y = "ln(Diversity)", 
#        title = "Shannon Index by Volume")

Although there is a large range of within-genet variation of Diversity, there is also substatial among-genet variation.

Additionally, we can see the weak positive relationship between volume and diversity in our data

## Plot shannon evenness by genet
ggplot(data = shannon.data, 
       aes(x = factor(Genet, 
                      levels = genet.shannon$Genet[
                        order(genet.shannon$mean.E)]),
           y = evenness)
       ) + 
  stat_summary(fun.y = mean,
               fun.ymin = function(x){mean(x)-sd(x)},
               fun.ymax = function(x){mean(x)+sd(x)},
               geom = "pointrange", col = "darkgrey") +
  stat_summary(fun.y = mean, geom = "point", 
               col = "black",pch = "=", aes(fill = "mean")) +
  # stat_summary(fun.y = median, geom = "point", 
  #               col = "white",pch = "-") +
  labs(x = "Genet", y = "Evenness", 
       title = "Variation in Evenness", subtitle = "by Genet") + 
  theme(axis.text.x = element_blank(), 
        axis.ticks.x = element_blank(), 
        legend.title = element_blank())

# Plot evenness by CTs
ggplot(shannon.data,aes(x = CT, y = evenness))+
  geom_point() +
  geom_smooth(method = "glm") + 
  labs(x = "CT %", y = "Evenness", title = "Evenness by CT%")

Similar to diversity, species evenness has large within- and among-genet variation and we can see the positive relationship between CTs and species evenness.