PSet 2
Welcome to your second problem set! In this assignment, you will import and tidy real data to perform exploratory analyses. We will replicate parts of the analysis from a recent paper at the interface between ecology and evolution. You can read the paper here.
What to Turn In
- Submission Package: A folder (or zip file) that contains:
- Your RStudio project.
- Your code for data wrangling and analysis.
- The two final publication-ready figures.
- Deadline: February 23rd, 2025
Assignment Instructions
Project Setup
- Create a New Project: Start a new RStudio project.
- Download Data:
- Download the full data archive from here.
- This archive contains two CSV files:
- Snail Data: PSet2_snail.csv
- Vegetation Zone Totals: PSet2_vegzonetotals.csv
- Organize Your Files: Place the downloaded CSV files in a subfolder named data within your project directory.
Data Import
- Load the Data: Import both CSV files into R.
- Hint: Use the RStudio import shortcut, then copy-paste the associated read_csv() code chunk into your script.
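As a rough sketch of what the pasted import code might look like, assuming the two files sit in the data subfolder described above and the RStudio project is open (so relative paths resolve from the project root). The object names data_snail and data_veg match the ones used later in this assignment; the path variables snail_path and veg_path are only illustrative.

```r
library(readr)

# Paths are relative to the project root (assumes the RStudio project is open).
snail_path <- file.path("data", "PSet2_snail.csv")
veg_path   <- file.path("data", "PSet2_vegzonetotals.csv")

# Only read the files if they are where we expect them to be.
if (file.exists(snail_path) && file.exists(veg_path)) {
  data_snail <- read_csv(snail_path)
  data_veg   <- read_csv(veg_path)
} else {
  message("CSV files not found; check that they are inside the data/ subfolder.")
}
```

If the message appears, the files are probably not in the data subfolder, or the project is not the active working directory.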
Data Cleaning: Filter Out Specific Islands
- In the data_snail tibble, remove all rows where the island column is “CH”, “ED”, or “GA”.
- Hint: Should we use the filter() or the select() verb?
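To see how row removal behaves before touching the real data, here is a minimal sketch on a toy tibble. The tibble toy_snail, its spdiv values, and the island codes "X1" and "X2" are made up for illustration; only "CH", "ED", and "GA" come from the assignment.

```r
library(dplyr)

# Toy stand-in for data_snail (hypothetical values, not the real data).
toy_snail <- tibble(
  island = c("CH", "ED", "GA", "X1", "X2"),
  spdiv  = c(1, 2, 3, 4, 5)
)

# Islands to drop, as listed in the assignment.
excluded <- c("CH", "ED", "GA")

# Keep only the rows whose island is NOT in the excluded set.
toy_kept <- toy_snail %>%
  filter(!island %in% excluded)
```

Note the distinction between the two verbs: one operates on rows, the other on columns. Checking nrow() before and after is a quick way to confirm that only the intended rows were dropped.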
Join Data:
- Join data_snail with data_veg using the common variable island.
- Hint: Consider whether a left_join(), full_join(), or another type of join is most appropriate.
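The difference between join types is easiest to see on a small example. The sketch below uses two toy tibbles with made-up values; it assumes that the vegetation table holds the AridTotal and HumidTotal columns and that the snail table has a habitat column, which you should verify against the real files.

```r
library(dplyr)

# Toy stand-ins (hypothetical values); "X3" appears only in the vegetation table.
toy_snail <- tibble(
  island  = c("X1", "X1", "X2"),
  habitat = c("Arid", "Humid", "Arid"),
  spdiv   = c(2, 3, 1)
)
toy_veg <- tibble(
  island     = c("X1", "X2", "X3"),
  AridTotal  = c(10, 5, 8),
  HumidTotal = c(6, 4, 2)
)

# left_join keeps every row of the first table and adds matching columns.
toy_left <- left_join(toy_snail, toy_veg, by = "island")

# full_join also keeps unmatched rows from the second table, filled with NA.
toy_full <- full_join(toy_snail, toy_veg, by = "island")
```

Here toy_left has three rows, while toy_full has four because island "X3" has no snail records. Which behavior you want depends on whether islands without snail data should appear in your analysis.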
Data Transformation: Normalize Species Diversity
The variable spdiv (species diversity) needs to be normalized based on habitat type:
- If habitat is “Arid”, compute: normalized_spdiv = spdiv / AridTotal
- If habitat is “Humid”, compute: normalized_spdiv = spdiv / HumidTotal
Hint: Use mutate() together with if_else() or case_when() to create a new normalized species diversity variable.
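The conditional rule above could be sketched as follows on a toy tibble. The values in toy_joined are made up, and the column names assume the joined table contains habitat, spdiv, AridTotal, and HumidTotal exactly as written in the assignment.

```r
library(dplyr)

# Toy stand-in for the joined table (hypothetical values).
toy_joined <- tibble(
  island     = c("X1", "X1", "X2"),
  habitat    = c("Arid", "Humid", "Arid"),
  spdiv      = c(2, 3, 1),
  AridTotal  = c(10, 10, 5),
  HumidTotal = c(6, 6, 4)
)

# Divide spdiv by the total that matches each row's habitat.
# Rows with any other habitat value would get NA.
toy_norm <- toy_joined %>%
  mutate(
    normalized_spdiv = case_when(
      habitat == "Arid"  ~ spdiv / AridTotal,
      habitat == "Humid" ~ spdiv / HumidTotal
    )
  )
```

With only two habitat types, if_else() would work equally well; case_when() has the advantage that an unexpected habitat value produces NA instead of silently falling into the "else" branch.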
Data Visualization: Replicate Figure 5 from the Paper
- Create a scatter plot using ggplot2 that shows the relationship between the normalized species diversity (normalized_spdiv) and functional diversity (funcdiv).
- Instead of using geom_point(), use geom_text() so that each data point is labeled with the island.
- Aim to reproduce or even improve upon the appearance of Figure 5 from the paper.
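A minimal starting point for a labeled scatter plot might look like the sketch below. The data frame toy_plot_data and its values are made up, and the choice of which variable goes on which axis, the axis titles, and the theme are assumptions; check Figure 5 in the paper for the actual layout.

```r
library(ggplot2)

# Toy stand-in for the normalized data (hypothetical values).
toy_plot_data <- data.frame(
  island           = c("X1", "X2", "X3"),
  normalized_spdiv = c(0.20, 0.50, 0.35),
  funcdiv          = c(1.1, 2.4, 1.8)
)

# geom_text() draws the island code at each (x, y) position instead of a point.
p <- ggplot(toy_plot_data,
            aes(x = normalized_spdiv, y = funcdiv, label = island)) +
  geom_text() +
  labs(x = "Normalized species diversity", y = "Functional diversity") +
  theme_classic()
```

Printing p in the console displays the figure, and ggsave() can write it to a file for your submission package.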
Create One More Publication-Quality Figure:
- Explore another interesting relationship in the dataset (for example, one involving amounts or distributions).
- Use what you have learned in class to make the figure informative and aesthetically pleasing.