Final Project

Welcome to the final project! This is your chance to bring together everything you’ve learned (data wrangling, visualization, R programming, and reproducible research) into a single, polished analysis.

How You’ll Be Graded
| Criterion | Weight | What we’re looking for |
|---|---|---|
| Required elements | 40% | Does your report include every component listed below (YAML, introduction, analysis, visualization, references)? |
| Readability | 15% | Are axis labels, titles, and annotations large enough to read comfortably? Is the narrative clear and well-organized? |
| Aesthetics & creativity | 15% | Did you go beyond the minimum? Custom themes, smart color palettes, clean layouts, thoughtful narrative, etc. |
| Analysis depth | 20% | Is the analysis substantive? Did you use appropriate wrangling, transformations, and/or programming techniques from the course? |
| Reproducibility | 10% | Does your project render top-to-bottom from the zipped folder without errors? Are file paths relative? |

So, get your creativity (and code) fired up. Let’s begin!

Step 1: Project Setup

  1. Create a new RStudio project for this assignment.
  2. Organize your files. Your project folder should follow this structure:
final_project/
├── final_project.Rproj
├── data/
│   └── my_data.csv
├── final_project.qmd
├── references.bib
├── final_project.html    ← rendered output
└── final_project.pdf     ← rendered output

Use here::here() for all file paths so your project works on any machine (see Chapter 9):

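library(here)
library(readr)  # provides read_csv()

# Build the path from the project root so it works on any machine
my_data <- read_csv(here("data", "my_data.csv"))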

Step 2: Choose Your Data

Pick one of the following options:

  1. Option A: Analyze Your Own Data. Use a dataset you have collected or found online (from a reputable source). The dataset should not be one we used directly in class examples.

  2. Option B: Reproduce/Extend a Published Analysis. Find a published research paper that includes data analysis. Reproduce (as closely as possible) some of its key results, and then extend or improve upon the original analysis or figures.

If you don’t have your own data, here are some great sources:

  • Dryad: open ecological and evolutionary datasets
  • GBIF: global biodiversity records
  • TidyTuesday: curated weekly datasets for R practice
  • data.gov: U.S. government open data
  • Kaggle Datasets: a wide variety of datasets across domains

Pick something you find genuinely interesting; the project is more fun (and better) when you care about the data!

Step 3: Write the Quarto Report

Your final_project.qmd file must include the following sections. Think of it as a mini research paper:

YAML Header

Your Quarto document should begin with a proper YAML header:

---
title: "Your Descriptive Project Title"
author: "Your Name, UCLA"
date: today
format:
  html:
    theme: cosmo
    code-fold: true
  pdf:
    documentclass: article
bibliography: references.bib
---

The code-fold: true option collapses code blocks in the HTML output by default and lets readers expand them on demand, keeping the narrative clean (see Chapter 17).

Introduction

Provide a brief introduction that explains:

  • The purpose of your analysis and the research question(s) you are addressing.
  • The dataset you are using: where it comes from, what it contains, and why it’s interesting.
  • A roadmap of what the reader will find in the rest of the report.

Data Analysis

Conduct a substantive analysis to address your research question(s). Your analysis should demonstrate skills from the course. This could involve (but is not limited to):

  1. Data wrangling: importing, cleaning, joining, and transforming data using dplyr and tidyr (Chapters 7–9).
  2. Descriptive statistics: calculating summary statistics, group comparisons, etc.
  3. R programming: using vectorization, custom functions, or purrr to automate repetitive tasks (Chapters 19–22).
Important

Do not just show code and output; interpret your results. Explain what each analysis step reveals about your research question. The narrative is just as important as the code.
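
The three kinds of technique listed above can be combined in a few lines. The following is only a minimal sketch, assuming dplyr, tidyr, and purrr are installed; the data frame counts_wide, its columns, and the helper percent_change() are hypothetical stand-ins for your own data and functions.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data standing in for your own dataset
# (wide format: one column per year).
counts_wide <- tibble(
  site    = c("A", "A", "B", "B"),
  species = c("oak", "pine", "oak", "pine"),
  `2022`  = c(12, 30, 8, NA),
  `2023`  = c(15, 28, 11, 22)
)

# Data wrangling: reshape to long format, fix types, drop missing counts.
counts_long <- counts_wide |>
  pivot_longer(
    cols = c(`2022`, `2023`),
    names_to = "year",
    values_to = "count"
  ) |>
  mutate(year = as.integer(year)) |>
  filter(!is.na(count))

# Descriptive statistics: mean count and number of records per species.
species_summary <- counts_long |>
  group_by(species) |>
  summarise(mean_count = mean(count), n = n(), .groups = "drop")

# R programming: a small custom function, applied to each site with purrr.
# Returns the percent change in total count from the first to the last year.
percent_change <- function(df) {
  totals <- df |>
    group_by(year) |>
    summarise(total = sum(count), .groups = "drop") |>
    arrange(year)
  100 * (last(totals$total) - first(totals$total)) / first(totals$total)
}

change_by_site <- counts_long |>
  split(counts_long$site) |>
  map_dbl(percent_change)
```

In this toy example, site B shows a 312.5% increase, but part of that comes from the 2022 pine count being missing and dropped. That is the kind of caveat your narrative should spell out rather than leaving the number to speak for itself.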

Visualization

Include at least two publication-quality figures. Each figure must include:

  1. A descriptive title that communicates the main message (use labs(title = ...)).
  2. Readable text sizes: axis labels, titles, and annotations should be large enough to read comfortably. The defaults are almost always too small (Chapter 6).
  3. A non-default theme: pick one from jtools, hrbrthemes, ggthemr, etc. (Chapter 6).
  4. Export with ggsave(): specify explicit width and height (Chapter 6).
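
As a rough illustration of the four requirements above, here is one possible sketch, assuming ggplot2 is installed. The data frame growth and the output file name are hypothetical, and theme_minimal() is used only as a stand-in so the example runs without extra packages; swap in a theme from one of the packages named above.

```r
library(ggplot2)

# Hypothetical toy data; replace with your own wrangled data frame.
growth <- data.frame(
  week   = rep(1:4, times = 2),
  height = c(2.1, 3.4, 5.0, 6.8, 1.9, 2.6, 3.1, 3.9),
  group  = rep(c("fertilized", "control"), each = 4)
)

p <- ggplot(growth, aes(x = week, y = height, color = group)) +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  labs(
    # The title states the main message, not just the variable names.
    title = "Fertilized seedlings grew faster than controls",
    x = "Week",
    y = "Height (cm)",
    color = "Treatment"
  ) +
  # Stand-in for a non-default theme; base_size enlarges all text at once.
  theme_minimal(base_size = 16)

# Written to a temporary folder so the sketch runs anywhere; in your
# project, build the output path with here::here() instead.
out_file <- file.path(tempdir(), "seedling_growth.png")
ggsave(out_file, plot = p, width = 7, height = 5, units = "in", dpi = 300)
```

Setting width and height explicitly means the saved figure does not depend on the size of your plot pane, which helps the text stay readable at the size it will appear in the report.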

Remember the key lessons from visualization chapters:

  • Match plot type to data: amounts (Ch. 10), distributions (Ch. 11), trends (Ch. 12), associations (Ch. 13), or other types (Ch. 14).
  • Use color intentionally: pick a curated palette from MoMAColors, MetBrewer, PNWColors, etc. (Chapter 15).
  • Advanced techniques: consider multi-panel layouts with patchwork, interactive plots with plotly, or annotations for emphasis (Chapter 16).
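
A curated palette and a multi-panel layout might be combined along these lines. This is a sketch under stated assumptions: ggplot2, patchwork, and MetBrewer are installed, the birds data frame is simulated purely for illustration, and the "Egypt" palette is just one arbitrary choice.

```r
library(ggplot2)
library(patchwork)
library(MetBrewer)

# Hypothetical simulated data, for illustration only.
set.seed(1)
birds <- data.frame(
  habitat = rep(c("forest", "wetland", "urban"), each = 20),
  mass    = c(rnorm(20, 30, 4), rnorm(20, 45, 6), rnorm(20, 25, 3))
)

# Three colors from a curated MetBrewer palette.
pal <- met.brewer("Egypt", 3)

# Panel A: a distribution, shown as boxplots.
p_box <- ggplot(birds, aes(x = habitat, y = mass, fill = habitat)) +
  geom_boxplot(show.legend = FALSE) +
  scale_fill_manual(values = pal) +
  labs(x = NULL, y = "Body mass (g)") +
  theme_minimal(base_size = 14)

# Panel B: the same distribution, shown as overlapping densities.
p_dens <- ggplot(birds, aes(x = mass, fill = habitat)) +
  geom_density(alpha = 0.6) +
  scale_fill_manual(values = pal) +
  labs(x = "Body mass (g)", y = "Density", fill = "Habitat") +
  theme_minimal(base_size = 14)

# patchwork: `+` places the panels side by side; plot_annotation() adds
# a shared title and panel tags (A, B).
combined <- p_box + p_dens +
  plot_annotation(
    title = "Wetland birds are heavier than forest or urban birds",
    tag_levels = "A"
  )
```

Reusing the same palette in both panels keeps each habitat the same color across the figure, so the reader does not have to re-learn the legend.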

References

Cite at least two relevant sources using a BibTeX file (references.bib). You can use any citation style.

Store your citations in references.bib. In your Quarto document, cite them with @key for an in-text citation or [@key] for a parenthetical one (see Chapter 18):

According to @wickham2016, tidy data has a specific structure...

You can find BibTeX entries on Google Scholar (click the cite button → BibTeX) or from journal websites.
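
BibTeX entries for R itself and for the packages you used can also be generated from within R. The sketch below assumes knitr is installed; the file name packages.bib is hypothetical and is deliberately kept separate from references.bib, because write_bib() overwrites the file it writes to.

```r
library(knitr)

# Hypothetical output file, written to a temporary folder so the sketch
# runs anywhere. In your project you would place it next to references.bib.
bib_file <- file.path(tempdir(), "packages.bib")

# Write a BibTeX entry for R itself ("base"). Add the packages you used,
# e.g. c("base", "dplyr", "ggplot2").
write_bib("base", file = bib_file)

# Generated keys are prefixed with "R-", so R itself is cited as @R-base.
bib_lines <- readLines(bib_file)
```

If you keep the generated entries in their own file, one option is to list both files under bibliography in the YAML header; alternatively, copy the entries you need into references.bib by hand.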

Step 4: Render and Submit

  1. Render your Quarto document to both HTML and PDF formats.
  2. Verify reproducibility: close RStudio, reopen the project, and render again from scratch. Everything should work without errors.
  3. Zip the entire project folder and submit.
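
Rendering to both formats can also be scripted from the R console rather than clicking Render twice. This is a minimal sketch, assuming the quarto R package and the Quarto command-line tool are installed (PDF output additionally needs a LaTeX installation); render_both() is a hypothetical helper name, not part of any package.

```r
# Render one Quarto file to each requested format, one after the other.
# Stops early with an error if the file cannot be found, which usually
# means the working directory is not the project root.
render_both <- function(qmd, formats = c("html", "pdf")) {
  stopifnot(file.exists(qmd))
  for (fmt in formats) {
    quarto::quarto_render(input = qmd, output_format = fmt)
  }
  invisible(formats)
}

# Usage, run from the project root in a fresh R session (not executed here):
# render_both("final_project.qmd")
```

Running this in a freshly restarted session is a quick way to carry out the reproducibility check in step 2, since nothing left over in your environment can mask a missing library() call or a broken path.
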
Reproducibility Checklist

Before submitting, make sure: