Final Project
Welcome to the final project! This is your chance to bring together everything you've learned (data wrangling, visualization, R programming, and reproducible research) into a single, polished analysis.
- Your Mission: Perform a complete data analysis and produce a fully reproducible Quarto report with publication-quality figures.
- You'll submit:
- A zipped RStudio project folder containing your Quarto document, data, bibliography, and all rendered outputs.
- See Step 1 for the exact folder structure.
- Deadline: June 7th, 2026, 11:59 PM
| Criterion | Weight | What we're looking for |
|---|---|---|
| Required elements | 40% | Does your report include every component listed below (YAML, introduction, analysis, visualization, references)? |
| Readability | 15% | Are axis labels, titles, and annotations large enough to read comfortably? Is the narrative clear and well-organized? |
| Aesthetics & creativity | 15% | Did you go beyond the minimum? Custom themes, smart color palettes, clean layouts, thoughtful narrative, etc. |
| Analysis depth | 20% | Is the analysis substantive? Did you use appropriate wrangling, transformations, and/or programming techniques from the course? |
| Reproducibility | 10% | Does your project render top-to-bottom from the zipped folder without errors? Are file paths relative? |
So, get your creativity (and code) fired up. Let's begin!
Step 1: Project Setup
- Create a new RStudio project for this assignment.
- Organize your files so that your project folder follows this structure:
final_project/
├── final_project.Rproj
├── data/
│   └── my_data.csv
├── final_project.qmd
├── references.bib
├── final_project.html   ← rendered output
└── final_project.pdf    ← rendered output
Use here::here() for all file paths so your project works on any machine (see Chapter 9):
library(here)
library(readr)  # provides read_csv()
my_data <- read_csv(here("data", "my_data.csv"))
Step 2: Choose Your Data
Pick one of the following options:
Option A: Analyze Your Own Data. Use a dataset you have collected or found online (from a reputable source). The dataset should not be one we used directly in class examples.
Option B: Reproduce/Extend a Published Analysis. Find a published research paper that includes data analysis. Reproduce (as closely as possible) some of its key results, and then extend or improve upon the original analysis or figures.
If you don't have your own data, here are some great sources:
- Dryad: open ecological and evolutionary datasets
- GBIF: global biodiversity records
- TidyTuesday: curated weekly datasets for R practice
- data.gov: U.S. government open data
- Kaggle Datasets: a wide variety of datasets across domains
Pick something you find genuinely interesting; the project is more fun (and better) when you care about the data!
Step 3: Write the Quarto Report
Your final_project.qmd file must include the following sections. Think of it as a mini research paper:
YAML Header
Your Quarto document should begin with a proper YAML header:
---
title: "Your Descriptive Project Title"
author: "Your Name, UCLA"
date: today
format:
  html:
    theme: cosmo
    code-fold: true
  pdf:
    documentclass: article
bibliography: references.bib
---
The code-fold: true option lets readers expand code blocks on demand, keeping the narrative clean (see Chapter 17).
Introduction
Provide a brief introduction that explains:
- The purpose of your analysis and the research question(s) you are addressing.
- The dataset you are using: where it comes from, what it contains, and why it's interesting.
- A roadmap of what the reader will find in the rest of the report.
Data Analysis
Conduct a substantive analysis to address your research question(s). Your analysis should demonstrate skills from the course. This could involve (but is not limited to):
- Data wrangling: importing, cleaning, joining, and transforming data using dplyr and tidyr (Chapters 7–9).
- Descriptive statistics: calculating summary statistics, group comparisons, etc.
- R programming: using vectorization, custom functions, or purrr to automate repetitive tasks (Chapters 19–22).
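As a rough illustration of how these three skill areas might fit together, here is a minimal sketch that uses R's built-in mtcars dataset as a stand-in for your own data. The chosen columns and the helper name summarise_var() are illustrative assumptions, not required course code.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Wrangling: keep a few columns and reshape them to long format
cars_long <- mtcars |>
  as_tibble(rownames = "model") |>
  select(model, cyl, mpg, hp, wt) |>
  pivot_longer(cols = c(mpg, hp, wt), names_to = "variable", values_to = "value")

# Descriptive statistics: mean and SD of each variable within each cylinder group
group_summary <- cars_long |>
  group_by(cyl, variable) |>
  summarise(mean = mean(value), sd = sd(value), n = n(), .groups = "drop")

# R programming: apply one custom function to several columns with purrr
# (summarise_var() is a hypothetical helper used only in this sketch)
summarise_var <- function(x) {
  c(median = median(x), iqr = IQR(x))
}
robust_stats <- map(mtcars[c("mpg", "hp", "wt")], summarise_var)
```

In a real report, each of these objects would be followed by a sentence or two explaining what the numbers say about the research question.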
Do not just show code and output; interpret your results. Explain what each analysis step reveals about your research question. The narrative is just as important as the code.
Visualization
Include at least two publication-quality figures. Each figure must include:
- A descriptive title that communicates the main message (use labs(title = ...)).
- Readable text sizes: axis labels, titles, and annotations should be large enough to read comfortably. The defaults are almost always too small (Chapter 6).
- A non-default theme: pick one from jtools, hrbrthemes, ggthemr, etc. (Chapter 6).
- Export with ggsave(): specify explicit width and height (Chapter 6).
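The four requirements above could be met along the following lines. This is only a sketch: it plots the built-in mtcars data, uses ggplot2's own theme_minimal() as a stand-in for a theme from one of the packages listed above, and saves to a hypothetical file name.

```r
library(ggplot2)

# Scatter plot of the built-in mtcars data (a stand-in for your own dataset)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 3, alpha = 0.8) +
  labs(
    title = "Heavier cars travel fewer miles per gallon",
    x = "Weight (1,000 lbs)",
    y = "Fuel economy (miles per gallon)"
  ) +
  # Stand-in for a course theme; base_size enlarges all text elements at once
  theme_minimal(base_size = 16)

# Export with explicit dimensions (inches by default); the file name is hypothetical
ggsave("fig_weight_vs_mpg.png", plot = p, width = 7, height = 5, dpi = 300)
```

Setting base_size in the theme call is one convenient way to scale every text element together; individual elements can still be adjusted with theme() if a title or axis label needs more emphasis.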
Remember the key lessons from visualization chapters:
- Match plot type to data: amounts (Ch. 10), distributions (Ch. 11), trends (Ch. 12), associations (Ch. 13), or other types (Ch. 14).
- Use color intentionally: pick a curated palette from MoMAColors, MetBrewer, PNWColors, etc. (Chapter 15).
- Advanced techniques: consider multi-panel layouts with patchwork, interactive plots with plotly, or annotations for emphasis (Chapter 16).
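To make the palette and multi-panel ideas concrete, here is one possible sketch combining a MetBrewer palette with a patchwork layout. The mtcars data, the "Hiroshige" palette, and the panel choices are all illustrative assumptions; any curated palette or plot pair from your own analysis would work the same way.

```r
library(ggplot2)
library(patchwork)
library(MetBrewer)

# Three discrete colors from a curated palette ("Hiroshige" is one MetBrewer option)
cyl_colors <- met.brewer("Hiroshige", n = 3)

# Panel A: association between weight and fuel economy
p_scatter <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  scale_color_manual(values = cyl_colors, name = "Cylinders") +
  labs(x = "Weight (1,000 lbs)", y = "Miles per gallon")

# Panel B: distribution of fuel economy by cylinder count
p_box <- ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(cyl))) +
  geom_boxplot(show.legend = FALSE) +
  scale_fill_manual(values = cyl_colors) +
  labs(x = "Cylinders", y = "Miles per gallon")

# Place the panels side by side, tag them A and B, and add a shared title
combined <- p_scatter + p_box +
  plot_annotation(
    title = "Fuel economy drops with weight and cylinder count",
    tag_levels = "A"
  )
```

Reusing the same color vector in both panels keeps the cylinder groups visually consistent across the figure.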
References
Cite at least two relevant sources using a BibTeX file (references.bib). You can use any citation style.
Store your citations in references.bib, then cite them in your Quarto document with @key or [@key] syntax (see Chapter 18):
According to @wickham2016, tidy data has a specific structure...
You can find BibTeX entries on Google Scholar (click the Cite button, then BibTeX) or on journal websites.
Step 4: Render and Submit
- Render your Quarto document to both HTML and PDF formats.
- Verify reproducibility: close RStudio, reopen the project, and render again from scratch. Everything should work without errors.
- Zip the entire project folder and submit.
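One optional way to double-check the folder before zipping is a small script that confirms the files from the Step 1 structure are present. The function below is a hypothetical helper written for this sketch, not something provided by the course, and it assumes the example file names from Step 1 (your data file will likely have a different name).

```r
# check_project_files() is a hypothetical helper, not a course-provided function.
# It reports which of the files listed in the Step 1 structure exist.
check_project_files <- function(project_dir = ".") {
  expected <- c(
    "final_project.Rproj",
    "final_project.qmd",
    "references.bib",
    "final_project.html",
    "final_project.pdf",
    file.path("data", "my_data.csv")  # example name; replace with your data file
  )
  present <- file.exists(file.path(project_dir, expected))
  names(present) <- expected

  absent <- expected[!present]
  if (length(absent) > 0) {
    message("Missing files: ", paste(absent, collapse = ", "))
  } else {
    message("All expected files found.")
  }
  invisible(present)
}

# Run from the project root
check_project_files()
```

A check like this only confirms that files exist; it does not replace rendering the document from a fresh session.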
Before submitting, make sure: