BIOL 5404 Assignment 6

Author

Roz Dakin

Due on

March 30, 2025

Back to assignment list

Instructions: your own EDA

  • Create a new .R script to complete this assignment in your local assignments folder for this course.
  • You’ll submit the .R script AND the necessary data files on Brightspace.
  • Put your FIRST and LAST NAME in the file name of the script.
  • Put your name at the top of the script as well.
  • Please do not include your student ID, just your name is enough.
  • Make sure your script is organized and legible – use principles learned throughout this course.
  • Be sure to load all packages needed at the top of your script.

For this Assignment, you’ll submit your own draft EDA in a tidy and well-annotated R script. The EDA you present here will form basis of your final report. The goal of A6 is to get started on your EDA project and get feedback on your work, before you format it into a spiffy-looking report at the end of the term. After all, EDA is iterative!

As such, there are no questions for this assignment.

Before you begin, read the PDF “Instructions for the Independent Report” on Brightspace. The instructions PDF outlines the goals/criteria of your EDA and suggests some tips on choosing a dataset/question to work with.

Your EDA should include:

  • importing – make sure you use relative path(s)!
  • hygiene and tidying
  • transformation, as needed
  • wrangling steps, as needed
  • several visualizations

In your script for A6, include comments throughout that explain key points to the user, including what your research question(s) is/are, where your dataset comes from, and what specific actions you are performing in the script and why.

As with our other assignments, A6 will be evaluated out of 10:

  • 5 points for completeness (is the code doing what you intend?)
  • 5 points for clarity (can the user follow along/understand what’s being done?)
  • …and the graders will provide some commentary/feedback. The idea is that this feedback should be useful to you when finalizing your markdown report in April.

Example EDAs

EDA on pop from the year 2000

This is an example exploratory analysis of billboard-charting songs from the year 2000. It uses the billboard dataset that comes with the tidyr package within tidyverse. You can see the nicely formatted version here (made with markdown, and still work-in-progress!). Before making the formatted version, I started with a regular .R script, which you can get here. This should give you an idea of the general process.

Previous reports in this class

Here are some excellent examples of reports by previous students who were willing to share, for your benefit! They nicely show the diversity and key components of exploratory data analyses:

Please note that many of the examples above go to FAR FAR greater lengths than necessary for this course. This often happens when the project is part of the student’s primary or thesis research. In those cases, students often do a lot more than needed to fulfill the requirements for A6 and their final report.

You can also take a look at the files that generated some previous student projects here:

https://drive.google.com/drive/folders/1wzDkJAqyJUivATaQjAUnWieyxtrq5OkN?usp=sharing#

NOTE: for A6, you will not yet be creating a “pretty” markdown version yet. The goal for A6 is to draft your EDA as a regular-old R script, applying tools and concepts that we’ve been working on all term.

For A6, you should submit the following files on Brightspace:

  • the .R script of your EDA
  • any data files needed to make your .R script go!