BIOL 5404 Biological Data Science in R

Author

Roz Dakin

Course Description: This course introduces the practical skills needed to answer biological research questions with large and complex datasets in a reproducible manner. There is a growing demand for data analyses that are openly available, well annotated, and reusable by others. In this course, we will cover how to tidy, transform, visualize, model, and communicate datasets in R, with a focus on large and complex data. The course is aimed at graduate students with some prior experience with statistics at the undergraduate or graduate level, but no prior experience with R is required.

Professor: Dr. Roz Dakin
Office: CTTC 4440
Fridays, 11:35 - 14:25
In person, synchronous course
Southam Hall 303

Goals for this Course

  1. Develop proficiency in the R programming language
  2. Practice using R to explore, wrangle, graph, and model realistic (complex) research datasets
  3. Gain experience troubleshooting (analyzing & solving errors in R code)
  4. Apply (learn to use) a new data technique and/or package in R
  5. Create an independent data analysis following principles of reproducible research
  6. Use R Markdown to communicate your work

Resources

We will use two (free!) books, cited in the Weekly Schedule as "Irizarry" and "r4ds" (R for Data Science).

We may also use Advanced R 2e by Hadley Wickham


Assignments


Evaluation

Item                Weight  Details
Quizzes             10%     Short MC quizzes on Brightspace each week
Assignments         40%     6 total, only your best 5 will count
Peer evaluation     5%      On each assignment, you will grade your peers and provide feedback on their code
Independent report  45%     Communicate the results of your own data wrangling and exploratory analysis, done individually

Weekly Schedule

Week Date Topics Activities
1 Jan 10 Intro, philosophy, tidy data
  • Install R and RStudio
  • Read Roche et al.
  • Read Alston & Rick
  • Irizarry Section 1
  • r4ds Introduction
2 Jan 17 R basics
  • Irizarry Section 2
  • r4ds Section 2
  • Assignment #1
3 Jan 24 Programming basics
  • Read Noble 2009
  • Irizarry Sections 3 & 6
  • r4ds Section 6
  • Extra: r4ds Section 27
4 Jan 31 tidyverse & data
  • Irizarry Section 4
  • Assignment #2
5 Feb 7 tidyverse & data
  • Irizarry Sections 11-12
  • Read Broman & Woo
6 Feb 14 Visualization, part 1
  • Irizarry Sections 7-8
  • Assignment #3
WINTER BREAK
  • Read Kelleher & Wagener
  • Read Dekker & Silva
  • Work on book sections to come…
8 Feb 28 Visualization, part 2
  • Irizarry Sections 9-10
  • Assignment #4
9 Mar 7 Exploratory data analysis
  • r4ds Section 10
10 Mar 14 ** No formal meeting **
  • Assignment #5
11 Mar 21 Strings and dates
  • Irizarry Section 13
  • Irizarry 16.1-16.3, 16.5-16.6, 16.9
12 Mar 28 Markdown and Git
  • r4ds Section 28
  • Assignment #6
Apr 13 Independent report due

Late Policy

Work that is not submitted by the deadline will receive a grade of 0, unless you have made a specific agreement with me before the deadline.


Academic Integrity

As a class, we will comply with Carleton’s guidelines on academic integrity.

It’s OK (and encouraged) to work with other students on weekly assignments and quizzes in this course. I strongly encourage you to help each other out when troubleshooting in R. When you work together, strive to contribute so that the collaboration is mutually beneficial.

The exploratory data analysis and independent report must represent your own original investigation of the data. You may obtain the data from any source; there is no requirement that you generated it yourself.

As part of this course, you will also be expected to evaluate and provide feedback on work by your peers. Please respect each other. I expect you to contribute by providing your peers with thoughtful feedback, and to be fair and honest in your peer evaluations.


Statement on ChatGPT/Generative AI usage

I encourage you to use AI tools to help with troubleshooting in R.

As our understanding of the uses of AI and its relationship to student work and academic integrity continues to evolve, students are required to discuss any use of AI not described here with the course instructor, to ensure it supports the learning goals for the course.


Accommodations

Please review the course schedule and contact me with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need is known to exist.