A Data Carpentry Workshop

University of Zurich

May 30-31, 2016

9:00 am - 5:00 pm

Instructors: Lukas Weber, Frank Pennekamp, Geoffrey Fucile

Helpers: Felix Hartmann, Helen Lindsay, Malgorzata Nowicka, Gian-Marco Palmara, Stephan Schmeing, Charlotte Soneson, Stefan Wyder

General Information

Data Carpentry workshops are for any researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.

We will cover Data organization in spreadsheets, Introduction to the command line, Data analysis in R and Data visualization in R. Participants should bring their laptops and plan to participate actively. By the end of the workshop learners should be able to more effectively manage and analyze data and be able to apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at PhD students, postdocs, and other researchers in biology. No prior computational or programming experience is assumed.

Where: Irchel campus, University of Zurich. The first day of the workshop (30th May) will be held in room Y55-L-06/08, and the second day (31st May) in room Y34-J-01. A map of the Irchel campus showing the locations of buildings Y55 and Y34 is available here. Please leave some time to find your way around the campus if you are unfamiliar with it. Directions are available with Google Maps.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Contact: Please email mark.robinson@imls.uzh.ch for more information.


Preliminary Schedule

Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey

Day 1 (Monday, 30 May)

Morning: Data organization in spreadsheets
Afternoon: Introduction to the command line

Day 2 (Tuesday, 31 May)

Morning: Data analysis in R
Afternoon: Data visualization in R

Break times

Start: 9:00
Morning break: 10:30 - 11:00
Lunch break: 12:30 - 13:30
Afternoon break: 15:00 - 15:30
Finish: 17:00


Lesson content

Course materials will be based on the Data Carpentry Ecology Workshop and Genomics Workshop lessons, available on the Data Carpentry website.

  • Introduction to the command line: This session will be based on the Software Carpentry Unix Shell lesson. You will need a working Terminal application for the exercises. On Mac and Linux systems, this is installed by default (for Mac users: try searching for "Terminal" in Spotlight Search, or look for Terminal.app under Utilities in Applications). Windows users will need to install either MobaXterm or Git Bash. If you are using a Windows laptop, please download, install, and try to open one of these before the workshop. If time permits, we will also follow with a short introduction to R and RStudio using the R-ecology before-we-start lesson.
  • Data analysis in R: This session will follow the first four sections of the R ecology lesson to introduce you to data management and analysis in R. We then use the R section of the genomics lesson to introduce you to the dplyr package for data management and aggregation. The introduction to R is done without any specific dataset, whereas data.frames and dplyr will be demonstrated using the aggregated R-genomics dataset.
  • Data visualization in R: This session will follow the Data visualization with ggplot2 lesson materials from the Ecology Workshop. Note that there are two main plotting frameworks in R: base plotting and ggplot2. We will show you how to use ggplot2 since, in our opinion, it is the more powerful and flexible framework and makes it easier to create publication-quality plots. Data file for plotting examples: surveys.csv. Data files for additional FCS example: visne_marrow1_sub.fcs and cluster_labels_sub.txt.

Setup

To participate in a Data Carpentry workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of your workshop. Please bring and use your own laptop to ensure the proper setup of tools for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions; also see the additional setup information for the command line session above.

We maintain a list of common issues that occur during installation on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may also be useful if you run into problems during setup.