1 Welcome!

Quantitative skills are essential for biological research. We consider such skills to include the computational, statistical and mathematical techniques used to study life and living organisms, including aspects of big data, transparency and reproducibility in science. However, the breadth of quantitative techniques now employed in biology make it very likely that there may be no suitable local expertise within a student’s home institution or a biologist’s workplace.

This e-book consists of five independent chapters or “modules” designed to teach different quantitative skills to graduate students and biologists working in academia, government agencies and private organisations. A key theme is that while the techniques presented are from various disciplines (such as computer science, statistics and mathematics), they are presented in a way that is suitable for a biological audience. Our examples and approach reflect our personal use of these tools as researchers in biology.

Each module is designed to quickly get you up and running on the topic of interest in 3-5 hours. As such, these materials should be thought of a basic introduction. We provide pointers to more advanced materials, but our aim is to “jump-start” your use of these tools. On the other hand, these modules are not designed for absolute beginners. With the exception of the module on Git and Github, the materials are written assuming that learners have some basic familiarity with R, and a standard undergraduate background in calculus and univariate statistics.

Because the modules are designed to be independent (i.e., you don’t have read the whole e-book!), you can select subtopics that are directly relevant to your research. Our hope is that this approach will simultaneously provide more targeted training and reduce the time commitment that would be involved in more generalized formal courses. Pick and choose what you need!

The five modules are:

Git and GitHub (Andrew Edwards) – covers the tools widely used to share code, collaborate with colleagues and create a transparent record of research.
R Markdown (Andrew Edwards) – demonstrates how to create dynamic documents that are fully reproducible, with a clear link between the underlying R code and the resulting figures, tables and results.
Multivariate analysis: Clustering and Ordination (Kim Cuddington) – discusses multivariate quantitative methods that are used to understand data when there is more than one response (or measurement)
Machine learning and classification (Kim Cuddington) – continues explorations of multivariate data and in particular the topic of classification but using machine learning approaches.
Optimization (Brian Ingalls) – optimization is the act of identifying the extreme (cheapest, tallest, fastest, …) over a collection of possibilities. Applications include the manipulation (e.g. optimal harvesting or optimal drug dosing) and construction (e.g. robust synthetic genetic circuits) of biological systems, and experimental design.

1.1 Open science

In the spirit of open science (and using the tools introduced in the first two modules), this e-book was written collaboratively and openly in R Markdown, with files shared via GitHub here.

1.2 Accessibility Statement

We developed Building Skills in Quantitative Biology with a commitment to accessibility and usability for all learners.

The accessibility of these materials was assessed by the Centre for Extended Learning, University of Waterloo. This review was based on the WCAG 2.0 Guidelines at success criteria Level AA. The authors have addressed all known accessibility issues to the best of their abilities.

The following known accessibility issues persist and may cause difficulties for some persons with disabilities:

Code output is only distinguished from code input by 1. output has a black border box and 2. input is colour shaded
Internal links to figures and tables are only linked by the numbering (e.g., 5.1) of the object, rather than the full text (e.g., Figure 5.1)

Please let us know if you discover other issues on the issues tab for the github repository

1.3 Feedback

Despite extensive feedback from our student guinea pigs, we anticipate further revisions based on feedback, since we can consider this work as a living document. If you use these materials please take some time to let us know how they work for you, using this survey.

1.4 About the authors

Kim Cuddington is an Associate Professor at the University of Waterloo in the Department of Biology, with a cross-appointment to Applied Mathematics. She has several projects designed to improve quantitative education for biology students and has training in instructional design (University of Waterloo, Center for Teaching Excellence). She teaches courses that emphasize quantitative methods (such as Quantitative Ecology and Mathematical Modelling in Biology). Her research involves the use of mathematics, statistics, and computational approaches to answer questions regarding population and ecosystem dynamics, invasive species, and impacts of climate change – see www.ecotheory.ca. Kim has a PhD in biology yet was once told she was too mathematical for a position in a biology department.

Andrew Edwards is a Research Scientist with the Department of Fisheries and Oceans Canada (DFO) at the Pacific Biological Station in Nanaimo, British Columbia, and holds an Adjunct Professor position in the Department of Biology at the University of Victoria. He has previously developed in-person workshops on Git, GitHub and R Markdown (tools that are widely used in DFO), and co-developed a quantitative biology course at Dalhousie University. Current work includes developing methods for fitting size spectra to data, conducting fisheries stock assessments (including for Pacific Hake, the largest groundfish stock off the west coast of North America), and recent guidelines to using mathematical notation in ecology. Andy has a PhD in applied mathematics yet was once told he was too biological for a position in a math department.

Brian Ingalls is an Associate Professor in the Department of Applied Mathematics at the University of Waterloo, cross-appointed to the Departments of Biology and Chemical Engineering. He is the author of Mathematical Modeling in System Biology: An Introduction (2013), and has previously taught undergraduate and graduate courses on quantitative techniques in biology (including Computational Modelling of Cellular Systems and Mathematical Cell Biology). Brian has a PhD in mathematics and everyone has always been too in awe of him to make remarks about what positions he is qualified for.

1.5 Acknowledgments

We acknowledge support from the Government of Ontario through a grant from eCampusOntario, and the support of DFO and the Faculty of Science, University of Waterloo. The grant and the Faculty of Science each funded a student to help with creating the project. We thank Luwen Chang and Matthew Zhou for their amazing learning curves, and subsequent help coding up the modules. The eCampusOntario grant also funded several students to evaluate an early version of the materials. Lina Aragon Baquero, Lauren Banks, Madison Brook, Jacob Burbank, Nicole Gauvreau and Aranksha Dilip Thakor provided valuable feedback.

The Git and GitHub module builds upon workshop materials that were originally developed with Chris Grandin (DFO), who AME also thanks for assistance with the module.

1.6 Funding

This project is made possible with funding by the Government of Ontario and through eCampusOntario’s support of the Virtual Learning Strategy. To learn more about the Virtual Learning Strategy visit: https://vls.ecampusontario.ca.

1.7 Citation

Please cite this work as:

Cuddington, K, Edwards, A.M., and Ingalls, B. (2022). Building Skills in Quantitative Biology. https://www.quantitative-biology.ca

1.8 License

Kim Cuddington, Andrew M. Edwards, and Brian Ingalls. Building Skills in Quantitative Biology (https://www.quantitative-biology.ca) is available under an Ontario Commons License and a CC BY-NC-SA 4.0. . Contact kcudding AT uwaterloo DOT ca for more information

All materials licensed under the Ontario Commons License (Version 1.0) unless otherwise specified. The license deed is available to read at https://vls.ecampusontario.ca/wp-content/uploads/2021/01/Ontario-Commons-License-1.0.pdf.

All materials are also licensed under the Creative Commons BY-NC-SA 4.0 license unless otherwise specified. The license deed is available to read at https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.