Home

Getting started at CPU

The Computational Proteomics Unit web page: https://lgatto.github.io/cpu-lab/

Essential tools

Learn git (and a bit more about git) and GitHub.
Learn the shell
An editor is arguably the most important piece of software for computational work. Choose one wisely: mastering it may take time, but it is definitely a good investment.

R programming

Style guide

Consistency is key.
R installation.
Coding style: we follow the Bioconductor coding style. Also, use TRUE/FALSE instead of T/F.
Use <- for assignments.
Use a leading dot in the names of internal functions: .internalFunction.
We generally prefer camel case. Snake case would be fine for a set of related internal helper functions: something like is_scalar_character, is_logical_character, ... that all return a logical(1). Never ever mix (exported) snake and camel case within one package. Remember, consistency is key.
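The conventions above can be sketched in a few lines of R. This is only an illustration: makeLabel, .trimLabel and is_scalar_logical are hypothetical names invented for the example, and only is_scalar_character comes from the list above.

```r
## Related internal helpers may share snake case; each returns a logical(1).
is_scalar_character <- function(x) is.character(x) && length(x) == 1L
is_scalar_logical <- function(x) is.logical(x) && length(x) == 1L

## Internal function (hypothetical): the name starts with a dot.
.trimLabel <- function(x) {
    stopifnot(is_scalar_character(x))
    trimws(x)
}

## Exported function (hypothetical): camel case, <- for assignment, and
## TRUE/FALSE spelled out. T and F are ordinary variables that can be
## reassigned, whereas TRUE and FALSE are reserved words.
makeLabel <- function(x, upper = FALSE) {
    stopifnot(is_scalar_logical(upper))
    label <- .trimLabel(x)
    if (upper)
        label <- toupper(label)
    label
}

makeLabel("  peptide ", upper = TRUE)  # "PEPTIDE"
```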

Editor

Does your editor know R? If you use emacs, go for ess; if you use vim, look at the vim R plugin. See also RStudio.

OO programming

For OO programming, prefer S4 over S3. If relevant, use S4 Reference Classes. Consider R6, but discuss/motivate your choice.
Only use generics and methods when a plain function will not do, i.e. when the same function name is used for different classes (within the same package or across packages).
Before writing a new generic, check whether it already exists in BiocGenerics or ProtGenerics, and consider asking for the new generic to be added to one of those if relevant.
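As a minimal sketch of the case where a generic is justified, the example below uses one function name for two classes. The classes Peptides and Proteins and the generic nEntries are hypothetical, made up for this illustration. The isGeneric() guard only checks what is visible in the current session; in a real package, look in BiocGenerics and ProtGenerics first and import the generic from there if it exists.

```r
library(methods)

## Two hypothetical S4 classes.
setClass("Peptides", slots = c(sequences = "character"))
setClass("Proteins", slots = c(accessions = "character"))

## Create the generic only if no generic of that name is visible yet.
if (!isGeneric("nEntries"))
    setGeneric("nEntries",
               function(object, ...) standardGeneric("nEntries"))

## One method per class: same function name, different classes.
setMethod("nEntries", "Peptides",
          function(object, ...) length(object@sequences))
setMethod("nEntries", "Proteins",
          function(object, ...) length(object@accessions))

pep <- new("Peptides", sequences = c("AAK", "LGR", "MTK"))
prot <- new("Proteins", accessions = c("P12345", "Q67890"))

nEntries(pep)   # 3
nEntries(prot)  # 2
```

If only one class ever needed this functionality, a plain function taking the object as its argument would be enough.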

R package development

Use Authors@R to define authors and their respective roles in the DESCRIPTION file.
Making R packages: maker.
devtools and roxygen.
Note: roxygen is valuable beyond package development; documenting functions with it outside of a package is also recommended.
Use git/GitHub - all discussion about the package (software architecture, bugs, vignettes, ...) should be done through GitHub issues.
Use maker to automate and standardise development.
Use a .Rbuildignore file to bundle only what is needed (see below).
Use testthat for unit testing.
Use covr for coverage.
Use travis-ci and codecov for continuous integration. Ideally also test on Windows using appveyor.
Use BiocStyle::html_document2() with floating toc for vignettes.
Write a README.md file. If it contains R code (which would be a good thing), use a README.Rmd file with a pre-commit hook, and use make README from maker to build the md file. Add the Rmd file to .Rbuildignore so that only the md file is kept in the package bundle (tarball).
Write a NEWS.md file, then generate the NEWS file from it using make NEWS from maker. Add NEWS.md to .Rbuildignore and the latter to .gitignore(s).
Use pkgdown to generate the package webpage. Do not add the docs directory to the Bioconductor svn server (see https://lgatto.github.io/branch-specific-gitignore/ for details)
rOpenSci Packages: Development, Maintenance, and Peer Review
R packages by Hadley Wickham
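Two items from the list above, Authors@R and .Rbuildignore, can be sketched in plain R. The author names and email address are placeholders, and isIgnored is a hypothetical helper that only mimics how exclude patterns are applied (one Perl-style regular expression per line, matched case-insensitively against paths relative to the package root); it is not what R CMD build itself runs.

```r
## The Authors@R field of DESCRIPTION holds R code that evaluates to a
## vector of person objects, each with explicit roles.
authors <- c(
    person("Jane", "Doe", email = "jane.doe@example.org",
           role = c("aut", "cre")),
    person("John", "Smith", role = "ctb"))
format(authors, include = c("given", "family", "role"))

## Patterns as they would appear in .Rbuildignore, one per line.
ignorePatterns <- c("^README\\.Rmd$", "^NEWS\\.md$")

## Hypothetical helper: TRUE if any pattern matches the path.
isIgnored <- function(path, patterns)
    any(vapply(patterns, grepl, logical(1), x = path,
               perl = TRUE, ignore.case = TRUE))

files <- c("DESCRIPTION", "README.md", "README.Rmd",
           "NEWS", "NEWS.md", "R/utils.R")
bundled <- files[!vapply(files, isIgnored, logical(1),
                         patterns = ignorePatterns)]
bundled  # "DESCRIPTION" "README.md" "NEWS" "R/utils.R"
```

Anchoring each pattern with ^ and $ keeps README.md and NEWS in the bundle while excluding their Rmd and md sources.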

General resources about research software and computing

Reproducible research

Before being reproducible, your research should be organised.
Sweave and knitr, for LaTeX and R Markdown [1, 2] vignettes.
orgmode (emacs only).
Make
How we make our papers replicable, by Titus Brown.
pandoc, the document converter.
Using knitr and pandoc to create reproducible scientific reports by Peter Humburg.
A Reproducibility Reading List

Resources

Teaching material
R packages by Hadley Wickham
Advanced R by Hadley Wickham
R Programming for Bioinformatics by Robert Gentleman (ask me for the pdf)
Managing Research Software Projects
rOpenSci Packages: Development, Maintenance, and Peer Review

Modern and digital scholarship

How to be a modern scientist is a nice introduction to many aspects of modern, open digital scholarship that are valued and applied at CPU.

Lab meetings

We recently (November 2016) introduced official lab meetings in addition to more casual and daily interactions. Here's some advice on lab meeting code reviews.
