Package Conventions

Andrew Butler edited this page Jun 11, 2018 · 15 revisions

Want to add code to Seurat? This document describes the expected behavior and naming conventions for functions and their parameters.

Code Style

In general, we try to follow Google's R Style Guide, with the following exceptions and additions:

  • Use curly braces for every if/else statement, for/while loop, and function definition (including anonymous functions used in apply calls)
  • Function documentation should use Roxygen syntax
  • All arguments, except the ..., should be named in function calls (e.g., use print(x = 'hello') instead of print('hello'))
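The three rules above can be seen together in a short sketch. RescaleUnit is a hypothetical function made up for illustration; it is not part of Seurat.

```r
#' Rescale a numeric vector to the unit interval
#'
#' @param x A numeric vector
#' @param na.rm Remove NA values when computing the range
#'
#' @return A numeric vector with values between 0 and 1
#'
RescaleUnit <- function(x, na.rm = TRUE) {
  # x is passed through the ... of range(), so it stays unnamed
  x.range <- range(x, na.rm = na.rm)
  if (x.range[1] == x.range[2]) {
    return(rep(x = 0, times = length(x = x)))
  } else {
    return((x - x.range[1]) / (x.range[2] - x.range[1]))
  }
}

# Anonymous functions passed to apply-style calls also get curly braces
squares <- sapply(X = 1:3, FUN = function(i) {
  return(i ^ 2)
})
```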

Diagnostic Output

Parameter Name

In the function definition, include a verbose parameter that lets the user control whether diagnostic output is printed. It should take a boolean (TRUE or FALSE).

Printing to console

If the function writes any output to the console, there are several ways to print messages. Here's what you should use and when:

  • message: Should be the default, except in the cases listed below.
  • cat: Use when designing a show method or any print.* S3 methods.
  • warning: Use when the function is allowed to proceed but the user should be notified (higher priority notification than message)
  • stop: Use when the function should quit and print this message.
  • print: Never use print
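As a rough sketch of how these choices fit together with the verbose parameter, consider the function below. LogNormalizeVector and its scale.factor argument are hypothetical names used only for illustration.

```r
# Hypothetical function illustrating the console output conventions
LogNormalizeVector <- function(x, scale.factor = 10000, verbose = TRUE) {
  if (!is.numeric(x = x)) {
    # stop: the function cannot continue
    stop("'x' must be numeric")
  }
  if (any(x < 0)) {
    # warning: the function proceeds, but the user should know
    warning("Negative values found; setting them to 0")
    x[x < 0] <- 0
  }
  if (verbose) {
    # message: the default for diagnostic output
    message("Normalizing ", length(x = x), " values")
  }
  return(log1p(x = x / sum(x) * scale.factor))
}
```

Because the diagnostic line goes through message, callers can also silence it with suppressMessages without touching warnings or errors.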

Progress Bars

Progress bars are great. If you want to add a progress bar to a function that uses:

For Loops

Initialize the progress bar, setting max to the number of loop iterations (n here), since the default max is 1:

pb <- txtProgressBar(min = 0, max = n, char = '=', style = 3)

And update in each loop iteration with:

setTxtProgressBar(pb = pb, value = i)
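Putting the two calls together, a complete loop might look like the sketch below. ColMeansLoop is a hypothetical helper; the sketch assumes the bar should only appear when verbose is TRUE, passes max explicitly (txtProgressBar defaults to max = 1), and closes the bar when the loop finishes.

```r
# Hypothetical helper: column means computed in a loop, with a progress bar
ColMeansLoop <- function(mat, verbose = TRUE) {
  n <- ncol(x = mat)
  results <- numeric(length = n)
  # txtProgressBar requires max > min, so skip the bar for empty input
  show.progress <- verbose && n > 0
  if (show.progress) {
    pb <- txtProgressBar(min = 0, max = n, char = '=', style = 3)
  }
  for (i in seq_len(length.out = n)) {
    results[i] <- mean(x = mat[, i])
    if (show.progress) {
      setTxtProgressBar(pb = pb, value = i)
    }
  }
  if (show.progress) {
    # Closing the bar ends the line it was drawn on
    close(con = pb)
  }
  return(results)
}
```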

apply

Use functions from the pbapply package.
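One possible pattern, sketched here under the assumption that the progress bar should follow the verbose parameter, is to pick between pbsapply from pbapply and base sapply at run time. SquareAll is a hypothetical function used only for illustration.

```r
# Hypothetical function: choose the apply variant based on verbose
SquareAll <- function(x, verbose = TRUE) {
  my.sapply <- if (verbose) {
    pbapply::pbsapply
  } else {
    sapply
  }
  return(my.sapply(X = x, FUN = function(i) {
    return(i ^ 2)
  }))
}
```

The pb* functions mirror the argument names of their base counterparts (X, FUN), so the call itself does not change.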

C++ code

Use the RcppProgress package. Include the header file in the .cpp file.

#include <progress.hpp>
// [[Rcpp::depends(RcppProgress)]]

In the function, create the progress bar with

Progress p(num_iterations, display_progress); 

and update with

p.increment();

Plotting Functions

Plotting functionality should be kept separate from any computation where possible. If the function can optionally produce plots (e.g. FindVariableGenes), the parameter controlling this should be named do.plot. When possible, plotting functions should use the ggplot2 framework over base R or other custom plotting frameworks.

Return value

All plotting functions that use ggplot2 to generate their plots should return the ggplot object. For constructing composite ggplot plots, we recommend using the plot_grid function from the cowplot package.
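A minimal sketch of this pattern follows, assuming ggplot2 and cowplot are installed. PanelSketch and the toy data frame are hypothetical; only ggplot, aes, geom_point, ggtitle and plot_grid are real package functions.

```r
library(ggplot2)
library(cowplot)

# Hypothetical helper: builds one panel and returns the ggplot object
PanelSketch <- function(data, feature) {
  p <- ggplot(data = data, mapping = aes(x = .data[['x']], y = .data[[feature]])) +
    geom_point() +
    ggtitle(label = feature)
  return(p)
}

# Toy data, made up for illustration
toy <- data.frame(x = 1:5, a = c(2, 4, 1, 3, 5), b = c(5, 3, 4, 1, 2))

# Build one panel per feature, then combine them into a composite plot
plots <- lapply(X = c('a', 'b'), FUN = function(f) {
  return(PanelSketch(data = toy, feature = f))
})
combined <- plot_grid(plotlist = plots, ncol = 2)
```

Returning the object instead of printing it lets the caller add layers or themes before the plot is drawn.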

Parameter options for all plots

All plotting functions should have the following parameter options (if applicable) to allow for easy manipulation of the plot:

  • plot.title - accepts a string for the plot title
  • pt.size - accepts a numeric to adjust the point size
  • reduction.use - accepts a string to select the dimensional reduction to use
  • group.by - accepts a string to set the grouping variable
  • remove.legend - accepts a boolean to toggle the legend off, default should be FALSE
  • legend.position - accepts a string to change the legend position
  • dark.theme - accepts a boolean to toggle on the dark theme, default should be FALSE
  • cells.use - accepts a vector of cell names to allow plotting a subset of cells
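To illustrate how a few of these options might be wired into a function signature, here is a sketch covering plot.title, pt.size, remove.legend and legend.position. ScatterPlotSketch and the column names x, y and group are assumptions made for the example, not Seurat code.

```r
library(ggplot2)

# Hypothetical plotting function; 'data' is a data frame with columns x, y and group
ScatterPlotSketch <- function(
  data,
  plot.title = NULL,
  pt.size = 1,
  remove.legend = FALSE,
  legend.position = 'right'
) {
  p <- ggplot(data = data, mapping = aes(x = x, y = y, color = group)) +
    geom_point(size = pt.size) +
    theme(legend.position = legend.position)
  if (!is.null(x = plot.title)) {
    p <- p + ggtitle(label = plot.title)
  }
  if (remove.legend) {
    # remove.legend takes precedence over legend.position
    p <- p + theme(legend.position = 'none')
  }
  # Return the ggplot object so the caller can modify it further
  return(p)
}
```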

Accessing and Modifying internal slots

All access to internal Seurat object slots should be done through the provided accessor (GetXXX) and mutator (SetXXX) functions.

Common Parameter Names

  • seed.use - sets the random seed for any function that uses randomness. Preferred default is 42.
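For example, a function that draws a random sample could expose seed.use as sketched below. SampleCells is a hypothetical function used only for illustration.

```r
# Hypothetical function that samples cell names reproducibly
SampleCells <- function(cells, n, seed.use = 42) {
  set.seed(seed = seed.use)
  return(sample(x = cells, size = n))
}

cell.names <- paste0('cell', 1:10)
first <- SampleCells(cells = cell.names, n = 3)
second <- SampleCells(cells = cell.names, n = 3)
```

With the same seed.use, first and second contain the same cells.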