Merge pull request #325 from nf-core/dev
Release 4.0
ggabernet committed Apr 23, 2024
2 parents 5c9a30b + 8e90b1f commit 2f492b0
Showing 45 changed files with 203 additions and 173 deletions.
25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,31 @@
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## [4.0] - 2024-04-22 Ascendio

### `Added`

- [#319](https://github.com/nf-core/airrflow/pull/319) Added AIRR compliance badge

### `Fixed`

- [#319](https://github.com/nf-core/airrflow/pull/319) Fix test full profile and nebnext_umi_tcr profile.
- [#321](https://github.com/nf-core/airrflow/pull/321) Label Dowser tips by isotype instead of c_call by default.
- [#322](https://github.com/nf-core/airrflow/pull/322) Use RAxML as the default builder for dowser. Skip lineage trees by default.

### `Dependencies`

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| enchantr | 0.1.11 | 0.1.14 |

### `Deprecated parameters`

- `--skip_lineage_trees` is now deprecated in favor of `--lineage_trees`. Lineage trees are skipped by default.
- `--igphyml` is deprecated in favor of `--lineage_tree_exec`. All lineage tree building software that is part of Dowser is now supported.
- `--igblast_base` is deprecated in favor of `--reference_igblast`.
- `--imgtdb_base` is deprecated in favor of `--reference_fasta`.
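
A minimal sketch of how a launch script could translate the deprecated flags listed above, assuming a hypothetical helper named `migrate_flag` (it is not part of nf-core/airrflow). It only renames flags; whether the accompanying values carry over unchanged is an assumption to check against the usage documentation. Because lineage trees are now skipped by default, the old `--skip_lineage_trees` flag is dropped rather than renamed.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of nf-core/airrflow: print the 4.0 name
# for a deprecated 3.x flag, based only on the changelog entries above.
migrate_flag() {
    case "$1" in
        --igblast_base) printf '%s\n' '--reference_igblast' ;;
        --imgtdb_base)  printf '%s\n' '--reference_fasta' ;;
        --igphyml)      printf '%s\n' '--lineage_tree_exec' ;;
        --skip_lineage_trees)
            # Trees are skipped by default in 4.0, so the flag is dropped
            # and nothing is printed on stdout; a note goes to stderr.
            printf '%s\n' 'note: --skip_lineage_trees dropped (trees are skipped by default)' >&2 ;;
        *)
            # Any other flag passes through unchanged.
            printf '%s\n' "$1" ;;
    esac
}
```

For example, `migrate_flag --imgtdb_base` prints `--reference_fasta`, and a flag that is not in the list, such as `--outdir`, is printed unchanged.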

## [3.3.0] - 2024-03-31 Confringo

### `Added`
36 changes: 4 additions & 32 deletions CITATIONS.md
@@ -32,42 +32,10 @@

> Gupta, N. T., Vander Heiden, J. A., Uduman, M., Gadala-Maria, D., Yaari, G., & Kleinstein, S. H. (2015). Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data: Table 1. Bioinformatics, 31(20), 3356–3358.
- [Alakazam](https://doi.org/10.1126/scitranslmed.3008879)

> Stern, J. N. H., Yaari, G., Vander Heiden, J. A., Church, G., Donahue, W. F., Hintzen, R. Q., … O’Connor, K. C. (2014). B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Science Translational Medicine, 6(248).
- [SCOPer](https://doi.org/10.1093/bioinformatics/bty235)

> Nouri N, Kleinstein S (2018). “A spectral clustering-based method for identifying clones from high-throughput B cell repertoire sequencing data.” Bioinformatics, i341-i349.
> Nouri N, Kleinstein S (2020). “Somatic hypermutation analysis for improved identification of B cell clonal families from next-generation sequencing data.” PLOS Computational Biology, 16(6), e1007977.
> Gupta N, Adams K, Briggs A, Timberlake S, Vigneault F, Kleinstein S (2017). “Hierarchical clustering can identify B cell clones with high confidence in Ig repertoire sequencing data.” The Journal of Immunology, 2489-2499.
- [Dowser](https://doi.org/10.1371/journal.pcbi.1009885)

> Hoehn K, Pybus O, Kleinstein S (2022). “Phylogenetic analysis of migration, differentiation, and class switching in B cells.” PLoS Computational Biology.
- [IgPhyML](https://www.pnas.org/doi/10.1073/pnas.1906020116)

> Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination.” PNAS.
- [IgBLAST](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692102/)

> Ye J, Ma N, Madden TL, Ostell JM. (2013). IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res.
- [Fastp](https://doi.org/10.1093/bioinformatics/bty560)

> Shifu Chen, Yanqing Zhou, Yaru Chen, Jia Gu, fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics. 2018 Sept 1; 34(17), i884–i890. doi: 10.1093/bioinformatics/bty560.
- [pRESTO](https://doi.org/10.1093/bioinformatics/btu138)

> Vander Heiden, J. A., Yaari, G., Uduman, M., Stern, J. N. H., O’Connor, K. C., Hafler, D. A., … Kleinstein, S. H. (2014). pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics, 30(13), 1930–1932.
- [SHazaM, Change-O](https://doi.org/10.1093/bioinformatics/btv359)

> Gupta, N. T., Vander Heiden, J. A., Uduman, M., Gadala-Maria, D., Yaari, G., & Kleinstein, S. H. (2015). Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics, 31(20), 3356–3358.
- [Alakazam](https://doi.org/10.1126/scitranslmed.3008879)

> Stern, J. N. H., Yaari, G., Vander Heiden, J. A., Church, G., Donahue, W. F., Hintzen, R. Q., … O’Connor, K. C. (2014). B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Science Translational Medicine, 6(248).
@@ -88,6 +56,10 @@

> Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination.” PNAS, 116(45), 22664-22672.
- [RAxML](https://doi.org/10.1093/bioinformatics/btu033)

> Stamatakis A. (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9): 1312-1313.
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
19 changes: 15 additions & 4 deletions README.md
@@ -19,6 +19,7 @@
[![Follow on Twitter](http://img.shields.io/badge/twitter-%40nf__core-1DA1F2?labelColor=000000&logo=twitter)](https://twitter.com/nf_core)
[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)
[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)
[![AIRR compliant](https://img.shields.io/static/v1?label=AIRR-C%20sw-tools%20v1&message=compliant&color=008AFF&labelColor=000000&style=plastic)](https://docs.airr-community.org/en/stable/swtools/airr_swtools_standard.html)

## Introduction

@@ -32,7 +33,7 @@ On release, automated continuous integration tests run the pipeline on a full-si

## Pipeline summary

nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single cell targeted sequencing data. Several protocols are supported, please see the [usage documentation](https://nf-co.re/airrflow/usage) for more details on the supported protocols.
nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single cell targeted sequencing data. Several protocols are supported; please see the [usage documentation](https://nf-co.re/airrflow/usage) for more details. The pipeline has been certified as [AIRR compliant](https://docs.airr-community.org/en/stable/swtools/airr_swtools_compliant.html) by the AIRR Community, which means that it is compatible with downstream analysis tools that also support the AIRR format.

![nf-core/airrflow overview](docs/images/metro-map-airrflow.png)

@@ -58,7 +59,7 @@ nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single

2. V(D)J annotation and filtering (bulk and single-cell)

- Assign gene segments with `IgBlast` using the IMGT database (`Change-O AssignGenes`).
- Assign gene segments with `IgBlast` using a germline reference (`Change-O AssignGenes`).
- Annotate alignments in AIRR format (`Change-O MakeDB`)
- Filter by alignment quality (locus matching v_call chain, min 200 informative positions, max 10% N nucleotides)
- Filter productive sequences (`Change-O ParseDB split`)
@@ -80,8 +81,8 @@ nf-core/airrflow allows the end-to-end processing of BCR and TCR bulk and single
4. Clonal analysis (bulk and single-cell)

- Find threshold for clone definition (`SHazaM`, `EnchantR`).
- Create germlines and define clones, repertoire analysis (`Change-O`, `EnchantR`).
- Build lineage trees (`SCOPer`, `IgphyML`, `EnchantR`).
- Create germlines and define clones, repertoire analysis (`SCOPer`, `EnchantR`).
- Build lineage trees (`Dowser`, `IgPhyML`, `RAxML`, `EnchantR`).

5. Repertoire analysis and reporting

@@ -124,6 +125,16 @@ nextflow run nf-core/airrflow \
--outdir ./results
```

For common commercially available **bulk sequencing protocols**, we provide pre-set profiles that specify primers, UMI length, etc. Please check the [Supported protocol profiles](#supported-protocol-profiles) for a full list of available profiles. An example command running the NEBNext UMI protocol profile with Docker containers is:

```bash
nextflow run nf-core/airrflow \
-profile nebnext_umi,docker \
--mode fastq \
--input input_samplesheet.tsv \
--outdir results
```

A typical command to run the pipeline from **single cell raw fastq files** (10X genomics) is:

```bash
2 changes: 1 addition & 1 deletion assets/multiqc_config.yml
@@ -1,5 +1,5 @@
report_comment: >
This report has been generated by the <a href="https://github.com/nf-core/airrflow/releases/tag/3.3.0" target="_blank">nf-core/airrflow</a>
This report has been generated by the <a href="https://github.com/nf-core/airrflow/releases/tag/4.0" target="_blank">nf-core/airrflow</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://nf-co.re/airrflow" target="_blank">documentation</a>.
4 changes: 4 additions & 0 deletions assets/repertoire_comparison.Rmd
@@ -423,6 +423,10 @@ In addition, citations for the tools and data used in this pipeline are as follo

> Hoehn K, Van der Heiden J, Zhou J, Lunter G, Pybus O, Kleinstein S (2019). “Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination.” PNAS.
- [RAxML](https://doi.org/10.1093/bioinformatics/btu033)

> Stamatakis A. (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9): 1312-1313.
- [TIgGER](https://doi.org/10.1073/pnas.1417683112)

> Gadala-maria, D., Yaari, G., Uduman, M., & Kleinstein, S. H. (2015). Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles. Proceedings of the National Academy of Sciences, 112(8), 1–9.
1 change: 0 additions & 1 deletion conf/clontech_umi_tcr.config
@@ -40,5 +40,4 @@ params {

// TCR options
clonal_threshold = 0
skip_lineage = true
}
7 changes: 3 additions & 4 deletions conf/modules.config
@@ -562,10 +562,9 @@ process {
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
ext.args = ['build':'igphyml',
'minseq':5,
'traits':'c_call',
'tips':'c_call']
ext.args = ['minseq':5,
'traits':'isotype',
'tips':'isotype']
}

// -------------------------------
1 change: 0 additions & 1 deletion conf/nebnext_umi_tcr.config
@@ -37,5 +37,4 @@ params {

//TCR options
clonal_threshold = 0
skip_lineage
}
5 changes: 3 additions & 2 deletions conf/test.config
@@ -23,8 +23,8 @@ params {
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/Metadata_test_airr.tsv'
cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/C_primers.fasta'
vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/V_primers.fasta'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

mode = 'fastq'

@@ -35,6 +35,7 @@ params {
umi_position = 'R1'
index_file = true
isotype_column = 'c_primer'
lineage_trees = true
}

process{
5 changes: 3 additions & 2 deletions conf/test_assembled_hs.config
@@ -19,13 +19,14 @@ params {
// Input data
mode = 'assembled'
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_hs.tsv'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

reassign = true
productive_only = true
collapseby = 'filename'
cloneby = 'subject_id'
remove_chimeric = true
lineage_trees = true
}

5 changes: 2 additions & 3 deletions conf/test_assembled_immcantation_devel_hs.config
@@ -19,9 +19,8 @@ params {
// Input data
mode = 'assembled'
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_hs.tsv'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
igphyml = '/usr/local/share/igphyml/src/igphyml'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

reassign = true
productive_only = true
5 changes: 2 additions & 3 deletions conf/test_assembled_immcantation_devel_mm.config
@@ -19,9 +19,8 @@ params {
// Input data
mode = 'assembled'
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_mm.tsv'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
igphyml = '/usr/local/share/igphyml/src/igphyml'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

reassign = true
productive_only = true
6 changes: 4 additions & 2 deletions conf/test_assembled_mm.config
@@ -19,13 +19,15 @@ params {
// Input data
mode = 'assembled'
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-reveal/test_assembled_metadata_mm.tsv'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

reassign = true
productive_only = true
collapseby = 'filename'
cloneby = 'subject_id'
remove_chimeric = true

lineage_trees = true
}

5 changes: 2 additions & 3 deletions conf/test_clontech_umi.config
@@ -23,10 +23,9 @@ params {
// Input data
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-clontech/samplesheet.tsv'

imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

clonal_threshold = 0.1
skip_lineage = true

}
27 changes: 21 additions & 6 deletions conf/test_full.config
@@ -18,22 +18,37 @@ params {
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/metadata_pcr_umi_airr_300.tsv'
cprimers = 's3://ngi-igenomes/test-data/airrflow/pcr_umi/cprimers.fasta'
vprimers = 's3://ngi-igenomes/test-data/airrflow/pcr_umi/vprimers.fasta'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

lineage_trees = true

// Other params
library_generation_method = 'specific_pcr_umi'
cprimer_position = 'R1'
umi_length = 15
umi_start = 0
umi_position = 'R1'
isotype_column = 'c_primer'
}

process {
withName:DOWSER_LINEAGES{
ext.args = ['build':'igphyml',
'minseq':5,
'traits':'c_primer',
'tips':'c_primer']
ext.args = ['minseq':5,
'traits':'isotype',
'tips':'isotype']
}

withName:DEFINE_CLONES_COMPUTE{
ext.args = ['outname':'', 'model':'hierarchical',
'method':'nt', 'linkage':'single',
'min_n':30]

}
withName:DEFINE_CLONES_REPORT{
ext.args = ['outname':'', 'model':'hierarchical',
'method':'nt', 'linkage':'single',
'min_n':30]

}
}
5 changes: 2 additions & 3 deletions conf/test_nebnext_umi.config
@@ -24,10 +24,9 @@ params {
// Input data
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-neb/samplesheet.tsv'

imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

clonal_threshold = 0.1
skip_lineage = true

}
4 changes: 2 additions & 2 deletions conf/test_no_umi.config
@@ -30,8 +30,8 @@ params {
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Metadata_test-no-umi_airr.tsv'
cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Greiff2014_CPrimers.fasta'
vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-no-umi/Greiff2014_VPrimers.fasta'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'


}
4 changes: 2 additions & 2 deletions conf/test_nocluster.config
@@ -23,8 +23,8 @@ params {
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/Metadata_test_airr.tsv'
cprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/C_primers.fasta'
vprimers = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-bcr/V_primers.fasta'
imgtdb_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
igblast_base = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'
reference_fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip'
reference_igblast = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip'

mode = 'fastq'
