What do we need for a 1.0 release #64

Open
brooksph opened this issue Mar 17, 2018 · 7 comments

Comments

@brooksph
Contributor

Expected behavior

Actual behavior

Steps to reproduce the behavior

@charlesreid1
Member

charlesreid1 commented Mar 20, 2018

  • command line interface - a friendlier, more structured way to run Snakemake rules (cf. 2018-snakemake-cli)
  • tests (covered on page 5 of the report) - regression tests, fixing any critical software versions, additional test metrics
  • documentation - provided via command line utility help, and examples/walkthroughs
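
As a rough illustration of the command line interface item above, the sketch below wraps a Snakemake invocation in an `argparse` front end that also supplies the built-in `--help` text. The program name `dahak-run`, the target name `multiqc`, and the file `config.yaml` are hypothetical and are not taken from dahak or dahak-taco; only the `workflows/<name>/Snakefile` layout and the standard `snakemake` options `--snakefile`, `--configfile`, and `-n` are real.

```python
import argparse
import shlex


def build_parser():
    """Return a parser for a hypothetical `dahak-run` wrapper around Snakemake."""
    parser = argparse.ArgumentParser(
        prog="dahak-run",  # hypothetical command name, not an existing tool
        description="Run one target of a dahak Snakemake workflow.",
    )
    parser.add_argument("workflow", help="workflow directory, e.g. read_filtering")
    parser.add_argument("target", help="Snakemake rule or output file to build")
    parser.add_argument("--configfile", help="YAML or JSON config file")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="show what would run without executing it")
    return parser


def build_command(args):
    """Translate parsed arguments into a snakemake command line (list of str)."""
    # The target goes directly after the Snakefile so that it cannot be
    # mistaken for an extra value of a later option.
    cmd = ["snakemake", "--snakefile",
           f"workflows/{args.workflow}/Snakefile", args.target]
    if args.configfile:
        cmd += ["--configfile", args.configfile]
    if args.dry_run:
        cmd.append("-n")
    return cmd


# Example: show the command for a dry run of a hypothetical `multiqc` target.
example = build_parser().parse_args(
    ["read_filtering", "multiqc", "--configfile", "config.yaml", "-n"]
)
print(shlex.join(build_command(example)))
```

Keeping command construction in its own function means it can be covered by regression tests without running Snakemake itself.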

@brooksph
Contributor Author

brooksph commented Mar 20, 2018

written and working:
Snakefiles

debugging:

to do:
Snakefiles

  1. Read filtering (https://github.com/dahak-metagenomics/dahak/blob/master/workflows/read_filtering/Snakefile)
    • Create rule for FastQC before trimming
    • Create rule for FastQC after trimming
    • Create rule for trimming with Trimmomatic
    • Create rule for combining FastQC reports with MultiQC
    • Create config file (maybe YAML or JSON) to specify inputs and outputs
  2. Assembly (https://github.com/dahak-metagenomics/dahak/blob/master/workflows/assembly/Snakefile)
    • Split Snakefile (MEGAHIT, SPAdes, and MultiQC)
    • Snakefile 1: MEGAHIT assembly
      • Create rule for assembly with MEGAHIT
      • Create rule for assembly evaluation with QUAST
    • Snakefile 2: SPAdes assembly
      • Create rule for assembly with SPAdes
      • Create rule for assembly evaluation with QUAST
    • Snakefile 3: MultiQC assembly evaluation
      • Create rule for merging QUAST reports into a single report
  3. Mapping and variant calling (in progress; see "Create read mapping and variant calling snakefile" #46)
  4. Taxonomic classification (see https://github.com/charlesreid1/dahak-flot/blob/master/Snakefile and https://github.com/charlesreid1/dahak-flot/tree/master/rules)
    • Split Snakefile (sourmash and Kaiju)
    • Snakefile 1: Taxonomic classification with sourmash
    • Snakefile 2: Taxonomic classification with Kaiju
  5. Functional inference
    • Snakefile 1: Functional annotation of reads/contigs with mi-faser
    • Snakefile 2: Identification of antibiotic resistance genes with ABRicate
    • Snakefile 3: Identification of antibiotic resistance genes with SRST2
  6. Metagenomic comparison
    • Snakefile 1: Comparison of sourmash signatures representing reads/contigs using sourmash compare
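
To make the read filtering item's "config file to specify inputs and outputs" more concrete, here is a minimal sketch of what such a config might contain and how the files requested from the four planned rules could be derived from it. It assumes paired-end reads; every key, sample name, directory, and file suffix is hypothetical and does not reflect the actual dahak layout.

```python
import json
from pathlib import PurePosixPath

# Hypothetical JSON config; all keys and values are illustrative only.
CONFIG_JSON = """
{
  "samples": ["sample_A", "sample_B"],
  "out_dir": "results/read_filtering"
}
"""


def expected_outputs(config):
    """List the files the four read-filtering rules would be asked to produce."""
    out = PurePosixPath(config["out_dir"])
    targets = []
    for sample in config["samples"]:
        for mate in ("1", "2"):  # assumes paired-end reads
            # FastQC report on the raw reads (before trimming)
            targets.append(str(out / "fastqc_raw" / f"{sample}_{mate}_fastqc.html"))
            # Reads trimmed with Trimmomatic
            targets.append(str(out / "trimmed" / f"{sample}_{mate}.trim.fq.gz"))
            # FastQC report on the trimmed reads
            targets.append(str(out / "fastqc_trimmed" / f"{sample}_{mate}.trim_fastqc.html"))
    # A single MultiQC report combining all FastQC reports
    targets.append(str(out / "multiqc_report.html"))
    return targets


config = json.loads(CONFIG_JSON)
for path in expected_outputs(config):
    print(path)
```

In a Snakefile, a list like this would typically feed the inputs of a top-level target rule, so that changing the sample list in the config is enough to change what gets built.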

Documentation

Data set generation

  • Shakya subsets
    • 10
    • 25
    • 50
  • Shakya viral and eukaryotic short-read spike-in
  • Shakya viral and eukaryotic long- and short-read spike-in
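
One way the Shakya subsets could be generated reproducibly is sketched below. It assumes that 10, 25, and 50 are percentages of reads, which the list above does not state; the function name and the toy records are hypothetical.

```python
import random


def subsample_fastq(lines, percent, seed=0):
    """Keep roughly `percent` percent of the 4-line FASTQ records in `lines`."""
    rng = random.Random(seed)  # fixed seed so a subset can be regenerated
    kept = []
    # Walk the input one complete 4-line record at a time; a trailing
    # partial record, if any, is ignored.
    for i in range(0, len(lines) - len(lines) % 4, 4):
        if rng.random() * 100 < percent:
            kept.extend(lines[i:i + 4])
    return kept


# Toy input: eight made-up records, not real Shakya data.
reads = []
for n in range(8):
    reads += [f"@read{n}", "ACGT", "+", "IIII"]

for percent in (10, 25, 50):
    subset = subsample_fastq(reads, percent, seed=42)
    print(percent, len(subset) // 4, "of", len(reads) // 4, "records kept")
```

For paired-end data, calling the function with the same seed on both mate files selects the same record positions in each, provided the two files hold the same number of records in the same order.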

Potential punt for 1.X release

  • Jupyter notebooks with example analyses
    • Note: Many of these have been started and may be finished in time, but they are not essential for the 1.0 release

@charlesreid1
Member

A git clone of this repo currently takes upwards of 5 minutes on my machine. I think this is something we should solve before a 1.0 release. Would you be open to that? If so, I'll start a thread where we can discuss options.

@brooksph
Contributor Author

Yes, we should address that. Please start an issue for that and feel free to add to the list.

@charlesreid1
Member

charlesreid1 commented Mar 23, 2018

  • create a docs/ directory to hold the dahak Sphinx documentation (see dahak-taco)
  • create a cli/ or taco/ directory to hold the dahak command line interface (see dahak-taco)
  • convert a few existing Snakefile workflows in the repository into the atomic rules-based format used in dahak-taco, so that they can be run from the command line

See rules/ dir of dahak-taco.

@charlesreid1
Member

Also @brooksph here is the taco "project" that I mentioned: https://github.com/charlesreid1/dahak-taco/projects

@charlesreid1
Member

Just to follow up here:

  • We released version 1.0 beta of dahak-taco on Friday 4/27
  • We presented taco 1.0 beta at the DIB Lab and got some good feedback on the design, which we're working on incorporating. Taco also seemed to scratch an itch that other software doesn't, so there was solid interest in the tool from lab members. Good sign!
  • We have solid documentation and have gotten good feedback on how to improve it
  • We have structured the repos to make testing easier (separating the taco repo from the workflow repos, so taco can be tested separately from individual workflows)
  • We will be soliciting further rounds of feedback from other end users throughout the month of May in preparation for the 1.0 rollout.
