Biohansel does not always correctly pair dataset collections in output #217

Open
DarianHole opened this issue Apr 7, 2022 · 0 comments

When the files used to build a paired collection of reads have no extension (e.g. .fastq), the final biohansel output does not pair them, even though the dataset collection itself is paired correctly.

If I had to guess (and if I remember correctly), this is likely due to how the get_paired_fastq_filename function interacts with the $input.paired_collection.<forward|reverse> name, which I believe uses the name of the underlying dataset used to make the collection rather than the name in the collection itself.

Example follows:

Dataset 1 -> Correctly pairs output:

  • File names used to make up the paired collection:

    • TestX_R1.fastq && TestX_R2.fastq
    • TestY_R1.fastq && TestY_R2.fastq
  • Paired Collection (looks exactly the same as in Dataset 2):
    image

  • Output:

TestX | heidelberg | 0.5.0

Dataset 2 -> Outputs are separated

  • File names used to make up the paired collection:

    • TestX_R1 && TestX_R2
    • TestY_R1 && TestY_R2
  • Paired Collection:
    image

  • Output:

TestX_R1 | heidelberg | 0.5.0
TestX_R2 | heidelberg | 0.5.0
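
The difference between the two datasets would be consistent with a pairing step that only recognises read names ending in a FASTQ extension. The sketch below is purely illustrative: PAIRED_RE, sample_name and group_reads are hypothetical names, and this is not the actual get_paired_fastq_filename code from the wrapper. It only shows how an extension-dependent pattern could reproduce the outputs listed above.

```python
import re
from collections import defaultdict

# Hypothetical pattern (an assumption, not biohansel's real one):
# a sample name, a _R1/_R2 (or _1/_2) marker, then a required FASTQ extension.
PAIRED_RE = re.compile(
    r"^(?P<sample>.+?)_R?(?P<read>[12])\.(?:fastq|fq)(?:\.gz)?$"
)


def sample_name(filename: str) -> str:
    """Return the sample name if the filename matches, else the filename unchanged."""
    match = PAIRED_RE.match(filename)
    return match.group("sample") if match else filename


def group_reads(filenames):
    """Group read files by the sample name derived from each filename."""
    groups = defaultdict(list)
    for name in filenames:
        groups[sample_name(name)].append(name)
    return dict(groups)


# Dataset 1: extensions present, so both reads collapse to one sample.
print(group_reads(["TestX_R1.fastq", "TestX_R2.fastq"]))

# Dataset 2: no extension, so each read is treated as its own sample.
print(group_reads(["TestX_R1", "TestX_R2"]))
```

If the real cause is similar, either making the extension optional in the pattern or deriving the name from the collection element rather than the underlying dataset might resolve it, but that would need to be checked against the wrapper itself.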

DarianHole added the bug label Apr 8, 2022