Pipeline Test files #216
My apologies for my late reply! Generally we use Grandeur with fasta files for two things:
I don't have this built into Grandeur (it's a long story, but a lot of sites are blocked locally, such as the ENA).

For phylogenetic analysis, this is what we use for testing with GitHub Actions (I'm assuming you're curious about the phylogenetic analysis):

Step 1. Get fasta files for the same species (they need to share 1500 genes)
mkdir fastas
cd fastas
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/013/783/245/GCA_013783245.1_ASM1378324v1/GCA_013783245.1_ASM1378324v1_genomic.fna.gz && gzip -d GCA_013783245.1_ASM1378324v1_genomic.fna.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/026/626/185/GCA_026626185.1_ASM2662618v1/GCA_026626185.1_ASM2662618v1_genomic.fna.gz && gzip -d GCA_026626185.1_ASM2662618v1_genomic.fna.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/020/808/985/GCA_020808985.1_ASM2080898v1/GCA_020808985.1_ASM2080898v1_genomic.fna.gz && gzip -d GCA_020808985.1_ASM2080898v1_genomic.fna.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/904/863/225/GCA_904863225.1_KSB1_6J/GCA_904863225.1_KSB1_6J_genomic.fna.gz && gzip -d GCA_904863225.1_KSB1_6J_genomic.fna.gz
cd ../

Step 2A. Then run the workflow
nextflow run . -profile docker,msa --fastas fastas

OR

Step 2B. Create the list of fastas and then run the workflow
Instead of pointing the workflow to a directory, a list of fasta files can be used. This option is required when using cloud resources.

Creating the fasta list
ls fastas/* > fastas.txt

fastas.txt should have file contents like so
fastas/GCA_013783245.1_ASM1378324v1_genomic.fna
fastas/GCA_026626185.1_ASM2662618v1_genomic.fna
fastas/GCA_020808985.1_ASM2080898v1_genomic.fna
fastas/GCA_904863225.1_KSB1_6J_genomic.fna

Running the workflow
nextflow run . -profile docker,msa --fasta_list fastas.txt

Step 3. Looking at results:
This gives a summary file with 1-2 key results from each analysis.
There is also a newick file generated with iqtree2:
(GCA_020808985.1_ASM2080898v1_genomic:0.0038882302,(((GCA_013783245.1_ASM1378324v1_genomic:0.0035602662,GCA_026626185.1_ASM2662618v1_genomic:0.0030635049)67.8/75:0.0004092349,Klebsiella_pneumoniae_GCF_000240185.1:0.0032600322)100/100:0.0009262356,Klebsiella_pneumoniae_GCF_022869665.1:0.0046128026)99.6/99:0.0005900806,GCA_904863225.1_KSB1_6J_genomic:0.0039588357);

A SNP matrix generated via snp-dists:
snp-dists 0.8.2,GCA_020808985.1_ASM2080898v1_genomic,GCA_013783245.1_ASM1378324v1_genomic,Klebsiella_pneumoniae_GCF_022869665.1,GCA_026626185.1_ASM2662618v1_genomic,Klebsiella_pneumoniae_GCF_000240185.1,GCA_904863225.1_KSB1_6J_genomic
GCA_020808985.1_ASM2080898v1_genomic,0,26202,26340,24554,25128,24777
GCA_013783245.1_ASM1378324v1_genomic,26202,0,26648,21896,22221,26669
Klebsiella_pneumoniae_GCF_022869665.1,26340,26648,0,26209,26246,26393
GCA_026626185.1_ASM2662618v1_genomic,24554,21896,26209,0,20967,25626
Klebsiella_pneumoniae_GCF_000240185.1,25128,22221,26246,20967,0,25609
GCA_904863225.1_KSB1_6J_genomic,24777,26669,26393,25626,25609,0

And more. More information can be found on our wiki pages: https://github.com/UPHL-BioNGS/Grandeur/wiki/Phylogenetic-Analysis, https://github.com/UPHL-BioNGS/Grandeur/wiki/USAGE#fasta-files, and https://github.com/UPHL-BioNGS/Grandeur/wiki/phylogenetic_analysis.
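If it helps when reading the SNP matrix, here is a minimal sketch (not part of Grandeur) that reports the pair of samples with the fewest SNP differences. It assumes the matrix is comma-separated with one header row and one row per sample, as in the output pasted above. The function name `closest_pair` and the file path in the usage comment are hypothetical.

```shell
#!/bin/sh
# closest_pair FILE
# Hypothetical helper: print "sampleA,sampleB,distance" for the two distinct
# samples with the smallest distance in a comma-separated snp-dists matrix.
closest_pair() {
  awk -F',' '
    # Header row: remember the sample name that labels each column.
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; next }
    {
      for (i = 2; i <= NF; i++) {
        if (name[i] == $1) continue            # skip the diagonal (self vs self)
        if (!found || $i + 0 < best) {         # keep the smallest distance seen
          best = $i + 0; a = $1; b = name[i]; found = 1
        }
      }
    }
    END { if (found) printf "%s,%s,%d\n", a, b, best }
  ' "$1"
}

# Usage (the path is an assumption; point it at wherever the matrix was saved):
#   closest_pair path/to/snp_matrix.txt
```

Each pair appears twice in a symmetric matrix, so the sketch simply keeps the first occurrence of the minimum rather than trying to walk only one triangle.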
Did this work for you?
Hello Developer,
Can you please provide at least a test.config with test fasta files to run this pipeline and understand the output?
Thank you.
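For anyone setting up test inputs along the lines of Step 2B in the reply, a small sanity check on the fasta list may save a failed run. This is only a sketch, not something Grandeur provides: the function name `check_fasta_list` is hypothetical, and it assumes the listed files are already decompressed (as after `gzip -d`) and that it is run from the directory the paths in `fastas.txt` are relative to.

```shell
#!/bin/sh
# check_fasta_list LIST
# Hypothetical helper: return non-zero if any path in LIST is missing, empty,
# or does not begin with a ">" FASTA header line.
check_fasta_list() {
  cfl_status=0
  while IFS= read -r cfl_path || [ -n "$cfl_path" ]; do
    [ -n "$cfl_path" ] || continue               # ignore blank lines
    if [ ! -s "$cfl_path" ]; then
      echo "missing or empty: $cfl_path" >&2
      cfl_status=1
      continue
    fi
    IFS= read -r cfl_first < "$cfl_path"         # first line of the fasta file
    case $cfl_first in
      ">"*) ;;                                   # looks like a FASTA header
      *) echo "no FASTA header: $cfl_path" >&2; cfl_status=1 ;;
    esac
  done < "$1"
  return "$cfl_status"
}

# Usage, following Step 2B:
#   ls fastas/* > fastas.txt
#   check_fasta_list fastas.txt && echo "fasta list looks OK"
```

The check only looks at the first line of each file, so it catches files that are still gzipped or were truncated to nothing, but it does not validate the sequences themselves.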