Merge pull request #75 from FelixKrueger/dev
Updating docs and Switch over to v8 MGP annotation
FelixKrueger committed Jan 6, 2023
2 parents 430f3c7 + 57f80d5 commit 0d59a2b
Showing 21 changed files with 1,127 additions and 712 deletions.
20 changes: 20 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,20 @@
name: docs
on:
  push:
    branches:
      - master
      - dev
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: 3.x
      - uses: actions/cache@v2
        with:
          key: ${{ github.ref }}
          path: .cache
      - run: pip install mkdocs-material pillow cairosvg
      - run: mkdocs gh-deploy --force
10 changes: 8 additions & 2 deletions CHANGELOG.md
@@ -1,8 +1,14 @@
## v0.5.0dev
## v0.6.0 (Release 07 01 2023)

- Restructured the documentation, using `mkdocs`. The new User Guide lives at this address: http://felixkrueger.github.io/SNPsplit/

- Reworked all of SNPsplit to reflect changes of the [Mouse Genomes Project](https://www.mousegenomes.org/). This includes the overdue switch-over to the latest v8 annotation (available [here](https://ftp.ebi.ac.uk/pub/databases/mousegenomes/REL-2112-v8-SNPs_Indels/)) and the GRCm39 mouse genome build

- Kept the old v5 (and v7) genome build instructions as [legacy documentation](http://felixkrueger.github.io/SNPsplit/genome_prep/legacy/)

### SNPsplit

Added an option `--single_end` to skip the paired-end auto-detection entirely (which failed for e.g. alignments with STAR [see here](https://github.com/FelixKrueger/SNPsplit/pull/56)).
- Added an option `--single_end` to skip the paired-end auto-detection entirely (which failed for e.g. alignments with STAR [see here](https://github.com/FelixKrueger/SNPsplit/pull/56))

### SNPsplit_genome_preparation

24 changes: 7 additions & 17 deletions README.md
@@ -1,25 +1,15 @@
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/snpsplit/README.html)

<p align="center"> <img title="SNPsplit" id="logo_img" src="Images/SNPsplit.png" width=300></p>
<p align="center"> <img title="SNPsplit" id="logo_img" src="docs/images/SNPsplit.png" width=300></p>

## Allele-specific alignment sorting

> **See the documentation: https://felixkrueger.github.io/SNPsplit**
**Update December 2022:**

SNPsplit has now been updated to work with the new release of the [Mouse Genomes Project](https://www.mousegenomes.org/). This means that it will now assume the GRCm39 mouse genome build by default, and use the latest SNP annotation file (v8: [mgp_REL2021_snps.vcf.gz](https://ftp.ebi.ac.uk/pub/databases/mousegenomes/REL-2112-v8-SNPs_Indels/mgp_REL2021_snps.vcf.gz)).

### Note for using a UCSC/NCBI genome in conjunction with the VCF file from the Mouse Genomes Project

Several users have run into problems when using genomes from UCSC or NCBI in conjunction with the VCF file from the Mouse Genomes Project (MGP, https://www.mousegenomes.org/).
The reason for this is that the MGP uses chromosomal coordinates from Ensembl (i.e. `1, 2, 3, X, MT`) whereas UCSC uses chromosome names that look like this: `chr1, chr2, chr3, chrX, chrM`.

We have recently added a check to the SNPsplit genome preparation script that will bail if a chromosome name discrepancy is detected (https://github.com/FelixKrueger/SNPsplit/issues/4). It is, however, possible to convert the VCF file into a UCSC-compatible version by
(a) changing the chromosome name from e.g. `1` to `chr1` and (b) changing the chromosome names in the ID field of the VCF file headers. It is normally not necessary to change the name of the mitochondrion from `MT` to `chrM` because no SNP positions are recorded for the MT anyway.
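
To illustrate what such a discrepancy looks like, here is a minimal Python sketch of a chromosome name check. It is not the check used by the SNPsplit genome preparation script. The function names `chromosome_names_match` and `diagnose` are hypothetical, and the sketch assumes the chromosome names have already been collected from the genome FASTA headers and from the first column of the VCF file.

```python
def chromosome_names_match(genome_names, vcf_names):
    """Return True if at least one chromosome name occurs in both sets."""
    return bool(set(genome_names) & set(vcf_names))


def diagnose(genome_names, vcf_names):
    """Describe the most likely reason for a chromosome name mismatch."""
    if chromosome_names_match(genome_names, vcf_names):
        return "ok"
    genome_has_chr = any(name.startswith("chr") for name in genome_names)
    vcf_has_chr = any(name.startswith("chr") for name in vcf_names)
    if genome_has_chr and not vcf_has_chr:
        return "genome uses UCSC-style names, VCF uses Ensembl-style names"
    if vcf_has_chr and not genome_has_chr:
        return "VCF uses UCSC-style names, genome uses Ensembl-style names"
    return "no shared chromosome names"


# Example: a UCSC genome combined with the Ensembl-style MGP VCF file
print(diagnose(["chr1", "chr2", "chrX", "chrM"], ["1", "2", "X", "MT"]))
```

A script could stop with an error whenever `diagnose` returns anything other than `"ok"`, which is roughly the behaviour described above.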

Here is a one line `awk` script that does an Ensembl=>UCSC conversion, but you could of course also run an equivalent script in Python or Perl...
```
awk '{if($1 ~ "^#") {gsub("contig=<ID=", "contig=<ID=chr"); gsub("contig=<ID=chrMT", "contig=<ID=chrM"); print} else {gsub("^MT", "M"); print "chr"$0}}' mgp_REL2021_snps.vcf.gz
```
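
For anyone who prefers Python, the following is a rough equivalent of the one-liner above, offered as a sketch rather than a tested replacement. The function names and the output file name are hypothetical. Unlike the `awk` version, it reads the gzip-compressed VCF file directly, and it only renames `MT` when it is followed by a tab (the `awk` pattern `^MT` is slightly broader).

```python
import gzip


def ensembl_to_ucsc_line(line):
    """Convert one VCF line from Ensembl-style to UCSC-style chromosome names."""
    if line.startswith("#"):
        # Header line: prefix contig IDs with "chr", then rename chrMT to chrM
        line = line.replace("contig=<ID=", "contig=<ID=chr")
        return line.replace("contig=<ID=chrMT", "contig=<ID=chrM")
    # Data line: the chromosome name is the first tab-separated field
    if line.startswith("MT\t"):
        line = "M" + line[2:]
    return "chr" + line


def convert_vcf(in_path, out_path):
    """Stream a gzipped VCF file through the conversion, line by line."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            dst.write(ensembl_to_ucsc_line(line))


# Usage (the output file name is only an example):
# convert_vcf("mgp_REL2021_snps.vcf.gz", "mgp_REL2021_snps.UCSC.vcf.gz")
```

Processing the file line by line keeps memory usage low, which matters for a VCF file of this size.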

## Installation

@@ -29,16 +19,16 @@ SNPsplit requires the following tools installed and ideally available in the `PA
- [Samtools](http://samtools.sourceforge.net/)

## Documentation
The SNPsplit documentation can be found here: [SNPsplit User Guide](./SNPsplit_User_Guide.md)
The SNPsplit documentation can be found here: [SNPsplit User Guide](https://felixkrueger.github.io/SNPsplit)

## Links
- SNPsplit publication at F1000 Research:
* https://f1000research.com/articles/5-1479/v2

- Here is a link to the [SNPsplit project site](https://www.bioinformatics.babraham.ac.uk/projects/SNPsplit/) at the Babraham Institute.
- Here is a link to the [SNPsplit project site](https://www.bioinformatics.babraham.ac.uk/projects/SNPsplit/) at the Babraham Institute

## Credits

SNPsplit was written by Felix Krueger, as part of the [Babraham Bioinformatics](https://www.bioinformatics.babraham.ac.uk) group.
SNPsplit was written by Felix Krueger at [Babraham Bioinformatics](https://www.bioinformatics.babraham.ac.uk), now part of Altos Bioinformatics.

<p align="center"> <img title="Babraham Bioinformatics" id="logo_img" src="Images/bioinformatics_logo.png" width=300></p>
<p align="center"> <img title="Babraham Bioinformatics" id="logo_img" src="docs/images/bioinformatics_logo.png" width=200></p>
6 changes: 3 additions & 3 deletions SNPsplit
@@ -23,7 +23,7 @@ use Cwd;
## along with this program. If not, see <http://www.gnu.org/licenses/>.

## Reading in a BAM or SAM file
my $pipeline_version = '0.5.1dev';
my $pipeline_version = '0.6.0';
my $parent_dir = getcwd;
my $full_commandline = join (" ","SNPsplit",@ARGV);
# warn "Full commandline: $full_commandline\n"; sleep(5);
@@ -1788,8 +1788,8 @@ sub process_commandline{
SNPsplit - Allele-specific alignment sorting
Version: $pipeline_version
Copyright 2014-22 Felix Krueger
Babraham Bioinformatics
Copyright 2014-23 Felix Krueger
Altos Bioinformatics
https://github.com/FelixKrueger/SNPsplit