Calculate ortholog relationships in Reactome #38

Open
wants to merge 17 commits into
base: master

@cthoyt cthoyt commented Dec 31, 2020

This PR adds a script for identifying orthologous pathways in Reactome based on lexical matching. All Reactome pathway identifiers have the form R-{SPECIES CODE}-{PATHWAY CODE}, so pathways from any two species can be matched by splitting each identifier on the dash (-) and then matching the pathway codes.
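
The matching idea can be sketched roughly as follows. This is a minimal illustration and not the script in this PR: the function names `split_reactome_id` and `find_orthologous_pairs` are hypothetical, the example identifiers are only illustrative, and it assumes every identifier has exactly three dash-separated parts.

```python
from collections import defaultdict
from itertools import combinations


def split_reactome_id(identifier):
    """Split ``R-{SPECIES CODE}-{PATHWAY CODE}`` into (species code, pathway code)."""
    # Assumption: exactly three dash-separated parts; anything else raises ValueError.
    prefix, species, code = identifier.split("-")
    if prefix != "R":
        raise ValueError(f"unexpected identifier format: {identifier}")
    return species, code


def find_orthologous_pairs(identifiers):
    """Return sorted pairs of identifiers that share a pathway code."""
    # Group identifiers by pathway code, ignoring the species code.
    by_code = defaultdict(set)
    for identifier in identifiers:
        _species, code = split_reactome_id(identifier)
        by_code[code].add(identifier)

    # Two distinct identifiers with the same pathway code must differ in
    # species code, so every pair within a group is a cross-species match.
    pairs = []
    for members in by_code.values():
        pairs.extend(combinations(sorted(members), 2))
    return sorted(pairs)


# Illustrative identifiers only (hypothetical input, not results from this PR).
example_ids = ["R-HSA-109581", "R-MMU-109581", "R-RNO-109581", "R-HSA-73857"]
pairs = find_orthologous_pairs(example_ids)
for left, right in pairs:
    print(left, "orthologous to", right)
```

Grouping by pathway code first keeps the work proportional to the number of matches rather than comparing every identifier against every other.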

The KEGG and Reactome matches are correct by definition but do not exist in any primary source, so they could be added directly. This PR doesn't (yet) contain the results because I'm not really sure whether they should be in scope for this repo. There are quite a few (30K+) to add.

@cthoyt changed the title from "Add scripts for generating ortholog relations" to "Calculate ortholog relationships in Reactome" on Aug 19, 2021

cthoyt commented Aug 19, 2021

@bgyori this has been cleaned up and now has a more specific scope. What do you think about adding these ~100K orthology mappings between Reactome pathways? Compared to the 5K we've done manually, this vastly changes the scale of the data here. Do you think this is a problem? Maybe we should have yet another category for large-scale calculated mappings that can be trusted automatically (as opposed to predicted mappings, which still need to be checked), so we can keep these separate.
