The issue of matching between the expanded GTF and transcript types #194

Open
biochristmas opened this issue May 27, 2024 · 2 comments

Comments

@biochristmas

Hi, I aligned the transcript to the reference genome and, based on read coverage, found two alternative splicing events. However, the GTF file expanded by IsoQuant shows only one transcript when viewed in IGV. Thank you!
[Image: IGV screenshot]

@andrewprzh
Collaborator

Dear @biochristmas

These two isoforms are quite hard to distinguish, since the second one is a substring of the first one.
IsoQuant takes into account that some reads can be truncated, and thus considers reads from a shorter isoform simply as truncated versions of the longer isoform. If the difference was on the 3' end and two isoforms had distinct polyA sites, there would be a higher chance of detecting both of them.
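To illustrate why a "substring" isoform is hard to tell apart from truncated reads, here is a minimal sketch of a containment check on exon coordinates. It is not IsoQuant's actual implementation; the function names `intron_chain` and `is_contained`, the exon representation, and the example coordinates are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: an isoform is a list of exons,
# each exon is (start, end), sorted by start, on the same strand.
Exon = Tuple[int, int]


def intron_chain(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return introns as (upstream_exon_end, downstream_exon_start) pairs."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def is_contained(short: List[Exon], long: List[Exon]) -> bool:
    """True if `short` could be explained as a truncated copy of `long`."""
    s_chain = intron_chain(short)
    l_chain = intron_chain(long)

    if not s_chain:
        # Single-exon case: it must fit entirely inside one exon of `long`.
        start, end = short[0]
        return any(ls <= start and end <= le for ls, le in long)

    n = len(s_chain)
    for i in range(len(l_chain) - n + 1):
        if l_chain[i:i + n] == s_chain:
            # All splice junctions match; the outer ends of `short` must not
            # extend beyond the corresponding exons of `long`.
            first_ok = short[0][0] >= long[i][0]
            last_ok = short[-1][1] <= long[i + n][1]
            return first_ok and last_ok
    return False


# Illustrative coordinates only (not taken from the issue).
long_isoform = [(100, 200), (300, 400), (500, 600), (700, 800)]
short_isoform = [(320, 400), (500, 600), (700, 780)]   # same junctions, shorter ends
skipping_isoform = [(100, 200), (300, 400), (700, 800)]  # skips one exon

print(is_contained(short_isoform, long_isoform))
print(is_contained(skipping_isoform, long_isoform))
```

In this sketch the shorter isoform shares every splice junction with the longer one, so reads from it look the same as truncated reads from the longer isoform. The exon-skipping isoform introduces a junction the longer one lacks, which is why that kind of difference is much easier to detect.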

We are working on improving the algorithms for correctly detecting 5' and 3' ends, but this case seems quite non-trivial. Treating such reads as evidence for a separate isoform could lead to a high number of false positives in other cases.

You may try using the --fl_data option, but I don't think it will make a difference in this case.

Best
Andrey

@biochristmas
Author

Thank you for your reply. I also tried the '--fl_data' parameter today, and the number of transcripts in the GTF file is the same as when not using '--fl_data'. There is indeed no difference.
