Odgi unchop performance for large graphs #584

Open
sivico26 opened this issue Jul 16, 2024 · 4 comments

@sivico26

Hello again,

I realized that although my issue shows up in smoothxg, it really concerns an odgi algorithm, so I am also posting it here for future reference.

@AndreaGuarracino
Member

Hi @sivico26, I can't make promises, but if you can share one graph, I could look at where the bottleneck might be!

@sivico26
Author

sivico26 commented Sep 4, 2024

Hi @AndreaGuarracino.

I can do that. Where should I send it? To your email at UTHSC?

By the way, the job just passed 2,200 hours. At this scale, odgi unchop is definitely the bottleneck of smoothxg, taking more than 80% of the time (and counting). If we find a way to address this, people working on crops who want to include wild ancestors in their pangenomes will surely appreciate it.

My cluster's admins speculated that odgi unchop is $O(n^2)$ (in the number of nodes, I imagine). Can you confirm whether this is the case? Knowing this would help us decide whether to wait for the job or to proceed with the input graph for our work.
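
One way to test the $O(n^2)$ guess empirically, without reading the odgi source, would be to time unchop on a few subgraphs of increasing node count and fit the slope of log(time) against log(nodes): a slope near 2 would be consistent with quadratic scaling, a slope near 1 with linear scaling. The sketch below is only an illustration of that fit. The function name `fit_scaling_exponent` is hypothetical, and the timings are synthetic values generated from a known quadratic, not measurements of odgi or smoothxg.

```python
import math


def fit_scaling_exponent(sizes, seconds):
    """Return the least-squares slope of log(seconds) versus log(sizes).

    If runtime behaves like c * n^k, this slope is an estimate of k.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example only: node counts and timings made up from t = 1e-6 * n^2.
# Real use would replace these with measured (node count, wall time) pairs.
sizes = [1_000, 2_000, 4_000, 8_000]
synthetic_seconds = [1e-6 * n ** 2 for n in sizes]

exponent = fit_scaling_exponent(sizes, synthetic_seconds)
print(f"estimated exponent: {exponent:.2f}")
```

With real measurements the estimate would be noisy, so several sizes spanning at least an order of magnitude would be needed before drawing any conclusion about the 2,200-hour job.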

@AndreaGuarracino
Member

AndreaGuarracino commented Sep 4, 2024 via email

@sivico26
Author

Hi @AndreaGuarracino,

My cluster's admins recently informed me that they need to shut down the server where my job is running. Thus, my job will be killed after running ~2600 hours.

Is there a way to resume smoothxg processing somewhere by copying the current temporary files? These are the files currently in the folder:

[sivico26@urga1 ~]$ ls /scratch/sivico26/job_6857522.cerit-pbs.cerit-sc.cz/results_uv/tmp/temp-27WbAw/ -lh 
total 108G
-rw-------. 1 sivico26 meta  98G jun 15 14:41 0LXmlE
-rw-------. 1 sivico26 meta 4,9G jun 10 14:38 E9L5T3
-rw-------. 1 sivico26 meta 4,9G jun 10 14:36 Gsu5sk
-rw-------. 1 sivico26 meta  12K jun 15 14:42 hAFKNe

What do you think? Would they be of any use?
Thank you in advance.
