-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
We need to speed up setMisc and getMisc #107
Comments
Hi Gregor, I noticed this problem. I worked on it quickly. Here is the code. I don't know if it is suitable for all uses.
Thank you for your EDx course "Breeding Programme Modelling with AlphaSimR". Have a nice day, Loup |
@LoupPiedfer , thanks for this solution! However, there seems to be some problem with "NULL"-s.
Having this,
Do you have any solution for this? |
Hi @janaobsteter, I did not correctly identify
Just one question: why did you manage It will radically change the way you deal with miscelaneous information. But it's faster and you can access information like
Have a nice day, Loup |
@LoupPiedfer , thanks! I'll let @gregorgorjanc reply, I am just pasting the results of your benchmarking here |
@LoupPiedfer yes, we are noticing major performance hit with the current design as indicated by @janaobsteter above. @gaynorr would you consider swapping the misc list order from ind-then-misc-items to misc-items-then-ind. I also think that this second order is not just faster, but also somewhat easier to think about for users - the current order requires users to be quite savvy with R. Happy to help with PRs to address this - should not be too much work. |
Also, if we move to misc-items-then-ind order, then missing value can simply be |
I am looking at required changes from ind-then-misc-items to misc-items-then-ind - they are very minimal, we can even remove
|
@gregorgorjanc, would you guys perhaps be better off using popMisc instead of misc to store your information? The point of misc being a list of length nInd is so that you can do subsetting and combining of populations without enforcing any rules about what is contained in the slot. AlphaSimR just applies the subsetting or combining operation to the list(s). This will work regardless of what is stored in the list. It also means that each individual doesn't have to have the same sort of information in this slot. When initially developing this structure, the use case under consideration was storing additional populations in the misc slot to represent mitochondria, chloroplasts, or sex chromosomes. If this structure gets changed, there has to be rules for subsetting and combining any data type that might be contained within the misc slot. This would force limiting what gets stored there or else the code will break. It also wouldn't work for the original intended use of this slot. The popMisc slot gives you the freedom to store a list of any length. However, this data doesn't persist after subsetting or combining populations. I guess the question is, are you guys using subsetting and/or combining with your populations? If not, the popMisc slot is clearly your best choice. |
We could probably add a function, like how the finalizePop function works, to combining and subsetting populations to handle the popMisc slot. It could default to the current functionality of wiping the slot, but off the ability for a user to add custom functionality. |
Never mind, this isn't as easy as I was thinking. I forgot that subsetting and combining occurs without the need for SimParam. Thus, there isn't a good way to open this up. |
Hmm, possibly. More below.
Yes, this is very beautiful in principle, but actually working with the ind * node structure is cumbersome because we usually need access to a node across individuals, hence
I have created a pull request with the change to node*ind list organisation. I have added test for simple vectors. Will see if the proposed functionality works with lists or even AlphaSimR Pop - for the later I think it will, likely also for a list!
It does, but as you say it doesn’t persist (for a good reason - population has changed, so miscellaneous information for the whole population has likely changed too), so this use case is different. |
In PR at #168 I have previously only worked with simple vectors in Now I have tried with a list inside as well In recent commits 193a136 and 21bf985 I show in tests how one can not work with Pop, list of Pop, or MultiPop in
|
I think having a vector, list, or MultiPop as options works. Having a list option would cover the current functionality. I'll look over the pull request a bit later. Should we remove |
Yes, the PR removes set/getMisc since node*ind is very straightforward to work with |
Flipping misc organisation to address #107
Everything should be in place now on the devel branch. |
setMisc and getMisc are very slow - we need to speed them up!
Any takers?
The text was updated successfully, but these errors were encountered: