Memory error when `store_window=True` in `pspec_run` #354

adeliegorce · 2022-04-20T14:28:18Z

The option in pspec_run that computes and stores (approximate) window functions leads to a memory overload.
For example, for two datasets combined into a UVPSpec object with uvp.Ntimes = 8052, uvp.Nblpairs = 63, uvp.Npols = 1 and one spectral window (164 frequency channels), the difference in memory usage between runs with and without window functions is about 25 GB.
Note: uvp.window_function_array is defined as an array of float64.
First spotted in the validation notebook hera-validation/test-series/0/test-0.2.0.ipynb.


nkern · 2022-04-20T14:46:13Z

This is more or less insurmountable if one wants to compute the window function for all baselines and times. For many estimators, however, the wf doesn't change along these axes, so in the H1C analysis, for example, we run the pipeline with store_window=False, do our bl and time averaging, then go back and compute the wf once.

I think keeping the current capability to compute and store the wf for all bl-times is needed for future estimators that may not have time- and bl-independent wfs, but if you want to take this problem on, we could write a method that identifies time- and blpair-independent estimators and stores just one or a few wfs in memory.

adeliegorce · 2022-04-20T14:48:27Z

Okay, I see. I guess I was thinking about the second option you mention, where you do not store the same array a thousand times, but it might be difficult to make general enough. Also, do you think it needs to be float64 rather than float32?
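To put rough numbers on the dtype question, here is a minimal sketch that estimates the size of a dense window function array. It assumes a layout of (Nblpairts, Ndlys, Ndlys, Npols), which is an assumption for illustration and may not match the actual array in uvpspec; the helper name `window_array_nbytes` is hypothetical, and the sizes are illustrative rather than those of the dataset in this issue.

```python
import numpy as np


def window_array_nbytes(n_blpairts, n_dlys, n_pols, dtype=np.float64):
    """Bytes needed for a dense (n_blpairts, n_dlys, n_dlys, n_pols) array.

    The array layout is an assumption made for this sketch.
    """
    n_elements = n_blpairts * n_dlys * n_dlys * n_pols
    return n_elements * np.dtype(dtype).itemsize


# Illustrative sizes only: 10 blpairs x 1000 times, 64 delays, 1 pol.
n_blpairts, n_dlys, n_pols = 10 * 1000, 64, 1

bytes64 = window_array_nbytes(n_blpairts, n_dlys, n_pols, np.float64)
bytes32 = window_array_nbytes(n_blpairts, n_dlys, n_pols, np.float32)

print(f"float64: {bytes64 / 1e9:.2f} GB")
print(f"float32: {bytes32 / 1e9:.2f} GB")
```

Switching to float32 halves the footprint, but the size still grows linearly with the number of blpair-times, so it does not remove the underlying scaling problem.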

nkern · 2022-04-20T15:07:48Z

Yeah I think it should be fairly straightforward to do this.

Two things can be done to make it easier to store blpair and time-independent window funcs.

1. For a time-independent WF, shrink the "time" axis of the wf to 1 in memory (i.e. in uvp.window_function_array). When you actually request the window function via uvp.get_window_func, it is automatically inflated to the full Ntimes for you. This could be controlled by a uvp.time_dep_wf=False flag, set by the user in pspec_run.
2. For a blpair-independent WF, this could be as simple as storing the wf for a single blpair; the get_window_func method then assumes that only 1 blpair is stored and modifies the blpairts indexing array to accommodate that. This would require that uvpspec hold only a single WF per object, which could similarly be controlled by a uvp.blpair_dep_wf=False flag.

These changes might require updates to the check function in the limit of the above kwargs being False.
lmk if all of this makes sense. no pressure to actually do this if you are not able to take it on--you can always do the hack I did for the H1C analysis, but if you'd like to see this handled more elegantly in uvpspec then this is certainly one way to go. lmk, thanks!
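The "store once, inflate on access" idea above can be sketched with plain NumPy. This is not hera_pspec code: the function names `compress_window` and `inflate_window` are hypothetical, and the (Nblpairts, Ndlys, Ndlys, Npols) layout is an assumption. The sketch covers the simplest case, where the wf is independent of both time and blpair, so a single window is kept.

```python
import numpy as np


def compress_window(wf_full):
    """Keep a single window, assuming it is identical for every blpair-time.

    wf_full has shape (n_blpairts, n_dlys, n_dlys, n_pols) (assumed layout).
    Raises if the windows actually differ along the first axis.
    """
    if not np.allclose(wf_full, wf_full[:1]):
        raise ValueError("window functions vary along the blpair-time axis")
    return wf_full[:1].copy()  # shape (1, n_dlys, n_dlys, n_pols)


def inflate_window(wf_compact, n_blpairts):
    """Return a read-only view with the full leading axis, without copying."""
    full_shape = (n_blpairts,) + wf_compact.shape[1:]
    return np.broadcast_to(wf_compact, full_shape)


# Build a small example in which every blpair-time shares the same window.
rng = np.random.default_rng(0)
n_blpairts, n_dlys, n_pols = 50, 8, 1
single = rng.random((1, n_dlys, n_dlys, n_pols))
wf_full = np.repeat(single, n_blpairts, axis=0)

wf_compact = compress_window(wf_full)
wf_view = inflate_window(wf_compact, n_blpairts)

print(wf_full.nbytes, wf_compact.nbytes)  # stored bytes before and after
```

Because `np.broadcast_to` returns a view, the inflated array shares memory with the single stored window; a caller that needs to modify it would have to take an explicit copy first.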

adeliegorce · 2022-04-20T15:10:43Z

Thanks for the suggestions! I'll keep it in the back of my head and do it eventually :)

adeliegorce added enhancement optimization labels Apr 20, 2022

adeliegorce self-assigned this Apr 20, 2022
