While profiling the LST-binner over 40 days of H6C data, I found that of the ~20k seconds of total runtime, almost 5k (nearly 1.5 hours!) were spent in the get_blt_slices function. That seems largely avoidable. For reference, here's the line-profiler output for the function:
Total time: 4757.35 s
File: /lustre/aoc/projects/hera/heramgr/anaconda3/envs/h6c/lib/python3.10/site-packages/hera_cal/io.py
Function: get_blt_slices at line 422
Line # Hits Time Per Hit % Time Line Contents
==============================================================
422 def get_blt_slices(uvo, tried_to_reorder=False):
423 '''For a pyuvdata-style UV object, get the mapping from antenna pair to blt slice.
424 If the UV object does not have regular spacing of baselines in its baseline-times,
425 this function will try to reorder it using UVData.reorder_blts() to see if that helps.
426
427 Arguments:
428 uvo: a "UV-Object" like UVData or baseline-type UVFlag. Blts may get re-ordered internally.
429 tried_to_reorder: used internally to prevent infinite recursion
430
431 Returns:
432 blt_slices: dictionary mapping antenna pair tuples to baseline-time slice objects
433 '''
434 4799 10367.0 2.2 0.0 blt_slices = {}
435 42100895 230656109.0 5.5 4.8 for ant1, ant2 in uvo.get_antpairs():
436 42096096 3186135020.0 75.7 67.0 indices = uvo.antpair2ind(ant1, ant2)
437 42096096 77943582.0 1.9 1.6 if len(indices) == 1: # only one blt matches
438 617976 3585403.0 5.8 0.1 blt_slices[(ant1, ant2)] = slice(indices[0], indices[0] + 1, uvo.Nblts)
439 41478120 986563753.0 23.8 20.7 elif not (len(set(np.ediff1d(indices))) == 1): # checks if the consecutive differences are all the same
440 if not tried_to_reorder:
441 uvo.reorder_blts(order='time')
442 return get_blt_slices(uvo, tried_to_reorder=True)
443 else:
444 raise NotImplementedError('UVData objects with non-regular spacing of '
445 'baselines in its baseline-times are not supported.')
446 else:
447 41478120 271685967.0 6.6 5.7 blt_slices[(ant1, ant2)] = slice(indices[0], indices[-1] + 1, indices[1] - indices[0])
448 4799 768416.0 160.1 0.0 return blt_slices
A lot of the time is taken up finding the indices for each antpair: antpair2ind searches the entire blt axis once per antenna pair. I get that this is sometimes necessary, because in general a UVData object can have its blts in any order. But for HERA data it's unnecessary, because blts always go time-first, antenna-second, so each pair's indices form a predictable strided slice. If we can find a way to quickly determine (or maybe allow callers to assume) that this ordering holds, it would be a significant speedup.
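To make that concrete, here's a minimal sketch of what such a fast path could look like, under the assumption that the blt axis is "rectangular" with time varying slowest (every integration contains the same baselines in the same order). The function name and the rectangularity check are mine, not existing hera_cal or pyuvdata API; the attributes it reads (Nblts, Ntimes, ant_1_array, ant_2_array) are standard UVData attributes.

```python
import numpy as np

def get_blt_slices_time_first(uvo):
    '''Hypothetical fast path for get_blt_slices: assumes blts are ordered
    time-first, antenna-second, with the same baselines in every integration.'''
    nbls = uvo.Nblts // uvo.Ntimes
    # The first integration defines the per-time baseline ordering.
    first_ant1s = uvo.ant_1_array[:nbls]
    first_ant2s = uvo.ant_2_array[:nbls]
    # Cheap, fully vectorized O(Nblts) check that the assumption holds:
    # the baseline pattern must repeat verbatim for every integration.
    if not (np.array_equal(uvo.ant_1_array, np.tile(first_ant1s, uvo.Ntimes))
            and np.array_equal(uvo.ant_2_array, np.tile(first_ant2s, uvo.Ntimes))):
        raise ValueError('blt axis is not rectangular/time-first; '
                         'fall back to the general antpair2ind search.')
    # With blt index = t * nbls + b, baseline b occupies a strided slice.
    return {(int(a1), int(a2)): slice(b, uvo.Nblts, nbls)
            for b, (a1, a2) in enumerate(zip(first_ant1s, first_ant2s))}
```

This does one vectorized pass over the blt axis instead of the ~42M antpair2ind calls seen in the profile above (each of which scans the whole axis), i.e. O(Nblts) rather than O(Npairs × Nblts). If the check fails, the existing get_blt_slices could serve as a fallback, so irregularly ordered objects would still work.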