Cut labels to fit the audio data
Had to cut the labels, probably around two at the end of each string,
to make them fit the audio data. The exact cause of the length mismatch is unknown.

related to: #7
anthonio9 committed Feb 5, 2024
1 parent a204523 commit bfaa0fb
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion penn/data/preprocess/core.py
@@ -274,7 +274,7 @@ def gset():

# FOR sampling rates like 11025, 22050, 44100, resampling isn't necessary
if GSET_SAMPLE_RATE / penn.SAMPLE_RATE % 1 != 0:
-printf("Resampling to penn.SAMPLE_RATE")
+print("Resampling to penn.SAMPLE_RATE")

pitch_list = np.vsplit(pitch, pitch.shape[0])
pitch_list_final = []
@@ -316,6 +316,13 @@ def gset():

if voiced.shape[0] == 1:
voiced = voiced[0, :]
+else:
+overload = np.abs(audio.shape[-1] // penn.HOPSIZE - pitch.shape[-1])
+# this is a bad, ugly hack, but well, it is what it is, has to be enabled if resampling isn't enabled
+pitch = pitch[..., :-overload]
+voiced = voiced[..., :-overload]
+
+assert pitch.shape[-1] == audio.shape[-1] // penn.HOPSIZE

# Save to cache
np.save(output_directory / f'{stem}-pitch.npy', pitch)
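
The trimming in this commit slices with `[..., :-overload]`. That returns an empty array when `overload` is 0, and because of `np.abs` it also trims when the labels are already shorter than the audio. A possible alternative, shown here only as a minimal sketch, is to slice to the target frame count directly. The helper name `trim_labels_to_audio` and its `hopsize` parameter are hypothetical (standing in for `penn.HOPSIZE`) and are not part of the repository.

```python
import numpy as np


def trim_labels_to_audio(audio, pitch, voiced, hopsize):
    """Trim frame-level labels to the number of frames implied by the audio.

    Hypothetical helper: `hopsize` stands in for penn.HOPSIZE.
    """
    # Number of whole hops that fit into the audio
    target = audio.shape[-1] // hopsize

    # Fail loudly if the labels are too short instead of trimming them further
    if pitch.shape[-1] < target or voiced.shape[-1] < target:
        raise ValueError(
            f'labels have {pitch.shape[-1]} frames, audio needs {target}')

    # Slicing to a positive end index is a no-op when the lengths already match
    return pitch[..., :target], voiced[..., :target]


# Six strings with two surplus label frames each
audio = np.zeros((1, 1000))
pitch = np.zeros((6, 12))
voiced = np.zeros((6, 12), dtype=bool)
pitch_trimmed, voiced_trimmed = trim_labels_to_audio(audio, pitch, voiced, 100)

# Labels that already match are returned unchanged
pitch_same, voiced_same = trim_labels_to_audio(
    audio, pitch_trimmed, voiced_trimmed, 100)
```

With this form the closing `assert` in the commit would hold whenever the function returns, since both outputs are cut to exactly `audio.shape[-1] // hopsize` frames.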