object_id computed incorrectly when duration set to single value #384

adelavega · 2020-03-04T21:51:25Z

ExtractorResult.duration can be set to a single value (e.g. 0.1) when this value represents durations for all data points. This should be broadcast to all values, and is when creating df.

However, when computing object_id this is not done, and it causes the index that is sued to groupby the data to be a single value, followed by all NAs.

https://github.com/tyarkoni/pliers/blob/7e92ad50b3996a0df1f3dfcc91c1f006597b5728/pliers/extractors/base.py#L154

This results in object_ids like so:

0, 0, 1, 2, 3, 4, 5, 6

When it should be all 0s.

Broadcasting prior to making the pd.Series would fix this issue.

The text was updated successfully, but these errors were encountered:

adelavega · 2020-03-17T21:37:44Z

@rbroc do you remember if we fixed this? I feel like we did but I'm not seeing a PR for it

rbroc · 2020-03-19T11:01:38Z

@rbroc do you remember if we fixed this? I feel like we did but I'm not seeing a PR for it

We didn't. I fixed it within the AudiosetLabelExtractor (which is where we observed the anomaly) by forcing durations to be a vector of length len(onsets).

I guess the question is whether we want to add explicit broadcasting at https://github.com/tyarkoni/pliers/blob/7e92ad50b3996a0df1f3dfcc91c1f006597b5728/pliers/extractors/base.py#L154

I feel like that might be a bit more convoluted than just sticking to the rule that ExtractorResults should have onset and duration as lists of the same length (with one value per row of the output). So far, the overwhelming majority of extractors do that already.

Rather than changing the way object_id is created, we might just want to add a line checking len(onsets) == len(durations) and triggering an Error if this is not the case.

adelavega added the bug label Mar 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

object_id computed incorrectly when duration set to single value #384

object_id computed incorrectly when duration set to single value #384

adelavega commented Mar 4, 2020

adelavega commented Mar 17, 2020

rbroc commented Mar 19, 2020

object_id computed incorrectly when duration set to single value #384

object_id computed incorrectly when duration set to single value #384

Comments

adelavega commented Mar 4, 2020

adelavega commented Mar 17, 2020

rbroc commented Mar 19, 2020