fix: Struct filter by index #18778

Status: Open (5 commits to merge into base branch main)

Conversation

barak1412
Contributor

Fixes #18732.

github-actions bot added the fix, python, and rust labels on Sep 16, 2024
barak1412 changed the title from "fix: Unimplemented struct filter by index" to "fix: Implemented struct filter by index" on Sep 16, 2024
barak1412 changed the title from "fix: Implemented struct filter by index" to "fix: Implement struct filter by index" on Sep 16, 2024
@cmdlineluser
Contributor

Thanks @barak1412

I think negative indexing is normally allowed:

>>> df.with_columns(a = pl.col.foo.struct[-1])
# shape: (2, 2)
# ┌───────────┬─────┐
# │ foo       ┆ a   │
# │ ---       ┆ --- │
# │ struct[1] ┆ i64 │
# ╞═══════════╪═════╡
# │ {1}       ┆ 1   │
# │ {2}       ┆ 2   │
# └───────────┴─────┘

The logic for that seems to be here:

FieldByIndex(index) => mapper.try_map_field(|field| {
    let (index, _) = slice_offsets(*index, 0, mapper.get_fields_lens());
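
The negative-index handling mentioned above can be sketched in plain Python. This is a minimal illustration, assuming the usual convention that `-1` means the last field. `resolve_field_index` is a hypothetical helper, not Polars' `slice_offsets`, whose exact out-of-range behavior is not shown in this thread.

```python
def resolve_field_index(index: int, n_fields: int) -> int:
    """Map a possibly negative struct field index to a non-negative one.

    Hypothetical helper for illustration only; it raises on out-of-range
    input rather than clamping.
    """
    resolved = index + n_fields if index < 0 else index
    if not 0 <= resolved < n_fields:
        raise IndexError(f"field index {index} out of range for {n_fields} fields")
    return resolved


fields = ["a", "b", "c"]
print(fields[resolve_field_index(-1, len(fields))])  # last field
print(fields[resolve_field_index(0, len(fields))])   # first field
```

Whether out-of-range indices should raise or clamp is a design choice; this sketch raises so that a bad index is visible immediately.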


codecov bot commented Sep 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.86%. Comparing base (6561eba) to head (28658bb).
Report is 17 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18778      +/-   ##
==========================================
+ Coverage   79.84%   79.86%   +0.02%     
==========================================
  Files        1518     1518              
  Lines      205576   205637      +61     
  Branches     2892     2893       +1     
==========================================
+ Hits       164132   164238     +106     
+ Misses      40896    40851      -45     
  Partials      548      548              


@barak1412
Contributor Author

barak1412 commented Sep 16, 2024

@cmdlineluser Sure, I will look into it.

@barak1412
Contributor Author

@cmdlineluser Fixed

@ritchie46
Member

I appreciate the fix, but this isn't how it should be fixed. The expression should be replaced by a field-by-name access before execution, hence the panic. We should never access data by index, as that prevents us from removing fields we don't need.
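
One way to see the concern about index access: if an optimizer drops unused struct fields, positions shift while names stay stable. The sketch below uses plain Python dicts as stand-ins for struct rows. `prune_fields` is a hypothetical helper and does not reflect Polars internals.

```python
def prune_fields(row: dict, keep: set) -> dict:
    """Keep only the struct fields an expression needs (order preserved)."""
    return {name: value for name, value in row.items() if name in keep}


row = {"a": 1, "b": 2, "c": 3}

# Access by name: the needed field set is known up front, so pruning is safe.
pruned = prune_fields(row, keep={"c"})
by_name = pruned["c"]

# Access by position: index 2 meant "c" in the original layout,
# but the pruned struct has a single field, so index 2 is no longer valid.
original_by_index = list(row.values())[2]
pruned_len = len(pruned)

print(by_name, original_by_index, pruned_len)
```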

@barak1412
Contributor Author

barak1412 commented Sep 17, 2024

@ritchie46 Sure, I am glad to learn

So the right way is, given an index, to fetch the matching field and then use its name to fetch the data at the expression level?

@ritchie46
Member

ritchie46 commented Sep 17, 2024

> So the right way is, given an index, to fetch the matching field and then use its name to fetch the data at the expression level?

Yes, but we already have that logic; it seems that we don't hit it in the filter.

Note that select works as expected:

import polars as pl
df = pl.DataFrame({"foo": [{"a": 1}, {"a": 2}]})
df.select(pl.col.foo.struct[0] == 1)

I think you should look into the DSL-to-IR conversion and see what we do differently in Select vs. Filter.
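
The suggestion above amounts to resolving the index to a field name while lowering the expression, in every context where it can appear. A toy sketch, assuming a schema that maps each struct column to its ordered field names; the node classes and the `lower` function are hypothetical and are not the Polars DSL or IR types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldByIndex:
    column: str
    index: int


@dataclass(frozen=True)
class FieldByName:
    column: str
    name: str


def lower(expr, schema: dict):
    """Replace index-based struct access with name-based access.

    `schema` maps a struct column name to its ordered list of field names.
    Hypothetical lowering step for illustration only.
    """
    if isinstance(expr, FieldByIndex):
        names = schema[expr.column]
        # Python list indexing already treats negative indices as "from the end".
        return FieldByName(expr.column, names[expr.index])
    return expr


schema = {"foo": ["a"]}

# The same lowering is applied whether the expression appears in a
# select or a filter, so neither context ever sees FieldByIndex.
select_expr = lower(FieldByIndex("foo", 0), schema)
filter_expr = lower(FieldByIndex("foo", -1), schema)
print(select_expr, filter_expr)
```

Routing both contexts through one lowering function is the point of the sketch: a context that skips it would still carry the index-based node into execution.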

barak1412 changed the title from "fix: Implement struct filter by index" to "fix: Struct filter by index" on Sep 18, 2024
@barak1412
Copy link
Contributor Author

@ritchie46

Thanks for the guidance; this should be the right fix now.

Labels
fix (Bug fix), python (Related to Python Polars), rust (Related to Rust Polars)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.struct[idx] inside df.filter() PanicException / not implemented
3 participants