Introduce ToBackend expression #1115

Merged: 5 commits, Aug 13, 2024


rjzamora
Member

Introduces a simple `ToBackend` expression, along with a `ToPandasBackend` implementation.

The current approach of calling `map_partitions` in `PandasBackendEntrypoint` effectively blocks query-planning optimizations whenever data is moved between GPU and CPU. The obvious fix is to represent the move with a simple `Expr` class instead.
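
To illustrate why an opaque `map_partitions` call can block optimizations while a dedicated expression need not, here is a toy model in plain Python. Every class below and the `optimize` function are hypothetical stand-ins written for this sketch; they are not dask-expr's actual classes or optimizer, and only the names `ToBackend` and `MapPartitions` echo the PR.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


# Toy model only: hypothetical classes, NOT dask-expr's real API.
@dataclass(frozen=True)
class Expr:
    """Base node of a tiny expression tree."""


@dataclass(frozen=True)
class ReadParquet(Expr):
    path: str
    filters: Tuple[str, ...] = ()  # predicates applied at read time


@dataclass(frozen=True)
class MapPartitions(Expr):
    child: Expr
    func: Callable  # arbitrary user function: opaque to the optimizer


@dataclass(frozen=True)
class ToBackend(Expr):
    child: Expr
    backend: str  # e.g. "pandas" or "cudf"


@dataclass(frozen=True)
class Filter(Expr):
    child: Expr
    predicate: str


def optimize(expr: Expr) -> Expr:
    """Push Filter nodes down toward ReadParquet where that is known to be safe."""
    if isinstance(expr, Filter):
        child = optimize(expr.child)
        if isinstance(child, ToBackend):
            # A backend move does not change which rows match,
            # so the filter can run before the move.
            pushed = optimize(Filter(child.child, expr.predicate))
            return ToBackend(pushed, child.backend)
        if isinstance(child, ReadParquet):
            # Fold the predicate into the read itself.
            return ReadParquet(child.path, child.filters + (expr.predicate,))
        # Anything else (including MapPartitions) is opaque: stop here.
        return Filter(child, expr.predicate)
    if isinstance(expr, ToBackend):
        return ToBackend(optimize(expr.child), expr.backend)
    if isinstance(expr, MapPartitions):
        return MapPartitions(optimize(expr.child), expr.func)
    return expr


def convert(partition):
    """Placeholder for a per-partition backend conversion."""
    return partition


# Same logical query, two representations of the backend move.
opaque = optimize(Filter(MapPartitions(ReadParquet("data.parquet"), convert), "x > 0"))
typed = optimize(Filter(ToBackend(ReadParquet("data.parquet"), "pandas"), "x > 0"))
```

In this sketch, `opaque` keeps its `Filter` above the `MapPartitions` node because the optimizer cannot reason about an arbitrary function, whereas `typed` ends up with the predicate folded into `ReadParquet`. The real optimizer handles far more cases; the point is only that a typed node gives it something to reason about.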

@rjzamora
Member Author

@phofl - This PR is (hopefully) much less controversial than the resource barrier idea in #1116 :)

Review thread on dask_expr/_backends.py (outdated, resolved)
Review thread on dask_expr/_expr.py (outdated, resolved)
@phofl phofl merged commit 37a5116 into dask:main Aug 13, 2024
6 checks passed
@phofl
Collaborator

phofl commented Aug 13, 2024

thx

@rjzamora rjzamora deleted the to-backend-expr branch August 13, 2024 15:19
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Aug 19, 2024
Adds a `ToCudfBackend` expression for "pandas" to "cudf" conversion, preventing `to_backend("cudf")` operations from blocking useful optimizations like predicate pushdown.

This is the dask-cudf component of dask/dask-expr#1115

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #16573
Labels
enhancement New feature or request