Automatically Set Up Python Environments for R Packages #1671

Open
t-kalinowski opened this issue Sep 17, 2024 · 0 comments

There is a persistent need for R packages that depend on reticulate to set up a Python environment automatically. reticulate::configure_environment() was meant to enable this, but some of its design decisions caused problems in practice, and it is now soft-deprecated. The need remains, and reticulate should aim to support it, either by modifying reticulate::configure_environment() or by designing a new approach.

Issues with reticulate::configure_environment():

  1. Conda as the default environment: Using Conda by default often caused binary incompatibilities for R users. For example, this issue is consistently the most visited via Google searches. New solutions should default to virtual environments instead.

  2. Installing into a shared environment: Automatically installing all package dependencies into a shared Python environment caused conflicts when two R packages required incompatible Python packages. This led to a frustrating failure loop where each triggered installation could break a previously functioning setup. Additionally, no output was shown to indicate what commands were executed or how to disable the "automagic" behavior. Users had to discover the RETICULATE_AUTOCONFIGURE=FALSE environment variable on their own.

Proposed solutions:

  1. Isolated environments: R packages should install dependencies into a stand-alone Python environment specific to that R package. To combine the dependencies of multiple packages, users could opt-in with a function like configure_environment(packages = c("pkg1", "pkg2")).

  2. More visible output: Informative messages should clearly indicate which commands are being run, helping users diagnose issues.

  3. Automatic installation messages: Whenever an automatic install is triggered, a message should inform users how to disable this behavior, e.g., "You can disable automatic install by setting Sys.setenv(RETICULATE_AUTOCONFIGURE=FALSE) before loading package foo."

  4. Selective automatic installation: Only trigger automatic installation if the newly installed environment would actually be used in the current R session. For instance, do not trigger an installation if any of conditions 1-7 in reticulate's order of Python discovery holds, since in those cases reticulate would select a different Python.

  5. Dynamic Python package dependencies and post-install steps: The current implementation of configure_environment() requires Python package names to be provided as a static list, which proved too rigid in practice. For example, the TensorFlow R package requires different Python packages depending on the platform (tensorflow, tensorflow-gpu, or tensorflow-metal on macOS). Additionally, after installing the Python package, TensorFlow needs to set up symlinks for GPU functionality (as described here), which the current configure_environment() does not support.
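To make proposals 1, 2, 3, and 5 concrete, here is a minimal sketch in R. All helper names (env_name_for, python_requirements_for, autoconfigure_enabled, configure_package_env) are hypothetical and not part of reticulate, the platform-to-package mapping is purely illustrative, and the way RETICULATE_AUTOCONFIGURE is parsed is an assumption. Only reticulate::virtualenv_create() and reticulate::virtualenv_install() are existing functions, and they run only when dry_run = FALSE.

```r
# Hypothetical helpers for illustration; none of these exist in reticulate.

# Proposal 1: one stand-alone environment per R package.
env_name_for <- function(pkg) paste0("r-", pkg)

# Proposal 5: resolve Python requirements at install time instead of from a
# static list. The mapping below is an illustrative assumption only.
python_requirements_for <- function(sysname = Sys.info()[["sysname"]],
                                    machine = Sys.info()[["machine"]]) {
  reqs <- "tensorflow"
  if (identical(sysname, "Darwin") && identical(machine, "arm64")) {
    reqs <- c(reqs, "tensorflow-metal")
  }
  reqs
}

# Assumption: any value other than "FALSE" (case-insensitive) leaves
# automatic configuration enabled.
autoconfigure_enabled <- function() {
  val <- Sys.getenv("RETICULATE_AUTOCONFIGURE", unset = "TRUE")
  !identical(toupper(val), "FALSE")
}

configure_package_env <- function(pkg,
                                  requirements = python_requirements_for(),
                                  dry_run = TRUE) {
  if (!autoconfigure_enabled()) {
    return(invisible(NULL))
  }
  envname <- env_name_for(pkg)

  # Proposal 2: say what is about to happen.
  message("Installing into virtual environment '", envname, "': ",
          paste(requirements, collapse = ", "))
  # Proposal 3: say how to opt out.
  message("You can disable automatic install by setting ",
          "Sys.setenv(RETICULATE_AUTOCONFIGURE = FALSE) before loading ",
          "package ", pkg, ".")

  if (!dry_run) {
    # Virtual environment rather than Conda, per the first issue listed above.
    reticulate::virtualenv_create(envname)
    reticulate::virtualenv_install(envname, packages = requirements)
  }
  invisible(list(envname = envname, requirements = requirements))
}
```

Calling configure_package_env("tensorflow") with the default dry_run = TRUE only prints the two messages and returns the planned environment name and requirements, which keeps the sketch safe to try without touching any Python installation. A post-install hook (for example, the GPU symlink step) could run after virtualenv_install(), but its shape is left open here.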

Ideal solution:

An ideal implementation could specify this setup in an R package’s DESCRIPTION file, with flexible syntax options like:

Package: tensorflow
Type: Package
...
Config/reticulate/r-command: install_tensorflow(envname = "r-tensorflow")

or

Config/reticulate/pip-requirements: tensorflow==2.16.*, keras>=3

This would allow for dynamic post-install actions, like configuring symlinks for GPU support, and future syntax options could be added based on community feedback.
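As a rough sketch of how such fields might be consumed, the base R function read.dcf() can already read a DESCRIPTION file. The helper read_reticulate_config() below is hypothetical, the field names are the ones proposed above and are not currently recognized by reticulate, and splitting the requirements on commas is an assumption about the eventual syntax.

```r
# Hypothetical reader for the proposed Config/reticulate/* fields.
read_reticulate_config <- function(description_file) {
  fields <- c("Package",
              "Config/reticulate/pip-requirements",
              "Config/reticulate/r-command")
  # read.dcf() returns a character matrix; absent fields come back as NA.
  dcf <- read.dcf(description_file, fields = fields)
  reqs <- unname(dcf[1, "Config/reticulate/pip-requirements"])

  list(
    package = unname(dcf[1, "Package"]),
    # Assumption: requirements are comma-separated on a single field.
    pip_requirements = if (is.na(reqs)) {
      character()
    } else {
      trimws(strsplit(reqs, ",", fixed = TRUE)[[1]])
    },
    # Left as an unevaluated string; NA when the field is absent.
    r_command = unname(dcf[1, "Config/reticulate/r-command"])
  )
}

# Usage with a throwaway DESCRIPTION file.
desc <- tempfile()
writeLines(c(
  "Package: tensorflow",
  "Type: Package",
  "Config/reticulate/pip-requirements: tensorflow==2.16.*, keras>=3"
), desc)
cfg <- read_reticulate_config(desc)
```

The r-command value is deliberately returned as a string rather than evaluated, since deciding when and how to run package-supplied code is one of the open design questions.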
