Add SapientML to automl benchmark #630

Open
wants to merge 1 commit into
base: master

Conversation

kimusaku

SapientML is an AutoML technology that can learn from a corpus of existing datasets and their human-written pipelines, and efficiently generate a high-quality pipeline for a predictive task on a new dataset.

Collaborator

@PGijsbers PGijsbers left a comment

Thanks for your contributions! I haven't had time to try this out yet, but I do already have a couple questions and suggested changes based on the PR. Please have a look at them.


# Sapientml
output_dir = config.output_dir + "/" + "outputs" + "/" + config.name + "/" + str(config.fold)
predictor = SapientML([target_col], task_type="classification" if is_classification else "regression")

From the abstract, it seems that meta-learning is involved. Are there datasets in the meta-learning corpus that are also in the AutoML benchmark? If so, is there a way to "turn off" the inclusion of that data in the meta-model for individual evaluations (e.g., to not use meta-information learned from the Santander dataset while evaluating on the Santander dataset)?

Comment on lines +2 to +3
openml
boto3==1.26.98

Haven't tried it yet, but it looks like the exec file does not depend on these dependencies. What are they for?

@@ -0,0 +1,3 @@
sapientml

Please install the framework through the setup.sh script. It allows people to specify versions, source, and so on.

@@ -0,0 +1,8 @@
#!/usr/bin/env bash

Please update the script so you can install both from source (as latest) and from PyPI (as stable or with a specified version). See, for example, GAMA's script: https://github.com/openml/automlbenchmark/blob/master/frameworks/GAMA/setup.sh
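
To illustrate the kind of version handling requested here, a minimal sketch follows. It is not GAMA's script and not part of this PR: the function name `resolve_pip_target`, the default package name, and the default repository URL are all assumptions made for the example. It only maps a version argument (`stable`, a version number, `latest`, or a branch/tag name) to a requirement string that pip understands, and it installs from source via a `git+` URL rather than cloning the repository first.

```shell
#!/usr/bin/env bash
# Sketch only: map a framework version argument to a pip requirement spec.
# The function name and the default package/repository values are assumptions
# for illustration; they are not taken from the PR or from the benchmark.
resolve_pip_target() {
    local version="${1:-stable}"
    local pkg="${2:-sapientml}"
    local repo="${3:-https://github.com/sapientml/sapientml}"

    case "$version" in
        stable)
            # Newest release published on PyPI.
            echo "$pkg"
            ;;
        [0-9]*)
            # A pinned PyPI release, e.g. 0.4.0.
            echo "${pkg}==${version}"
            ;;
        latest)
            # Install from source, using the repository's default branch.
            echo "git+${repo}"
            ;;
        *)
            # Any other value is treated as a branch or tag name.
            echo "git+${repo}@${version}"
            ;;
    esac
}
```

A setup script could then pass the result to pip, for example `pip install --no-cache-dir -U "$(resolve_pip_target "$VERSION")"`, where `VERSION` is assumed to hold the first argument given to the script.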

@@ -102,7 +102,7 @@ openml: # configuration namespace for openML.

versions: # configuration namespace for versions enforcement (libraries versions are usually enforced in requirements.txt for the app and for each framework).
pip:
-  python: 3.9 # the Python minor version that will be used by the application in containers and cloud instances, also used as a based version for virtual environments created for each framework.
+  python: 3.11 # the Python minor version that will be used by the application in containers and cloud instances, also used as a based version for virtual environments created for each framework.

Is the framework not 3.9 compatible? Changing this number here will affect all frameworks. While we will raise this over time (and also plan to allow framework-specific definitions for this), we can't currently bump this without ensuring compatibility with all other frameworks.

@PGijsbers added the "framework add" label (for issues with a framework to be added) on Aug 30, 2024
2 participants