Allowing custom task for WebArena #66

Open
xhluca opened this issue Jun 10, 2024 · 0 comments


xhluca commented Jun 10, 2024

Right now, the task only allows selecting from the built-in WebArena tasks by passing a task_id or an intent_template_id to __init__:

def __init__(
    self,
    seed: int,
    task_id: Optional[int] = None,
    intent_template_id: Optional[int] = None,
    with_na_hint: bool = False,
    with_homepage_hint: bool = False,
) -> None:
    super().__init__(seed)

    # task properties, will be used to set up the browsergym environment
    self.viewport = {"width": 1280, "height": 720}
    self.slow_mo = 1000  # ms
    self.timeout = 10000  # ms

    self.webarena_instance = WebArenaInstance()
    self.config_file: str = None
    self.with_na_hint = with_na_hint
    self.with_homepage_hint = with_homepage_hint

    # one and only one of task id and template id must be provided
    if (task_id is None) == (intent_template_id is None):
        raise ValueError(
            f"One and only one of 'task_id' and 'intent_template_id' must be provided (task_id={task_id}, intent_template_id={intent_template_id})."
        )

    # read the list of all webarena task configs
    import webarena

    all_configs_str = importlib.resources.files(webarena).joinpath("test.raw.json").read_text()

    # substitute URLs
    for pattern, url_key in {
        "__GITLAB__": "gitlab",
        "__REDDIT__": "reddit",
        "__SHOPPING__": "shopping",
        "__SHOPPING_ADMIN__": "shopping_admin",
        "__WIKIPEDIA__": "wikipedia",
        "__MAP__": "map",
    }.items():
        all_configs_str = all_configs_str.replace(pattern, self.webarena_instance.urls[url_key])

    # load all task configs to JSON
    all_configs = json.loads(all_configs_str)

    # keep only the desired task configs
    if intent_template_id is not None:
        task_configs = [
            conf for conf in all_configs if conf["intent_template_id"] == intent_template_id
        ]
        if not task_configs:
            raise ValueError(
                f"Could not find any task config with intent_template_id={intent_template_id}."
            )

    elif task_id is not None:
        task_configs = [conf for conf in all_configs if conf["task_id"] == task_id]
        if not task_configs:
            raise ValueError(
                # report the task_id that was searched for, not the template id
                f"Could not find any task config with task_id={task_id}."
            )

    self.task_configs = task_configs

We could add another parameter, task_configs, of type List[Union[dict, str]]. For example:

GenericWebArenaTask(task_configs=[{...}, {...}], ...)

where each dict would be something like this one, but customized: https://github.com/web-arena-x/webarena/blob/main/config_files/examples/2.json
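To make the List[Union[dict, str]] idea concrete, here is a minimal sketch of how the mixed input could be normalized into a list of dicts before being stored in self.task_configs. The helper name normalize_task_configs is hypothetical and not part of browsergym or webarena; it assumes each string is a JSON object.

```python
import json
from typing import List, Union


def normalize_task_configs(task_configs: List[Union[dict, str]]) -> List[dict]:
    """Hypothetical helper: turn a mix of dicts and JSON strings into a list of dicts."""
    normalized = []
    for i, conf in enumerate(task_configs):
        if isinstance(conf, str):
            # json.loads raises json.JSONDecodeError (a ValueError subclass) on invalid JSON
            conf = json.loads(conf)
        if not isinstance(conf, dict):
            raise ValueError(
                f"task_configs[{i}] must be a dict or a JSON object string, "
                f"got {type(conf).__name__}."
            )
        normalized.append(conf)
    if not normalized:
        raise ValueError("task_configs must contain at least one task config.")
    return normalized
```

With something like this, __init__ could skip reading test.raw.json entirely when custom configs are given. Whether the URL placeholders (e.g. __SHOPPING__) should still be substituted in custom configs is an open design question.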

Behind the scenes, if task_configs is passed (as dictionaries or JSON-compatible strings), setup would automatically use these custom tasks instead of the original test split:

def setup(self, page: playwright.sync_api.Page) -> tuple[str, dict]:
    # import webarena on instantiation
    from webarena.evaluation_harness.evaluators import evaluator_router

    # pick a task at random
    self.config = self.random.choice(self.task_configs)

    # hack: dynamically build a config file to read from
    with tempfile.NamedTemporaryFile(mode="w+", delete=False) as f:
        json.dump(self.config, f)
        f.flush()
        self.config_file = f.name
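Since setup only reads self.task_configs, most of the change would live in __init__, where the existing one-and-only-one check would have to cover three arguments instead of two. A minimal sketch of that check, assuming task_configs becomes a third mutually exclusive option (the function name check_task_selection is hypothetical):

```python
from typing import List, Optional, Union


def check_task_selection(
    task_id: Optional[int] = None,
    intent_template_id: Optional[int] = None,
    task_configs: Optional[List[Union[dict, str]]] = None,
) -> str:
    """Hypothetical check: return the name of the single selector that was provided."""
    candidates = {
        "task_id": task_id,
        "intent_template_id": intent_template_id,
        "task_configs": task_configs,
    }
    # compare against None so that task_id=0 still counts as provided
    provided = [name for name, value in candidates.items() if value is not None]
    if len(provided) != 1:
        raise ValueError(
            "One and only one of 'task_id', 'intent_template_id' and 'task_configs' "
            f"must be provided (got: {provided or 'none'})."
        )
    return provided[0]
```

The returned name could then drive the branch that builds self.task_configs: the two existing lookups in test.raw.json, or the custom configs as given.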
