Change how to run multiple domains #100

Open
JosephMarinier wants to merge 4 commits into main from joseph/revert-multi-domains


@JosephMarinier JosephMarinier commented Apr 29, 2026

The current implementation for running multiple domains doesn't scale to running multiple perturbations, models, or other configurations. Also, that code felt like fighting against Pydantic. So, I suggest:

  1. Revert "Allow running multiple domains" for now, so we can think of a more generalizable and Pydantic-friendly implementation.
  2. Document how to run multiple domains, models, etc. using simple shell loops. Although not a first-class EVA feature, this is quite simple and generalizable.

What do you think?

Running Multiple Configurations

Here is an example of a shell loop that sweeps over domains, models, or any combination of parameters.
Each iteration is an independent eva run. The loop continues on failure and exits with the last non-zero exit code.

exit_code=0;
for domain in airline itsm medical_hr; do
    for llm in gpt-5-mini gpt-5; do
        eva --domain "$domain" --model.llm "$llm" || exit_code=$?;
    done;
done;
exit $exit_code

💡 If you need a single command, like in Docker, you can wrap the shell script with sh -c '...'.
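
To preview a sweep before launching it, the same loop can be wrapped in a small function that takes the command as its arguments. This is only a sketch: the function name `sweep` and the dry-run idea are illustrative assumptions, not part of EVA; only the `--domain` and `--model.llm` flags come from the example above.

```shell
# sweep CMD...: run CMD once per (domain, llm) pair, continue on failure,
# and return the last non-zero exit code (0 if every run succeeded).
# `sweep` is a hypothetical helper, not an EVA feature.
sweep() {
    exit_code=0
    for domain in airline itsm medical_hr; do
        for llm in gpt-5-mini gpt-5; do
            "$@" --domain "$domain" --model.llm "$llm" || exit_code=$?
        done
    done
    return "$exit_code"
}

# Dry run: print each command instead of executing it.
sweep echo eva
```

Calling `sweep eva` instead of `sweep echo eva` would perform the actual runs, assuming `eva` is on the `PATH`.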
