
Add a script that launches a benchmark from a yaml file or a set of parameters #546

@R-Palazzo


Problem Description

Once #545 is resolved, benchmark configurations will live in YAML files and can be loaded, validated, and executed via BenchmarkConfig and BenchmarkLauncher. The next step is to add a script that launches a benchmark from either a YAML config file or a set of pre-defined parameters.

Expected behavior

Provide a simple way to execute benchmarks from a set of parameters:

invoke benchmark-launcher \
  --modality single_table \
  --timeout 345600 \
  --datasets 'adult,alarm,census' \
  --synthesizers 'UniformSynthesizer,TVAESynthesizer' \
  --num_instances 4 \
  --output_destination 's3://sdgym-benchmark/Debug'

Or from a config file:

invoke benchmark-launcher --config_filepath 'config.yaml'
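As a rough illustration of how the script could tell the two invocation styles apart, here is a minimal Python sketch. The helper name `resolve_launch_mode` and its return shape are hypothetical and not part of SDGym. Rejecting a call that provides neither a config file nor parameters is also an assumption; the issue only states that the two must not be combined.

```python
def resolve_launch_mode(config_filepath=None, **parameters):
    """Decide whether to launch from a config file or from explicit parameters.

    Hypothetical helper: parameters left as None are treated as "not given".
    """
    given = {name: value for name, value in parameters.items() if value is not None}

    if config_filepath is not None and given:
        # The issue requires that config_filepath and parameters are not mixed.
        raise ValueError(
            "Pass either 'config_filepath' or individual parameters, not both. "
            f"Got parameters: {sorted(given)}"
        )

    if config_filepath is None and not given:
        # Assumption: launching with no input at all is treated as an error.
        raise ValueError("Pass either 'config_filepath' or individual parameters.")

    if config_filepath is not None:
        return ('config', config_filepath)

    return ('parameters', given)
```

With this sketch, `resolve_launch_mode(config_filepath='config.yaml')` selects the config path, while passing `config_filepath` together with, say, `num_instances=4` raises a `ValueError`.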

Additional context

  • It should not be possible to pass config_filepath together with individual parameters.
  • If num_instances is given, exactly that number of instances must be launched.
    • For now, a simple rule can be used to build the instance_jobs split:
      • First split by synthesizer, keeping all datasets together.
      • If more splits are still needed to reach the number of instances, split the dataset list for a given synthesizer.
      • This logic ensures the correct number of instances is launched without job redundancy.
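One possible reading of the splitting rule above is sketched below in Python. The function names and the job format (a dict with `synthesizers` and `datasets` lists) are hypothetical, since the real instance_jobs structure is defined by the work in #545. The sketch also assumes that when num_instances is smaller than the number of synthesizers, synthesizers are grouped together, and that requesting more instances than synthesizer/dataset pairs is an error, because it could not be satisfied without redundant jobs.

```python
def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for index in range(parts):
        size = base + (1 if index < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


def build_instance_jobs(synthesizers, datasets, num_instances):
    """Build exactly `num_instances` jobs with no repeated (synthesizer, dataset) pair."""
    max_instances = len(synthesizers) * len(datasets)
    if not 1 <= num_instances <= max_instances:
        raise ValueError(
            f"num_instances must be between 1 and {max_instances}, got {num_instances}."
        )

    if num_instances <= len(synthesizers):
        # Split by synthesizer only; every job keeps the full dataset list.
        return [
            {'synthesizers': group, 'datasets': list(datasets)}
            for group in split_evenly(synthesizers, num_instances)
        ]

    # More instances than synthesizers: give each synthesizer one or more jobs
    # and split the dataset list across that synthesizer's jobs.
    base, extra = divmod(num_instances, len(synthesizers))
    jobs = []
    for index, synthesizer in enumerate(synthesizers):
        parts = base + (1 if index < extra else 0)
        for chunk in split_evenly(datasets, parts):
            jobs.append({'synthesizers': [synthesizer], 'datasets': chunk})
    return jobs


# Comma-separated CLI values, as in the example command, can be parsed with str.split.
jobs = build_instance_jobs(
    'UniformSynthesizer,TVAESynthesizer'.split(','),
    'adult,alarm,census'.split(','),
    num_instances=4,
)
```

For the example command in this issue (2 synthesizers, 3 datasets, 4 instances), this produces four jobs: each synthesizer gets one job for adult and alarm and another for census.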


Labels

feature request (Request for a new feature), internal (The issue doesn't change the API or functionality)
