Move the current benchmark configs to yaml files#566

Open
R-Palazzo wants to merge 16 commits into feature/benchmark_launcher from issue-532-define-yaml-files

Conversation

@R-Palazzo
Collaborator

@R-Palazzo R-Palazzo commented Mar 5, 2026

Resolve #545
CU-86b8h52bh

It's a pretty big PR, thanks in advance for your review :)

Currently the BenchmarkLauncher only works for GCP.
To make it work on other compute services (e.g. AWS), we will have to define benchmark methods that take similar parameters to the GCP ones (with credentials/compute_config):

def _benchmark_compute_gcp(

@R-Palazzo R-Palazzo self-assigned this Mar 5, 2026
@sdv-team
Contributor

sdv-team commented Mar 5, 2026

@R-Palazzo R-Palazzo changed the title Issue 532 define yaml files Move the current benchmark configs to yaml files Mar 5, 2026
@R-Palazzo R-Palazzo marked this pull request as ready for review March 5, 2026 18:09
@R-Palazzo R-Palazzo requested a review from a team as a code owner March 5, 2026 18:09
compute_privacy_score: false

compute:
service: 'gcp'
Collaborator Author

In this PR, we only indicate which compute service to use. In a future issue, we will also move the compute config defined here inside this yaml file.

)


class BenchmarkLauncher:
Collaborator Author

Let me know if the naming works.
I think this class could also be named Benchmark, since it does more than launching: one will be able to manage and terminate instances from it.

Let me know your thoughts on it.

Member

I think that given the file name and context it makes sense; it is already defined in the issue and I think it is good. If you are looking for an alternative, it could be BenchmarkManager or BenchmarkRunner.

@codecov

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 95.28620% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.47%. Comparing base (0d8b7f8) to head (133a551).

Files with missing lines Patch % Lines
sdgym/_benchmark_launcher/benchmark_launcher.py 78.78% 7 Missing ⚠️
sdgym/_benchmark_launcher/_validation.py 96.06% 5 Missing ⚠️
sdgym/_benchmark_launcher/benchmark_config.py 98.21% 1 Missing ⚠️
sdgym/_benchmark_launcher/utils.py 98.43% 1 Missing ⚠️
Additional details and impacted files
@@                      Coverage Diff                       @@
##           feature/benchmark_launcher     #566      +/-   ##
==============================================================
+ Coverage                       82.41%   83.47%   +1.05%     
==============================================================
  Files                              33       38       +5     
  Lines                            2923     3195     +272     
==============================================================
+ Hits                             2409     2667     +258     
- Misses                            514      528      +14     
Flag Coverage Δ
integration 49.42% <0.33%> (-4.57%) ⬇️
unit 78.59% <95.28%> (+1.51%) ⬆️


@pvk-developer (Member) left a comment

I think this is looking very good. I would suggest going over _validation.py and trying to modularize it a little if you have the time, since the long functions in there are hard to navigate.

@R-Palazzo R-Palazzo force-pushed the issue-532-define-yaml-files branch from abe9001 to f670ba5 on March 10, 2026 10:03
@R-Palazzo R-Palazzo requested a review from pvk-developer March 11, 2026 12:41
@@ -0,0 +1,21 @@
method_params:
timeout: 345600
output_destination: 's3://sdgym-benchmark/Debug/Benchmark_Launcher/'
Collaborator Author

TODO: Update before merging

)
"
python -m pip install "sdgym[all]"
python -m pip install "sdgym[all] @ git+https://github.com/sdv-dev/SDGym.git@issue-532-define-yaml-files"
Collaborator Author

TODO: Revert before merging

@amontanez24 (Contributor) left a comment

I think we should restructure the credentials a bit. Let's define a format for the dict like:

gcp:
    GCP_SERVICE_ACCOUNT_JSON: ...
    GCP_SERVICE_ACCOUNT_PATH: ...
    GCP_PROJECT_ID: ...
    GCP_ZONE: ...
sdv_enterprise:
    SDV_ENTERPRISE_USERNAME: ...
    SDV_ENTERPRISE_LICENSE_KEY: ...
aws:
    AWS_ACCESS_KEY_ID: ...
    AWS_SECRET_ACCESS_KEY: ...

We then load this dict and check for expected keys. If a key isn't present, we try to load it from the environment. So if someone is running with GCP, we would attempt to load all the expected GCP keys from the dict and then check the environment for any the dict doesn't have.

If you want to make the names less redundant, you can remove the prefixes ('SDV', 'AWS', 'GCP') and say that if you are using environment variables you should store them as {service_name}_{key_name}.
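A minimal sketch of the lookup order proposed in this comment, assuming the key names suggested above (this is the reviewer's proposal, not the merged implementation; the helper name is illustrative):

```python
import os

# Expected keys per service, following the reviewer's suggested format.
_EXPECTED_KEYS = {
    'gcp': ['GCP_SERVICE_ACCOUNT_JSON', 'GCP_PROJECT_ID', 'GCP_ZONE'],
    'aws': ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'],
}

def resolve_credentials(service, credentials=None):
    """Resolve every expected key for ``service``: dict first, then env."""
    provided = (credentials or {}).get(service, {})
    resolved = {}
    missing = []
    for key in _EXPECTED_KEYS[service]:
        # Take the value from the user-provided dict if present,
        # otherwise fall back to the environment.
        value = provided.get(key) or os.environ.get(key)
        if value is None:
            missing.append(key)
        else:
            resolved[key] = value

    if missing:
        raise ValueError(f'Missing credentials for {service}: {missing}')

    return resolved
```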

compute:
service: 'gcp'

credential_locations:
Contributor

why is it called credential_locations?

Comment on lines +12 to +21
gcp:
service_account_json_env: 'GCP_SERVICE_ACCOUNT_JSON'
project_id_env: 'GCP_PROJECT_ID'
zone_env: 'GCP_ZONE'
sdv_enterprise:
username_env: 'SDV_ENTERPRISE_USERNAME'
license_key_env: 'SDV_ENTERPRISE_LICENSE_KEY'
aws:
access_key_id_env: 'AWS_ACCESS_KEY_ID'
secret_access_key_env: 'AWS_SECRET_ACCESS_KEY'
Contributor

I think we should either pass the credentials here, or just set them as env variables with the same name as the key. For example

  gcp:
    GCP_SERVICE_ACCOUNT_JSON: ...,
    GCP_PROJECT_ID: sdgym-123
    GCP_ZONE: ...

When loading credentials, we can either load from the config file or, if the key isn't present, load from the environment. I think that is a more common pattern.

def _get_credentials(credential_locations):
"""Get resolved credentials dict."""
config = credential_locations or {}
filepath = config.get('credential_filepath')
Contributor

In the case where the filepath is provided, is the structure of the credentials dict the same as credential_locations?

Comment on lines +169 to +183
sig = inspect.signature(method_to_run)
required = {
parameter.name
for parameter in sig.parameters.values()
if parameter.default is inspect.Parameter.empty
and parameter.kind
in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
}
required_from_yaml = required - _INJECTED_PARAMS
missing = required_from_yaml - set(method_params)
if missing:
errors.append(
f'method_params: missing required parameters for {method_to_run.__name__}:'
f' {sorted(missing)}'
)
Contributor

Is this necessary? Can't we just have defaults for whatever they don't provide? The only one that I can see as required is the output destination, and I don't think that one should be grouped with the other parameters.

Collaborator Author

I removed it in 133a551.

'method_params': dict of parameters to pass to the benchmark method (e.g. timeout),
'credentials': dict specifying how to resolve credentials (e.g. from env vars or a file),
'compute': dict specifying the compute configuration (e.g. service: 'gcp'),
'instance_jobs': list of dicts, each specifying a combination of synthesizers and datasets:
Contributor

I think the output destination should be specified with the jobs

Collaborator Author

I was thinking it makes more sense to have all the results for a benchmark in the same location, but you're right, we could also set one output_destination per instance_job. Both options work for me. Let me know which we prefer.
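As a minimal sketch, a config dict matching the docstring excerpted above might look like the following (key names beyond those in the docstring, such as the instance_jobs entry shape and the synthesizer/dataset names, are assumptions, not the final API):

```python
# Illustrative config mirroring the documented structure.
example_config = {
    'method_params': {'timeout': 345600},
    'credentials': {'credential_filepath': 'credentials.json'},
    'compute': {'service': 'gcp'},
    'instance_jobs': [
        # One possible shape for a job; it could also carry its own
        # output_destination if we go per-job, as discussed here.
        {'synthesizers': ['GaussianCopulaSynthesizer'], 'datasets': ['adult']},
    ],
}

# A validator would check the top-level keys are all present.
required_top_level = {'method_params', 'credentials', 'compute', 'instance_jobs'}
missing_keys = required_top_level - set(example_config)
```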


def _get_credentials(credential_locations):
"""Get resolved credentials dict."""
config = credential_locations or {}
Contributor

I'm not sure I understand why the credential locations can have the filepath or the env variable names. I think that at the point of validation, we should just have one credentials dict with all the actual values in it, ready to be evaluated.

Before then, I think we should have the user provide either the filepath or the dict itself. The dict shouldn't have the filepath in it.

Collaborator Author

I agree and I reviewed the logic here.

The YAML file now uses the credential_filepath key to point to a JSON file containing the resolved credentials, meaning the file already includes the actual credential values in the expected final dict structure. If credential_filepath is not provided, or if some credential values are missing, we build or complete the resolved credentials dict from environment variables instead.

As you pointed out, GCP is a special case. For GCP, we first look for a service account JSON file, and if it is not available, we fall back to reading the individual credential values from environment variables.

Once the resolved credentials dict is obtained, we validate it. Let me know if that was the idea here.

@R-Palazzo R-Palazzo requested a review from amontanez24 March 12, 2026 18:24
