FEAT Add SALAD-Bench dataset loader by romanlutz · Pull Request #1425 · Azure/PyRIT

romanlutz · 2026-03-01T14:31:26Z

Add remote dataset loader for SALAD-Bench (walledai/SaladBench), a hierarchical safety benchmark with ~30k prompts organized into 6 domains, 16 tasks, and 65+ categories (ACL 2024).

Copilot

Pull request overview

Adds a new remote seed dataset loader for the SALAD-Bench HuggingFace dataset, making it available through PyRIT’s automatic SeedDatasetProvider discovery and documenting it in the dataset-loading guide.

Changes:

Added _SaladBenchDataset remote loader that fetches SALAD-Bench from HuggingFace and converts rows into SeedPrompts.
Registered the loader for auto-discovery via pyrit.datasets.seed_datasets.remote.__init__.
Added unit tests and updated the “Loading Built-in Datasets” notebook to show the new dataset name.
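
For readers unfamiliar with import-time registration, the sketch below shows one common way that importing a class from a package's `__init__` can make it discoverable. It is an illustration under stated assumptions, not PyRIT's actual implementation: `DatasetProvider`, `SaladBenchSketch`, the `registry` attribute, and the name `"salad_bench"` are all hypothetical stand-ins.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class DatasetProvider(ABC):
    """Hypothetical base class whose subclasses register themselves when defined."""

    # Maps a dataset name to the provider class that loads it.
    registry: dict[str, type[DatasetProvider]] = {}
    dataset_name: str

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs when the subclass body is executed, i.e. when its module is
        # imported. A loader that is never imported is never registered.
        DatasetProvider.registry[cls.dataset_name] = cls

    @abstractmethod
    def fetch(self) -> list[str]:
        """Return the prompts for this dataset."""


class SaladBenchSketch(DatasetProvider):
    """Hypothetical stand-in for a remote loader; performs no network access."""

    dataset_name = "salad_bench"  # assumed name, for illustration only

    def fetch(self) -> list[str]:
        return ["placeholder prompt"]


# Discovery then reduces to a dictionary lookup.
provider_cls = DatasetProvider.registry["salad_bench"]
prompts = provider_cls().fetch()
```

Under this pattern, the import in the package `__init__` is what triggers registration, which would explain why a loader module needs to be imported there to become discoverable.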

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
`pyrit/datasets/seed_datasets/remote/salad_bench_dataset.py`	New HuggingFace-backed loader that maps SALAD-Bench entries into `SeedDataset`/`SeedPrompt`.
`pyrit/datasets/seed_datasets/remote/__init__.py`	Imports/exports `_SaladBenchDataset` so it’s registered and discoverable.
`tests/unit/datasets/test_salad_bench_dataset.py`	Unit tests validating dataset fetching and config passthrough behavior.
`doc/code/datasets/1_loading_datasets.ipynb`	Documentation notebook updated to reflect the new dataset in the available list (but currently includes executed outputs/metadata).

Comments suppressed due to low confidence (1)

pyrit/datasets/seed_datasets/remote/salad_bench_dataset.py:74

The authors list formatting is inconsistent with other remote dataset loaders and is hard to read (and likely exceeds the repo’s 120-char line length). Please format the authors list across multiple lines (one author per line) like other dataset loaders for readability and consistent styling.

            dataset_name=self.hf_dataset_name,
            config=self.config,

Copilot · 2026-03-01T14:35:44Z

doc/code/datasets/1_loading_datasets.ipynb

+   "execution_count": 1,
+   "id": "bf7e4f32",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-03-01T14:32:27.367046Z",
+     "iopub.status.busy": "2026-03-01T14:32:27.366836Z",
+     "iopub.status.idle": "2026-03-01T14:32:31.994319Z",
+     "shell.execute_reply": "2026-03-01T14:32:31.993825Z"
+    }
+   },
   "outputs": [


This notebook now contains executed state (non-null execution_count, rich outputs, and execution timestamps). Please clear outputs and reset execution_count to null before committing so docs remain deterministic and don’t include run-specific noise.
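
A minimal sketch of the cleanup requested above, using only the standard library and the standard `.ipynb` JSON layout (a top-level `cells` list whose code cells carry `outputs`, `execution_count`, and `metadata`). The function names are illustrative and not part of PyRIT; the repository may already have its own tooling for this step.

```python
import json
from pathlib import Path


def clear_notebook_outputs(notebook: dict) -> dict:
    """Strip outputs and execution state from every code cell (modifies in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        cell["outputs"] = []
        cell["execution_count"] = None  # serialized as null in the .ipynb file
        # Drop the per-run timestamps stored under metadata.execution.
        cell.get("metadata", {}).pop("execution", None)
    return notebook


def clear_notebook_file(path: Path) -> None:
    """Read a notebook from disk, clear it, and write it back."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_notebook_outputs(notebook)
    path.write_text(
        json.dumps(cleaned, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `clear_notebook_file(Path("doc/code/datasets/1_loading_datasets.ipynb"))` before committing would remove the execution metadata shown in the diff above. `jupyter nbconvert --clear-output --inplace <notebook>` is an existing command-line alternative for clearing outputs.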

Copilot · 2026-03-01T14:35:45Z

doc/code/datasets/1_loading_datasets.ipynb

+      "Found default environment files: ['C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env', 'C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env.local']\n",
+      "Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env\n",
+      "Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env.local\n"


The committed notebook output includes user/machine-specific local paths (e.g., C:\\Users\\... and .env locations). Please remove these outputs/paths (by clearing cell outputs) to avoid leaking local environment details into the repo.
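
As a complement to clearing outputs, a small check can flag user-specific paths before a notebook is committed. The sketch below is a heuristic only: the regular expression and the function name `find_local_paths` are assumptions made for illustration, and it inspects only stream outputs (the `text` field), not rich `data` payloads.

```python
import re

# Heuristic: Windows "C:\Users\<name>" (single or doubled backslashes),
# macOS "/Users/<name>", and Linux "/home/<name>".
_LOCAL_PATH = re.compile(
    r"[A-Za-z]:\\+Users\\+\w[\w.-]*|/(?:Users|home)/\w[\w.-]*"
)


def find_local_paths(notebook: dict) -> list[tuple[int, str]]:
    """Return (cell index, matched path prefix) pairs found in stream outputs."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        for output in cell.get("outputs", []):
            text = output.get("text", "")
            # Notebook JSON stores stream text as a string or a list of lines.
            if isinstance(text, list):
                text = "".join(text)
            for match in _LOCAL_PATH.finditer(text):
                hits.append((index, match.group(0)))
    return hits
```

A non-empty result indicates that the notebook still contains outputs worth clearing.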

Add remote dataset loader for SALAD-Bench (walledai/SaladBench), a hierarchical safety benchmark with ~30k prompts organized into 6 domains, 16 tasks, and 65+ categories (ACL 2024). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 1, 2026 14:31

Copilot started reviewing on behalf of romanlutz March 1, 2026 14:31

romanlutz force-pushed the romanlutz/add-salad-bench-dataset branch from 95af585 to 99ab63b March 1, 2026 14:32

Add SALAD-Bench dataset loader

7db0e9c

romanlutz force-pushed the romanlutz/add-salad-bench-dataset branch from 99ab63b to 7db0e9c March 1, 2026 14:53

romanlutz wants to merge 1 commit into Azure:main from romanlutz:romanlutz/add-salad-bench-dataset
