FEAT Add OR-Bench dataset loader by romanlutz · Pull Request #1423 · Azure/PyRIT

romanlutz · 2026-03-01T14:25:35Z

Add remote dataset loader for OR-Bench (bench-llm/OR-Bench), an over-refusal benchmark that tests whether language models wrongly refuse safe prompts. Supports both or-bench-hard-1k and or-bench-toxic configurations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds a new remote dataset loader for the HuggingFace OR-Bench benchmark (bench-llm/OR-Bench) so it can be discovered via SeedDatasetProvider and loaded as SeedDataset seeds, with support for both the or-bench-hard-1k and or-bench-toxic configurations.

Changes:

- Introduces the `_ORBenchDataset` remote loader, which fetches OR-Bench from HuggingFace and converts rows into `SeedPrompt`s.
- Registers the new loader for automatic discovery and documents the new dataset name in the datasets loading notebook output.
- Adds unit tests covering default loading and the toxic config path.
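
The row-to-seed conversion listed above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from this PR: `SeedPromptSketch` and `rows_to_seeds` are hypothetical names, the `prompt` and `category` column names are assumptions about the dataset schema, and the real loader builds PyRIT `SeedPrompt` objects rather than this stand-in.

```python
from dataclasses import dataclass

# Configuration names as listed in the PR description.
SUPPORTED_CONFIGS = ("or-bench-hard-1k", "or-bench-toxic")


@dataclass(frozen=True)
class SeedPromptSketch:
    """Hypothetical stand-in for PyRIT's SeedPrompt (not the real class)."""

    value: str
    dataset_name: str
    harm_categories: tuple = ()


def rows_to_seeds(rows, config="or-bench-hard-1k"):
    """Convert raw rows (dicts) into seed objects, skipping empty prompts.

    Assumes each row has a "prompt" field and an optional "category" field.
    """
    if config not in SUPPORTED_CONFIGS:
        raise ValueError(f"Unsupported config: {config!r}")

    seeds = []
    for row in rows:
        text = (row.get("prompt") or "").strip()
        if not text:
            continue  # nothing usable in this row
        category = row.get("category")
        seeds.append(
            SeedPromptSketch(
                value=text,
                dataset_name="or_bench",
                harm_categories=(category,) if category else (),
            )
        )
    return seeds
```

Rejecting unknown configuration names up front keeps a typo from silently turning into a failed or empty remote fetch.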

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `pyrit/datasets/seed_datasets/remote/or_bench_dataset.py` | Implements the OR-Bench HuggingFace-backed dataset loader and maps records into `SeedPrompt`s. |
| `pyrit/datasets/seed_datasets/remote/__init__.py` | Imports/exports `_ORBenchDataset` to trigger provider registration and expose it from the remote loaders package. |
| `tests/unit/datasets/test_or_bench_dataset.py` | Adds unit tests validating prompt mapping and config propagation to the HuggingFace fetch helper. |
| `doc/code/datasets/1_loading_datasets.ipynb` | Updates the displayed list of available datasets to include `or_bench`. |
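
A config-propagation test of the kind described for `test_or_bench_dataset.py` might look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's test: `load_or_bench` is a hypothetical loader that takes the fetch helper as an injected argument, and the keyword names `dataset_name` and `config` are invented for the example and may not match PyRIT's actual helper.

```python
from unittest.mock import MagicMock


def load_or_bench(fetch, config="or-bench-hard-1k"):
    """Hypothetical loader: forwards the config to an injected fetch helper."""
    rows = fetch(dataset_name="bench-llm/OR-Bench", config=config)
    return [row["prompt"] for row in rows]


def test_toxic_config_is_forwarded():
    # The mock stands in for the remote fetch, so no network access is needed.
    fetch = MagicMock(
        return_value=[{"prompt": "example prompt", "category": "harassment"}]
    )

    prompts = load_or_bench(fetch, config="or-bench-toxic")

    # The loader must pass the requested config through unchanged.
    fetch.assert_called_once_with(
        dataset_name="bench-llm/OR-Bench", config="or-bench-toxic"
    )
    assert prompts == ["example prompt"]
```

Mocking the fetch helper keeps the test offline and fast while still checking the one thing that differs between the two configurations: the argument handed to the fetch call.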

Copilot AI review requested due to automatic review settings March 1, 2026 14:25

Copilot started reviewing on behalf of romanlutz March 1, 2026 14:26

romanlutz force-pushed the romanlutz/add-or-bench-dataset branch from fea917e to 5b341d2 March 1, 2026 14:26

Copilot AI reviewed Mar 1, 2026

romanlutz wants to merge 1 commit into Azure:main from romanlutz:romanlutz/add-or-bench-dataset
