Conversation
Pull request overview
Adds a new remote seed dataset loader for the HuggingFace lmsys/toxic-chat dataset so it can be discovered/loaded through PyRIT’s dataset provider framework, with accompanying unit tests and docs update.
Changes:
- Introduces a `_ToxicChatDataset` remote loader that converts ToxicChat rows into `SeedPrompt`s and skips prompts that fail template parsing.
- Adds unit tests for the ToxicChat loader.
- Updates the dataset-loading documentation output to include `toxic_chat` in the available dataset list.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `pyrit/datasets/seed_datasets/remote/toxic_chat_dataset.py` | New HuggingFace-backed loader that builds `SeedDataset`/`SeedPrompt` entries from ToxicChat rows and skips problematic entries. |
| `tests/unit/datasets/test_toxic_chat_dataset.py` | New unit tests validating basic loading behavior and custom HF config plumbing. |
| `pyrit/datasets/seed_datasets/remote/__init__.py` | Exposes/imports the new loader for discovery/registration. |
| `doc/code/datasets/1_loading_datasets.ipynb` | Updates the displayed list of available datasets to include `toxic_chat`. |
Comments suppressed due to low confidence (1)
tests/unit/datasets/test_toxic_chat_dataset.py:78
- For consistency with other dataset loader tests (e.g., DarkBench), this test should assert that `split` (and optionally `cache`) are forwarded to `_fetch_from_huggingface` in addition to `dataset_name` and `config`; otherwise regressions in argument plumbing won't be caught.
```python
with patch.object(loader, "_fetch_from_huggingface", return_value=mock_toxic_chat_data) as mock_fetch:
    dataset = await loader.fetch_dataset()

assert len(dataset.seeds) == 2
mock_fetch.assert_called_once()
call_kwargs = mock_fetch.call_args.kwargs
assert call_kwargs["dataset_name"] == "custom/toxic-chat"
assert call_kwargs["config"] == "custom_config"
```
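A minimal sketch of the extra assertions the comment asks for, assuming the loader forwards `split` and `cache` to `_fetch_from_huggingface` as keyword arguments (as the DarkBench tests do). The `MagicMock` call here only simulates the patched fetch so the snippet runs standalone; in the real test, `mock_fetch` comes from the `patch.object(...)` context above.

```python
from unittest.mock import MagicMock

# Simulated stand-in for the patched _fetch_from_huggingface; the kwargs
# mirror what the loader is expected to forward.
mock_fetch = MagicMock()
mock_fetch(
    dataset_name="custom/toxic-chat",
    config="custom_config",
    split="test",
    cache=True,
)

call_kwargs = mock_fetch.call_args.kwargs
assert call_kwargs["dataset_name"] == "custom/toxic-chat"
assert call_kwargs["config"] == "custom_config"
# The suggested additions: catch regressions in argument plumbing.
assert call_kwargs["split"] == "test"
assert call_kwargs["cache"] is True
```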
```python
except Exception:
    skipped += 1
    logger.debug(
        f"Skipped ToxicChat entry due to template parsing error (e.g. HTML content)"
    )
```
This loader's key behavior is to "gracefully skip" entries that fail Jinja2 parsing during SeedPrompt construction, but the unit tests don't cover that path. Add a test case with a deliberately Jinja2-invalid user_input and assert that fetch_dataset returns the remaining valid entries (and does not raise).
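A standalone sketch of the suggested test idea. The `parse` helper is a stand-in for `SeedPrompt` construction so the snippet runs without PyRIT installed; in the real test, the invalid row would carry a deliberately Jinja2-broken `user_input` (e.g. `"{% broken"`) and the loop would be exercised through `fetch_dataset()`.

```python
def parse(value: str) -> str:
    # Stand-in for SeedPrompt's Jinja2 template parsing: raise on an
    # unclosed block, accept everything else.
    if "{%" in value and "%}" not in value:
        raise ValueError("unclosed Jinja2 block")
    return value

rows = [
    {"user_input": "hello world"},        # valid
    {"user_input": "{% broken"},          # deliberately Jinja2-invalid
    {"user_input": "another valid row"},  # valid
]

seeds, skipped = [], 0
for item in rows:
    try:
        seeds.append(parse(item["user_input"]))
    except ValueError:
        skipped += 1

# fetch_dataset() should return the remaining valid entries and not raise.
assert len(seeds) == 2
assert skipped == 1
```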
```python
for item in data:
    try:
        seed_prompts.append(
            SeedPrompt(
                value=item["user_input"],
                data_type="text",
                dataset_name=self.dataset_name,
                description=description,
                source="https://huggingface.co/datasets/lmsys/toxic-chat",
                authors=authors,
                groups=["UC San Diego"],
                metadata={
                    "toxicity": str(item.get("toxicity", "")),
                    "jailbreaking": str(item.get("jailbreaking", "")),
                    "human_annotation": str(item.get("human_annotation", "")),
                },
            )
        )
    except Exception:
        skipped += 1
        logger.debug(
            f"Skipped ToxicChat entry due to template parsing error (e.g. HTML content)"
        )
```
The except Exception here is overly broad and will silently skip records for non-template-related issues (e.g., dataset schema changes like missing user_input). Consider catching the specific Jinja2 exception(s) raised during SeedPrompt initialization (e.g., TemplateSyntaxError) and re-raising unexpected exceptions; also include the underlying exception (and ideally conv_id) in the debug log via exc_info=True so skips are diagnosable.
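A sketch of the narrower handling the comment suggests. `TemplateSyntaxError` is the real Jinja2 exception class; a stand-in class and a `build_prompt` helper are defined here only so the snippet runs without Jinja2 or PyRIT installed.

```python
import logging

logger = logging.getLogger(__name__)

class TemplateSyntaxError(Exception):
    """Stand-in for jinja2.TemplateSyntaxError, for a self-contained sketch."""

def build_prompt(item: dict) -> str:
    # Stand-in for SeedPrompt(...) construction; raises on invalid templates.
    # A missing "user_input" key now raises KeyError instead of being
    # silently swallowed.
    if "{%" in item["user_input"]:
        raise TemplateSyntaxError("unexpected end of template")
    return item["user_input"]

data = [
    {"conv_id": "a1", "user_input": "fine"},
    {"conv_id": "b2", "user_input": "{% broken"},
    {"conv_id": "c3", "user_input": "also fine"},
]

seed_prompts, skipped = [], 0
for item in data:
    try:
        seed_prompts.append(build_prompt(item))
    except TemplateSyntaxError:
        # Catch only the template error; include conv_id and the traceback
        # (exc_info=True) so skips are diagnosable. Anything else propagates.
        skipped += 1
        logger.debug(
            "Skipped ToxicChat entry %s due to template parsing error",
            item.get("conv_id"),
            exc_info=True,
        )
```

The key differences from the reviewed code: only the template exception is swallowed, and the debug record carries the conversation ID plus the underlying traceback.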
Force-pushed from b66a27b to cb93f7f.
Add remote dataset loader for ToxicChat (lmsys/toxic-chat), containing ~10k real user-chatbot conversations from Chatbot Arena annotated for toxicity and jailbreaking attempts. Gracefully skips entries with Jinja2-incompatible content. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Force-pushed from cb93f7f to c63374c.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
tests/unit/datasets/test_toxic_chat_dataset.py:78
- This test configures `split="test"` but doesn't assert that `fetch_dataset()` passes `split` through to `_fetch_from_huggingface` (unlike similar tests for other datasets). Adding an assertion for `call_kwargs["split"]` (and optionally `cache`) would improve coverage of the custom-config behavior.
```python
loader = _ToxicChatDataset(
    dataset_name="custom/toxic-chat",
    config="custom_config",
    split="test",
)
with patch.object(loader, "_fetch_from_huggingface", return_value=mock_toxic_chat_data) as mock_fetch:
    dataset = await loader.fetch_dataset()

assert len(dataset.seeds) == 2
mock_fetch.assert_called_once()
call_kwargs = mock_fetch.call_args.kwargs
assert call_kwargs["dataset_name"] == "custom/toxic-chat"
assert call_kwargs["config"] == "custom_config"
```
```json
"execution_count": 1,
"id": "69b77122",
"metadata": {
  "execution": {
    "iopub.execute_input": "2026-03-01T14:54:22.239868Z",
    "iopub.status.busy": "2026-03-01T14:54:22.239679Z",
    "iopub.status.idle": "2026-03-01T14:54:26.784712Z",
    "shell.execute_reply": "2026-03-01T14:54:26.784206Z"
  }
},
```
The notebook was executed and now includes run-specific metadata (execution counts and per-cell execution timestamps). Per repo convention for docs notebooks, please clear outputs and reset execution_count to null (and remove execution timestamp metadata) before committing to avoid noisy diffs.
```json
"Found default environment files: ['C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env', 'C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env.local']\n",
"Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env\n",
"Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env.local\n"
```
This committed output includes local machine/user-specific paths (e.g., C:\\Users\\romanlutz\\...) and environment file names, which shouldn't be checked into docs. Please clear cell outputs (and associated metadata) to avoid leaking local information.
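The two comments above can be addressed before committing with `jupyter nbconvert --clear-output --inplace doc/code/datasets/1_loading_datasets.ipynb`, which strips outputs and execution counts. A minimal sketch of the same cleanup done programmatically on a notebook's JSON structure (the sample notebook dict below is illustrative, not from the repo):

```python
def clear_run_metadata(nb: dict) -> dict:
    """Strip outputs, execution counts, and per-cell execution timing
    metadata from a notebook dict, in place."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
        # Remove the run-specific "execution" timestamps, if present.
        cell.get("metadata", {}).pop("execution", None)
    return nb

# Illustrative notebook fragment with the kinds of run-specific fields
# flagged in the review.
nb = {
    "cells": [{
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {"execution": {"iopub.execute_input": "2026-03-01T14:54:22Z"}},
        "outputs": [{"output_type": "stream", "text": ["Loaded environment file\n"]}],
        "source": ["print('hello')"],
    }],
}
cleaned = clear_run_metadata(nb)
assert cleaned["cells"][0]["execution_count"] is None
assert cleaned["cells"][0]["outputs"] == []
assert "execution" not in cleaned["cells"][0]["metadata"]
```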
```python
    cache=cache,
)

authors = ["Zi Lin", "Zihan Wang", "Yongqi Tong", "Yangkun Wang", "Yuxin Guo", "Yujia Wang", "Jingbo Shang"]
```
Line exceeds the project's max line length (Ruff E501, 120 chars). Please reformat the authors = [...] list across multiple lines so linting passes.
Suggested change:

```python
authors = [
    "Zi Lin",
    "Zihan Wang",
    "Yongqi Tong",
    "Yangkun Wang",
    "Yuxin Guo",
    "Yujia Wang",
    "Jingbo Shang",
]
```