
FEAT Add ToxicChat dataset loader#1422

Open
romanlutz wants to merge 1 commit into Azure:main from romanlutz:romanlutz/add-toxic-chat-dataset

Conversation

@romanlutz
Contributor

Add remote dataset loader for ToxicChat (lmsys/toxic-chat), containing ~10k real user-chatbot conversations from Chatbot Arena annotated for toxicity and jailbreaking attempts. Gracefully skips entries with Jinja2-incompatible content.

Copilot AI review requested due to automatic review settings March 1, 2026 14:20
@romanlutz romanlutz changed the title Add ToxicChat dataset loader FEAT Add ToxicChat dataset loader Mar 1, 2026
Contributor

Copilot AI left a comment


Pull request overview

Adds a new remote seed dataset loader for the HuggingFace lmsys/toxic-chat dataset so it can be discovered/loaded through PyRIT’s dataset provider framework, with accompanying unit tests and docs update.

Changes:

  • Introduces _ToxicChatDataset remote loader that converts ToxicChat rows into SeedPrompts and skips prompts that fail template parsing.
  • Adds unit tests for the ToxicChat loader.
  • Updates dataset-loading documentation output to include toxic_chat in the available dataset list.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Files changed:

  • pyrit/datasets/seed_datasets/remote/toxic_chat_dataset.py: New HuggingFace-backed loader that builds SeedDataset/SeedPrompt entries from ToxicChat rows and skips problematic entries.
  • tests/unit/datasets/test_toxic_chat_dataset.py: New unit tests validating basic loading behavior and custom HF config plumbing.
  • pyrit/datasets/seed_datasets/remote/__init__.py: Exposes/imports the new loader for discovery/registration.
  • doc/code/datasets/1_loading_datasets.ipynb: Updates the displayed list of available datasets to include toxic_chat.

Comments suppressed due to low confidence (1)

tests/unit/datasets/test_toxic_chat_dataset.py:78

  • For consistency with other dataset loader tests (e.g., DarkBench), this test should assert that split (and optionally cache) are forwarded to _fetch_from_huggingface in addition to dataset_name and config; otherwise regressions in argument plumbing won't be caught.
        with patch.object(loader, "_fetch_from_huggingface", return_value=mock_toxic_chat_data) as mock_fetch:
            dataset = await loader.fetch_dataset()

            assert len(dataset.seeds) == 2
            mock_fetch.assert_called_once()
            call_kwargs = mock_fetch.call_args.kwargs
            assert call_kwargs["dataset_name"] == "custom/toxic-chat"
            assert call_kwargs["config"] == "custom_config"

Comment on lines 98 to 102
    except Exception:
        skipped += 1
        logger.debug(
            f"Skipped ToxicChat entry due to template parsing error (e.g. HTML content)"
        )

Copilot AI Mar 1, 2026


This loader's key behavior is to "gracefully skip" entries that fail Jinja2 parsing during SeedPrompt construction, but the unit tests don't cover that path. Add a test case with a deliberately Jinja2-invalid user_input and assert that fetch_dataset returns the remaining valid entries (and does not raise).
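One possible shape for that test, sketched with stand-ins since the real `SeedPrompt` and loader classes live in PyRIT; the fake rows, the `"{% broken"` input, and the unbalanced-tag validity check are all illustrative assumptions, not the real dataset schema or parser:

```python
# Self-contained sketch of the missing skip-path test. FakeSeedPrompt stands in
# for PyRIT's SeedPrompt, which rejects Jinja2-invalid values at construction.
class TemplateError(Exception):
    pass

class FakeSeedPrompt:
    def __init__(self, value: str):
        # Stand-in validity check: unbalanced Jinja2 block tags fail to parse.
        if value.count("{%") != value.count("%}"):
            raise TemplateError(f"invalid template: {value!r}")
        self.value = value

def load_rows(rows):
    """Mirror the loader's loop: keep valid prompts, count and skip failures."""
    seeds, skipped = [], 0
    for item in rows:
        try:
            seeds.append(FakeSeedPrompt(item["user_input"]))
        except TemplateError:
            skipped += 1
    return seeds, skipped

rows = [{"user_input": "hello"}, {"user_input": "{% broken"}]
seeds, skipped = load_rows(rows)
assert len(seeds) == 1 and seeds[0].value == "hello"  # valid entry kept
assert skipped == 1  # invalid entry skipped, nothing raised
```

In the actual suite the same pattern would go through `loader.fetch_dataset()` with `_fetch_from_huggingface` patched to return the mixed rows, asserting that only the valid entries come back.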

Copilot generated this review using guidance from repository custom instructions.
Comment on lines 80 to 102
for item in data:
    try:
        seed_prompts.append(
            SeedPrompt(
                value=item["user_input"],
                data_type="text",
                dataset_name=self.dataset_name,
                description=description,
                source="https://huggingface.co/datasets/lmsys/toxic-chat",
                authors=authors,
                groups=["UC San Diego"],
                metadata={
                    "toxicity": str(item.get("toxicity", "")),
                    "jailbreaking": str(item.get("jailbreaking", "")),
                    "human_annotation": str(item.get("human_annotation", "")),
                },
            )
        )
    except Exception:
        skipped += 1
        logger.debug(
            f"Skipped ToxicChat entry due to template parsing error (e.g. HTML content)"
        )

Copilot AI Mar 1, 2026


The except Exception here is overly broad and will silently skip records for non-template-related issues (e.g., dataset schema changes like missing user_input). Consider catching the specific Jinja2 exception(s) raised during SeedPrompt initialization (e.g., TemplateSyntaxError) and re-raising unexpected exceptions; also include the underlying exception (and ideally conv_id) in the debug log via exc_info=True so skips are diagnosable.
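A sketch of the suggested narrowing; `TemplateSyntaxError` is Jinja2's real exception class, while `build_prompt` and the `conv_id` field are stand-ins (the review notes the real failures happen inside SeedPrompt construction):

```python
import logging

try:
    from jinja2.exceptions import TemplateSyntaxError  # the real Jinja2 exception
except ImportError:  # fallback stand-in so the sketch runs without jinja2
    class TemplateSyntaxError(Exception):
        pass

logger = logging.getLogger(__name__)

def build_prompt(item):
    # Stand-in for SeedPrompt construction, which parses the value as a
    # Jinja2 template and raises TemplateSyntaxError on bad input.
    if "{%" in item["user_input"]:
        raise TemplateSyntaxError("unexpected end of template", 1)
    return item["user_input"]

rows = [{"conv_id": "a1", "user_input": "hi"}, {"conv_id": "b2", "user_input": "{% oops"}]
seeds, skipped = [], 0
for item in rows:
    try:
        seeds.append(build_prompt(item))
    except TemplateSyntaxError:
        # Narrow catch: only template errors are skipped. Schema problems such
        # as a missing user_input key (KeyError) propagate instead of being
        # silently swallowed.
        skipped += 1
        logger.debug(
            "Skipped ToxicChat entry %s due to template parsing error",
            item.get("conv_id"),
            exc_info=True,  # attach the underlying traceback to the log record
        )
```

With `exc_info=True` and the entry id in the message, a skipped record is diagnosable from debug logs rather than disappearing without a trace.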

@romanlutz romanlutz force-pushed the romanlutz/add-toxic-chat-dataset branch from b66a27b to cb93f7f Compare March 1, 2026 14:37
Add remote dataset loader for ToxicChat (lmsys/toxic-chat), containing ~10k real
user-chatbot conversations from Chatbot Arena annotated for toxicity and
jailbreaking attempts. Gracefully skips entries with Jinja2-incompatible content.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@romanlutz romanlutz force-pushed the romanlutz/add-toxic-chat-dataset branch from cb93f7f to c63374c Compare March 1, 2026 14:54
Copilot AI review requested due to automatic review settings March 1, 2026 14:54
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

tests/unit/datasets/test_toxic_chat_dataset.py:78

  • This test configures split="test" but doesn't assert that fetch_dataset() passes split through to _fetch_from_huggingface (unlike similar tests for other datasets). Adding an assertion for call_kwargs["split"] (and optionally cache) would improve coverage of the custom-config behavior.
        loader = _ToxicChatDataset(
            dataset_name="custom/toxic-chat",
            config="custom_config",
            split="test",
        )

        with patch.object(loader, "_fetch_from_huggingface", return_value=mock_toxic_chat_data) as mock_fetch:
            dataset = await loader.fetch_dataset()

            assert len(dataset.seeds) == 2
            mock_fetch.assert_called_once()
            call_kwargs = mock_fetch.call_args.kwargs
            assert call_kwargs["dataset_name"] == "custom/toxic-chat"
            assert call_kwargs["config"] == "custom_config"
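The extra assertions the review asks for might look like the following, sketched against a minimal stand-in loader since the real `_ToxicChatDataset` lives in this PR (the kwarg names mirror the snippet above; the stand-in is synchronous for brevity):

```python
from unittest.mock import patch

class StubLoader:
    """Minimal stand-in for _ToxicChatDataset's fetch plumbing."""
    def __init__(self, dataset_name, config, split, cache=True):
        self.dataset_name = dataset_name
        self.config = config
        self.split = split
        self.cache = cache

    def _fetch_from_huggingface(self, **kwargs):
        raise NotImplementedError  # patched out in the test

    def fetch_dataset(self):
        # Forward all configuration to the fetch helper, as the loader should.
        return self._fetch_from_huggingface(
            dataset_name=self.dataset_name,
            config=self.config,
            split=self.split,
            cache=self.cache,
        )

loader = StubLoader(dataset_name="custom/toxic-chat", config="custom_config", split="test")
with patch.object(loader, "_fetch_from_huggingface", return_value=[]) as mock_fetch:
    loader.fetch_dataset()

kwargs = mock_fetch.call_args.kwargs
assert kwargs["dataset_name"] == "custom/toxic-chat"
assert kwargs["config"] == "custom_config"
# The review's point: also pin down split (and cache) so regressions in
# argument plumbing are caught, not just name/config.
assert kwargs["split"] == "test"
assert kwargs["cache"] is True
```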

Comment on lines +19 to +28
    "execution_count": 1,
    "id": "69b77122",
    "metadata": {
      "execution": {
        "iopub.execute_input": "2026-03-01T14:54:22.239868Z",
        "iopub.status.busy": "2026-03-01T14:54:22.239679Z",
        "iopub.status.idle": "2026-03-01T14:54:26.784712Z",
        "shell.execute_reply": "2026-03-01T14:54:26.784206Z"
      }
    },

Copilot AI Mar 1, 2026


The notebook was executed and now includes run-specific metadata (execution counts and per-cell execution timestamps). Per repo convention for docs notebooks, please clear outputs and reset execution_count to null (and remove execution timestamp metadata) before committing to avoid noisy diffs.
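One way to strip that run-specific state, sketched with the stdlib only (in practice the repo's tooling, e.g. `jupyter nbconvert --clear-output --inplace`, may already do this; the notebook dict below is a made-up minimal example):

```python
def clear_notebook(nb: dict) -> dict:
    """Reset execution_count, drop outputs, and remove per-cell execution
    timestamps so only the source survives in the committed .ipynb JSON."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["execution_count"] = None  # serializes as null
            cell["outputs"] = []
        cell.get("metadata", {}).pop("execution", None)
    return nb

# Illustrative notebook fragment with the kinds of noise flagged above.
nb = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {"execution": {"iopub.execute_input": "2026-03-01T14:54:22Z"}},
            "outputs": [{"output_type": "stream", "text": ["Loaded environment file: ...\n"]}],
            "source": ["print('hi')"],
        }
    ]
}
cleaned = clear_notebook(nb)
assert cleaned["cells"][0]["execution_count"] is None
assert cleaned["cells"][0]["outputs"] == []
assert "execution" not in cleaned["cells"][0]["metadata"]
```

This also addresses the next comment, since clearing outputs removes any machine-specific paths captured at execution time.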

Comment on lines +210 to +212
"Found default environment files: ['C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env', 'C:\\\\Users\\\\romanlutz\\\\.pyrit\\\\.env.local']\n",
"Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env\n",
"Loaded environment file: C:\\Users\\romanlutz\\.pyrit\\.env.local\n"

Copilot AI Mar 1, 2026


This committed output includes local machine/user-specific paths (e.g., C:\\Users\\romanlutz\\...) and environment file names, which shouldn't be checked into docs. Please clear cell outputs (and associated metadata) to avoid leaking local information.

cache=cache,
)

authors = ["Zi Lin", "Zihan Wang", "Yongqi Tong", "Yangkun Wang", "Yuxin Guo", "Yujia Wang", "Jingbo Shang"]

Copilot AI Mar 1, 2026


Line exceeds the project's max line length (Ruff E501, 120 chars). Please reformat the authors = [...] list across multiple lines so linting passes.

Suggested change

    authors = ["Zi Lin", "Zihan Wang", "Yongqi Tong", "Yangkun Wang", "Yuxin Guo", "Yujia Wang", "Jingbo Shang"]

    authors = [
        "Zi Lin",
        "Zihan Wang",
        "Yongqi Tong",
        "Yangkun Wang",
        "Yuxin Guo",
        "Yujia Wang",
        "Jingbo Shang",
    ]
