FEAT Embed schema in SelfAskRefusalScorer #1432
riedgar-ms wants to merge 7 commits into Azure:main
Conversation
Pull request overview
This PR embeds a JSON response schema into the SelfAskRefusalScorer’s seed YAML and plumbs that schema through scoring so compatible prompt targets can request schema-constrained JSON output.
Changes:
- Add `response_json_schema` to `SeedPrompt` and populate it in the default refusal scorer YAML.
- Update `SelfAskRefusalScorer` to load and forward the schema into `_score_value_with_llm`.
- Extend `_score_value_with_llm` to accept an optional schema and attach it to `prompt_metadata` for JSON-response-capable targets.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| pyrit/score/true_false/self_ask_refusal_scorer.py | Loads schema from the seed prompt and forwards it into LLM scoring + identifier params. |
| pyrit/score/scorer.py | Adds schema parameter and injects it into request metadata for JSON response formatting. |
| pyrit/models/seeds/seed_prompt.py | Introduces a new optional response_json_schema field on SeedPrompt. |
| pyrit/datasets/score/refusal/refusal_default.yaml | Defines the refusal scorer response JSON schema and tightens the schema text in the prompt. |
    category=self._score_category,
    objective=objective,
    attack_identifier=message_piece.attack_identifier,
    response_json_schema=self._response_json_schema,
)
New behavior: the scorer now forwards a JSON schema into _score_value_with_llm, which changes the request metadata sent to targets that support JSON schema response formatting. Add/extend a unit test (e.g., in tests/unit/score/test_self_ask_refusal.py) to assert the call includes the expected prompt_metadata["json_schema"] (and that it’s correctly serialized) so regressions are caught.
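Such a test could be sketched as follows. This is a hypothetical, self-contained illustration of the assertion pattern: the `score_with` helper stands in for `_score_value_with_llm`, and the target is a plain `AsyncMock` rather than a real prompt target.

```python
import asyncio
import json
from unittest.mock import AsyncMock

# Schema used by the hypothetical scorer under test.
SCHEMA = {"type": "object", "properties": {"score_value": {"type": "string"}}}

async def score_with(target, schema):
    # Stand-in for _score_value_with_llm: serialize the schema into the
    # outgoing request metadata (names mirror the PR; target is a mock).
    prompt_metadata = {"response_format": "json", "json_schema": json.dumps(schema)}
    await target.send_prompt_async(prompt_metadata=prompt_metadata)

target = AsyncMock()
asyncio.run(score_with(target, SCHEMA))

# Assert the mock received the correctly serialized schema metadata.
_, kwargs = target.send_prompt_async.call_args
assert json.loads(kwargs["prompt_metadata"]["json_schema"]) == SCHEMA
```

Inspecting `call_args` on the mock is what closes the gap noted later in the PR description, where `send_prompt_async` is currently not validated at all.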
attack_identifier (Optional[ComponentIdentifier]): The attack identifier.
    Defaults to None.
response_json_schema (Optional[dict[str, str]]): An optional JSON schema (not just dict[str, str])
    to validate the response against. Defaults to None.
Docstring says the schema is used "to validate the response against", but _score_value_with_llm only forwards schema metadata to the target (which may constrain generation) and does not perform any local JSON Schema validation of the returned payload. Reword to reflect actual behavior (request/constraint) or add explicit validation if that’s the intent.
-    to validate the response against. Defaults to None.
+    provided to the target to guide or constrain the JSON structure of the response. Defaults to None.
metadata_output_key: str = "metadata",
category_output_key: str = "category",
attack_identifier: Optional[ComponentIdentifier] = None,
response_json_schema: Optional[dict[str, str]] = None,
response_json_schema is typed as dict[str, str], but the schema being passed around is a nested JSON object (dicts/lists/bools). This type is inaccurate and makes it easy to misuse (and contradicts the docstring note that it’s “not just dict[str, str]”). Widen this to something like dict[str, Any] (or a dedicated JSON type alias) so the signature reflects actual values.
-response_json_schema: Optional[dict[str, str]] = None,
+response_json_schema: Optional[dict[str, Any]] = None,
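An alternative to `dict[str, Any]` is the dedicated JSON type alias mentioned in the comment. A minimal sketch (the alias names `JSONValue`/`JSONObject` and the simplified function are hypothetical, not from the PR):

```python
from typing import Dict, List, Optional, Union

# Python's typing module has no built-in JSON type, so arbitrary JSON
# values need an explicit recursive Union (supported by recent mypy).
JSONValue = Union[None, bool, int, float, str, List["JSONValue"], Dict[str, "JSONValue"]]
JSONObject = Dict[str, JSONValue]

# The widened signature then models nested schemas accurately:
def score_value_with_llm(response_json_schema: Optional[JSONObject] = None) -> None:
    pass

# A nested schema (dicts, lists, booleans) now type-checks cleanly:
schema: JSONObject = {
    "type": "object",
    "required": ["score_value", "rationale"],
    "additionalProperties": False,
}
score_value_with_llm(response_json_schema=schema)
```

Defining the alias once (e.g. near `json_helper.py`) would let `SeedPrompt`, the scorer, and `_score_value_with_llm` all share one accurate type.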
# The 'cast' here is ugly, but is in the pattern of json_helper.py
# Fundamentally, Python does not offer anything in Typing to represent
# JSON structures
prompt_metadata["json_schema"] = cast("str", response_json_schema)
cast("str", response_json_schema) does not convert the schema to a string; it only silences type checking while still storing a dict in prompt_metadata. This is misleading and makes the metadata type contract unclear. Prefer serializing with json.dumps(response_json_schema) (and keep prompt_metadata values primitive), or explicitly widen the metadata typing/contract if nested objects are intended.
-# The 'cast' here is ugly, but is in the pattern of json_helper.py
-# Fundamentally, Python does not offer anything in Typing to represent
-# JSON structures
-prompt_metadata["json_schema"] = cast("str", response_json_schema)
+# Store the JSON schema as a serialized string to keep prompt_metadata values primitive
+prompt_metadata["json_schema"] = json.dumps(response_json_schema)
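A minimal sketch of why serialization is preferable to the cast (the schema contents here are made up for illustration):

```python
import json

# Serialize the nested schema once, so every prompt_metadata value
# really is a primitive string rather than a dict hidden behind a cast.
response_json_schema = {
    "type": "object",
    "properties": {"score_value": {"type": "string"}},
    "additionalProperties": False,
}
prompt_metadata = {"response_format": "json"}
prompt_metadata["json_schema"] = json.dumps(response_json_schema)

# No cast needed: the stored value is genuinely a str, and the consumer
# can recover the original structure losslessly.
assert isinstance(prompt_metadata["json_schema"], str)
assert json.loads(prompt_metadata["json_schema"]) == response_json_schema
```

With `cast("str", ...)`, by contrast, the runtime value stays a dict and only the type checker is silenced, so any consumer treating the metadata value as a string would fail.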
# Optional JSON schema for constraining the response
# Not actually dict[str,str], necessarily, but a full JSON object.
# Type follows pattern from json_helper.py since Python's `typing`
# does not include the concept of a generic JSON object.
response_json_schema: Optional[dict[str, str]] = None
response_json_schema is declared as Optional[dict[str, str]], but the YAML schema content is a nested JSON object (contains dicts, lists, booleans). Update this to Optional[dict[str, Any]] (or a project-wide JSON type alias) to accurately model the data and avoid incorrect typing downstream.
# If present, the following will be a full JSON object, not
# just a dict[str,str]. We are following the pattern from
# json_helper.py for representing JSON schemas as dicts.
self._response_json_schema = seed_prompt.response_json_schema
This comment notes the schema is a “full JSON object”, but the type flowing from SeedPrompt.response_json_schema is currently dict[str, str], which doesn’t match nested schema structures. Once the SeedPrompt field type is widened, consider annotating self._response_json_schema accordingly (e.g., Optional[dict[str, Any]]) to keep SelfAskRefusalScorer’s internal state and identifier params consistent.
I'm not massively keen on how I've hidden the type of the schema (and this hiding has provoked most of Copilot's comments). The solution would be defining a `JSONObject` type alias.
Start the response with **only** a valid JSON object.

Your response should contain a JSON response with the following keys: "score_value", "rationale"
- "score_value" (str): The True or False result as described above. This could be "True" or "False".
Out of scope for this PR, but... JSON does define a boolean type.
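To illustrate the point: if `score_value` were a native JSON boolean rather than the strings "True"/"False", the schema could declare it directly. This fragment is purely illustrative and not part of the PR:

```python
import json

# Hypothetical schema using JSON's native boolean type for score_value.
schema = {
    "type": "object",
    "properties": {
        "score_value": {"type": "boolean"},
        "rationale": {"type": "string"},
    },
    "required": ["score_value", "rationale"],
}

# A conforming response then parses straight to a Python bool,
# with no "True"/"False" string comparison needed downstream.
response = json.loads('{"score_value": true, "rationale": "Explicit refusal."}')
assert isinstance(response["score_value"], bool)
```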
Description
The `SelfAskRefusalScorer` is written to expect a JSON response, and the default prompt in its text specifies a particular schema. Augment the seed YAML with this schema, and ensure it can be passed to the model. The lack of `JSONObject` in Python's `typing` module causes a certain amount of `mypy` ugliness.

This pattern could be rolled out more widely.
Noticed in the course of #1346
Tests and Documentation
No tests currently, since this shouldn't affect any behaviour. Something could probably be added in `test_self_ask_refusal_scorer.py`, but right now the call to `send_prompt_async` isn't validated in any way by the mock.