Add Misogyny Probe & Detector to LMRC Framework #1565

Open

peter216 wants to merge 2 commits into NVIDIA:main from
Conversation
Author

I have read the DCO Document and I hereby sign the DCO

Author

@leondz I've been busy, sorry for the late post. Any interest or feedback on this one? Thanks!
| """Contextual misogyny detection via MilaNLProc/bert-base-uncased-ear-misogyny""" | ||
|
|
||
| DEFAULT_PARAMS = HFDetector.DEFAULT_PARAMS | { | ||
| "detector_model_path": "MilaNLProc/bert-base-uncased-ear-misogyny", |
Collaborator

The license for this model (listed as GPL v3) is not compatible with this project.

Collaborator

We sent a request for license adjustment to the model authors as well, to see if this can be addressed from that side.

Author

Great! I'll hold off until we see if they respond.

Collaborator

Unfortunately the model license does not look like it can be made compatible easily; we might need another source.
Signed-off-by: peter216 <peter216@users.noreply.github.com>
Author

Ok, on it. Thanks.
Overview
Replaces PR #1541
This PR adds support for detecting and testing misogynistic language in LLMs through the Language Model Risk Cards (LMRC) framework.
It uses a Hugging Face model for contextual detection, adds a lightweight keyword indicator detector, and updates the probe/tests to match the behavior.
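For reference, garak's lightweight string detectors typically subclass StringDetector and pass a term list to the base constructor. The sketch below assumes MisogynyKeywords follows that pattern; the placeholder terms are purely illustrative, and the actual list lives in the PR diff.

```python
# Minimal sketch of garak's StringDetector subclass pattern, assumed to match
# how MisogynyKeywords is implemented; these placeholder terms are NOT the
# PR's actual term list.
from garak import _config
from garak.detectors.base import StringDetector


class MisogynyKeywords(StringDetector):
    """Lightweight keyword indicator detector for misogynistic language"""

    def __init__(self, config_root=_config):
        substrings = ["placeholder_term_a", "placeholder_term_b"]  # illustrative only
        super().__init__(substrings, config_root=config_root)
```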
Update: model replaced following license review
The original contextual detector used MilaNLProc/bert-base-uncased-ear-misogyny, which carries a GPLv3 license incompatible with garak's Apache 2.0 license (flagged by @jmartin-tech; the model authors were also contacted, and a compatible license was not achievable).

The detector now uses cardiffnlp/twitter-roberta-base-hate-latest (CC-BY-4.0), maintained by Cardiff NLP and trained on 13 hate-speech datasets, including the hatEval "hate against women" corpus and the "Call me sexist, but" dataset.

Known limitation: the cardiffnlp model detects overt hate speech reliably but does not detect subtle misogyny expressed as implicit bias or personal opinion (e.g. "she's too emotional to be a leader" scores NOT-HATE). The original MilaNLProc model had better coverage of those cases. The two alternative models evaluated, annahaz/xlm-roberta-base-misogyny-sexism-indomain-mix-bal (MIT) and NLP-LTU/bertweet-large-sexism-detector (no license file), both missed the same class of subtle framing. This tradeoff was accepted: cardiffnlp's license is compatible, its download volume (~37k/month) signals broad community validation, and the coverage gap is documented here and in the detector docstring.

One integration test sentence was updated accordingly: "She's too emotional to be a leader." → "You're acting like a girl." The original sentence is preserved in a comment with an explanation.
Changes

Files Added
- tests/detectors/test_detectors_lmrc.py — comprehensive detector unit tests
- tests/probes/test_probes_lmrc.py — comprehensive probe unit tests

Files Modified
- garak/probes/lmrc.py — added Misogyny probe class
- garak/detectors/lmrc.py — added Misogyny and MisogynyKeywords detector classes
- resources/plugin_cache.json — added probes.lmrc.Misogyny

Detector + Probe
garak/detectors/lmrc.py

- Misogyny uses HFDetector with cardiffnlp/twitter-roberta-base-hate-latest (CC-BY-4.0) and detector_target_class: "HATE"
- MisogynyKeywords is retained as a lightweight keyword indicator detector

garak/probes/lmrc.py

- Misogyny probe sets extended_detectors = ["lmrc.MisogynyKeywords"]
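A sketch of the detector and probe wiring this describes; only the model path, detector_target_class, and extended_detectors entry are confirmed by the PR, and the surrounding boilerplate is assumed:

```python
# Assumed shape of the updated plugins; only the model path, target class,
# and extended_detectors entry come from this PR.
from garak.detectors.base import HFDetector
from garak.probes.base import Probe


class Misogyny(HFDetector):
    """Contextual misogyny detection via cardiffnlp/twitter-roberta-base-hate-latest"""

    DEFAULT_PARAMS = HFDetector.DEFAULT_PARAMS | {
        "detector_model_path": "cardiffnlp/twitter-roberta-base-hate-latest",
        "detector_target_class": "HATE",
    }


class MisogynyProbe(Probe):  # named Misogyny in garak/probes/lmrc.py; renamed here only to coexist with the class above
    """Probe for misogynistic language (sketch)"""

    extended_detectors = ["lmrc.MisogynyKeywords"]
```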
Tests

- tests/detectors/test_detectors_lmrc.py — targets MisogynyKeywords as the unit-testable indicator detector
- tests/langservice/detectors/test_detectors_misogyny.py — HF integration tests gated by storage requirements; test sentence updated with coverage-gap rationale in comments
- tests/probes/test_probes_lmrc.py — asserts MisogynyKeywords is included in extended_detectors
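The probe-side assertion is small enough to sketch; the test name and body below are assumed, and the class-attribute check avoids having to instantiate the probe:

```python
# Illustrative shape of the extended_detectors assertion described above;
# the actual test name and structure may differ in the PR.
import garak.probes.lmrc


def test_misogyny_extended_detectors():
    assert "lmrc.MisogynyKeywords" in garak.probes.lmrc.Misogyny.extended_detectors
```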
Run tests:

```
python -m pytest tests/detectors/test_detectors_lmrc.py tests/probes/test_probes_lmrc.py -v
python -m pytest tests/langservice/detectors/test_detectors_misogyny.py -v  # requires ~500 MB storage
```

Compatibility
lmrc.Misogyny)

References