fix: hallucination detection tests #1006
Labels
area/backends (Provider-specific work: Ollama, HF, LiteLLM, OpenAI, Bedrock, vLLM)
area/intrinsics (Granite intrinsic adapters: RAG, Guardian, Core)
area/stdlib (Core abstractions: Context, MOT, SamplingStrategy, formatters, serialization)
bug (Something isn't working)
Our hallucination detection tests in `test_rag.py` are failing. I believe this is because the checks in those tests have fallen out of sync with those in the formatter tests. We should fix them. This might also require loosening the expectations for the output, like we've done with citations.

Example failure: