BBQ
Emerging; 1 paper using it
First seen: 2026
The 'BBQ' dataset is used to evaluate how vulnerable large language models (LLMs) are to giving misleading responses when presented with fabricated evidence.