RMCBench

Emerging

3 papers using it

First seen: 2024

RMCBench is a benchmark of 473 prompts for evaluating how well Large Language Models resist generating malicious code in text-to-code and code-to-code scenarios.


Papers using RMCBench (3)

Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models (2025)

Smoke and Mirrors: Jailbreaking LLM-based Code Generation via Implicit Malicious Prompts (2025)

RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code (2024)