WMDP


The Weapons of Mass Destruction Proxy (WMDP) benchmark is a multiple-choice benchmark covering dual-use knowledge in biology, chemistry, and cybersecurity. It serves as a proxy measure of how much a model's knowledge could help malicious actors design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons.

Methodology

Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: biology, chemistry, healthcare, safety. Language: en. Verified by llm-stats: no.
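
The metadata above gives a maximum score of 1, while the leaderboard reports percentages. The following is a minimal sketch of how a multiple-choice score in that range could be computed and displayed, assuming the score is plain accuracy (the fraction of questions answered correctly). The source does not state the scoring rule, so that is an assumption, and the function names multiple_choice_accuracy and as_percent, along with the placeholder answer letters, are hypothetical and not part of any llm-stats or WMDP tooling.

```python
from typing import Sequence


def multiple_choice_accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the fraction of predictions matching the answer key (0.0 to 1.0).

    Assumption: the benchmark score is simple accuracy over answer letters.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Compare answer letters case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


def as_percent(score: float) -> str:
    """Format a score on a 0-to-1 scale as a percentage with one decimal place."""
    return f"{score * 100:.1f}%"


# Placeholder letters only; these are not real benchmark items or results.
example_predictions = ["A", "C", "B", "D", "A"]
example_answers = ["A", "C", "D", "D", "A"]
example_score = multiple_choice_accuracy(example_predictions, example_answers)
print(example_score, as_percent(example_score))
```

Under this assumption, a score of 0.84 on the 0-to-1 scale would be displayed as 84.0%, which matches the format used in the leaderboard.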

Leaderboard

  1. Grok-4.1 Thinking: 84.0% (self-reported, source: llm-stats)