Artifacts Bench
coding
Artifacts Bench evaluates a model's ability to generate visual code artifacts from natural-language requests, measuring the quality of the resulting interactive and visual front-end outputs.
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: code, frontend_development. Language: en. Verified by llm-stats: no.
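For readers who want to consume these fields programmatically, the following is a minimal sketch of parsing the metadata line into a dictionary. It assumes the "Key: value." layout shown above; the function name `parse_metadata` and the variable names are hypothetical and are not part of any llm-stats API.

```python
def parse_metadata(line: str) -> dict[str, str]:
    """Split a 'Key: value. Key: value.' string into a dict (assumed layout)."""
    fields: dict[str, str] = {}
    # Drop the trailing period, then split into individual "Key: value" parts.
    for part in line.rstrip(".").split(". "):
        key, sep, value = part.partition(": ")
        if sep:  # skip fragments that carry no "Key: value" pair
            fields[key.strip()] = value.strip()
    return fields


# The metadata fields as they appear in this entry.
line = (
    "Modality: text. Max score: 1. Categories: code, frontend_development. "
    "Language: en. Verified by llm-stats: no."
)

meta = parse_metadata(line)
categories = [c.strip() for c in meta["Categories"].split(",")]
max_score = float(meta["Max score"])
verified = meta["Verified by llm-stats"] == "yes"
```

Here `categories` becomes a list of category names, `max_score` a float, and `verified` a boolean, which makes the entry easier to filter or compare against other benchmark records.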