MultiPL-E HumanEval
MultiPL-E is a scalable and extensible approach to benchmarking neural code generation that translates unit-test-driven code generation benchmarks into multiple programming languages. It extends the HumanEval benchmark to 18 additional programming languages, which makes it possible to evaluate code generation models across diverse programming paradigms and to see how well they generalize programming knowledge across language boundaries.
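To make the idea of translating unit tests concrete, here is a minimal sketch of how a single Python assertion of the form `assert f(args) == expected` might be rewritten for another language. It is not the MultiPL-E implementation: the helper names `lua_literal` and `translate_assert` are hypothetical, Lua is picked only as an illustrative target, and only scalar literals are handled (tables are omitted because `==` compares them by reference in Lua).

```python
import ast


def lua_literal(node: ast.expr) -> str:
    """Render a scalar Python literal as Lua source (illustrative subset only)."""
    # Negative numbers parse as a unary minus applied to a constant.
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return "-" + lua_literal(node.operand)
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, bool):  # check bool before int: bool is an int subclass
            return "true" if value else "false"
        if value is None:
            return "nil"
        if isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            return f'"{escaped}"'
        if isinstance(value, (int, float)):
            return repr(value)
    raise ValueError(f"unsupported literal: {ast.dump(node)}")


def translate_assert(py_assert: str) -> str:
    """Translate `assert f(args) == expected` into a Lua assertion string."""
    stmt = ast.parse(py_assert).body[0]
    if not isinstance(stmt, ast.Assert):
        raise ValueError("expected an assert statement")
    test = stmt.test
    is_simple_equality = (
        isinstance(test, ast.Compare)
        and len(test.ops) == 1
        and isinstance(test.ops[0], ast.Eq)
        and isinstance(test.left, ast.Call)
        and isinstance(test.left.func, ast.Name)
        and not test.left.keywords
    )
    if not is_simple_equality:
        raise ValueError("only `assert f(args) == expected` is supported")
    call = test.left
    args = ", ".join(lua_literal(arg) for arg in call.args)
    expected = lua_literal(test.comparators[0])
    return f"assert({call.func.id}({args}) == {expected})"


if __name__ == "__main__":
    print(translate_assert("assert candidate(2, -3) == -1"))
    print(translate_assert('assert candidate("abc") == True'))
```

Working from the parsed syntax tree rather than from the raw text keeps the translation independent of Python's surface formatting, and any construct outside the supported subset raises an error instead of producing a silently wrong test.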
Methodology
Imported from llm-stats public benchmark metadata. Modality: text. Max score: 1. Categories: general, language. Language: en. Verified by llm-stats: no.