GPT-3.5 Turbo
The latest GPT-3.5 Turbo model. It responds in requested formats more accurately and fixes a bug that caused a text encoding issue in function calls for non-English languages.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| DROP | 70.2% | self-reported llm-stats |
| GPQA | 30.8% | self-reported llm-stats |
| HumanEval | 68.0% | self-reported llm-stats |
| MATH | 43.1% | self-reported llm-stats |
| MathVista | 0.0% | self-reported llm-stats |
| MGSM | 56.3% | self-reported llm-stats |
| MMLU | 69.8% | self-reported llm-stats |
| MMMU | 0.0% | self-reported llm-stats |