GPT-3.5 Turbo

The latest GPT-3.5 Turbo model. It responds in requested formats more accurately and fixes a bug that caused a text-encoding issue in function calls for non-English languages.

Benchmark results

Benchmark    Score    Tags             Source
DROP         70.2%    self-reported    llm-stats
GPQA         30.8%    self-reported    llm-stats
HumanEval    68.0%    self-reported    llm-stats
MATH         43.1%    self-reported    llm-stats
MathVista    0.0%     self-reported    llm-stats
MGSM         56.3%    self-reported    llm-stats
MMLU         69.8%    self-reported    llm-stats
MMMU         0.0%     self-reported    llm-stats