Phi 4 Mini
Phi 4 Mini Instruct is a lightweight, 3.8B-parameter open model built on synthetic data and filtered web data, with a focus on high-quality, reasoning-dense content. It supports a 128K-token context length and is tuned for instruction adherence and safety through supervised fine-tuning and direct preference optimization.
Benchmark results
| Benchmark | Score | Tags |
|---|---|---|
| ARC-C | 83.7% | self-reported, llm-stats |
| Arena Hard | 32.8% | self-reported, llm-stats |
| BIG-Bench Hard | 70.4% | self-reported, llm-stats |
| BoolQ | 81.2% | self-reported, llm-stats |
| GPQA | 25.2% | self-reported, llm-stats |
| GSM8k | 88.6% | self-reported, llm-stats |
| HellaSwag | 69.1% | self-reported, llm-stats |
| MATH | 64.0% | self-reported, llm-stats |
| MGSM | 63.9% | self-reported, llm-stats |
| MMLU | 67.3% | self-reported, llm-stats |
| MMLU-Pro | 52.8% | self-reported, llm-stats |
| Multilingual MMLU | 49.3% | self-reported, llm-stats |
| OpenBookQA | 79.2% | self-reported, llm-stats |
| PIQA | 77.6% | self-reported, llm-stats |
| Social IQa | 72.5% | self-reported, llm-stats |
| TruthfulQA | 66.4% | self-reported, llm-stats |
| Winogrande | 67.0% | self-reported, llm-stats |
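
For readers who want to work with these numbers programmatically, the following is a minimal sketch that copies the scores from the table above into a dictionary, ranks them, and averages a subset. The helper names (`ranked`, `mean_score`) and the `MATH_GROUP` grouping are hypothetical, chosen only for illustration; they are not part of any published tooling. An unweighted mean across benchmarks with different metrics and difficulty is at best a rough summary.

```python
# Scores copied from the benchmark table (percent, self-reported).
SCORES = {
    "ARC-C": 83.7,
    "Arena Hard": 32.8,
    "BIG-Bench Hard": 70.4,
    "BoolQ": 81.2,
    "GPQA": 25.2,
    "GSM8k": 88.6,
    "HellaSwag": 69.1,
    "MATH": 64.0,
    "MGSM": 63.9,
    "MMLU": 67.3,
    "MMLU-Pro": 52.8,
    "Multilingual MMLU": 49.3,
    "OpenBookQA": 79.2,
    "PIQA": 77.6,
    "Social IQa": 72.5,
    "TruthfulQA": 66.4,
    "Winogrande": 67.0,
}

# Hypothetical grouping, assumed here for illustration only.
MATH_GROUP = ["GSM8k", "MATH", "MGSM"]


def ranked(scores):
    """Return (benchmark, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def mean_score(scores, names):
    """Return the unweighted mean score over the named benchmarks."""
    return sum(scores[n] for n in names) / len(names)


if __name__ == "__main__":
    # Print the three highest-scoring benchmarks, then the math-group mean.
    for name, score in ranked(SCORES)[:3]:
        print(f"{name}: {score:.1f}%")
    print(f"Math-group mean: {mean_score(SCORES, MATH_GROUP):.1f}%")
```

Keeping the scores in a plain dictionary makes it easy to swap in a different grouping or to compare against another model's self-reported numbers.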