Granite 3.3 8B Base

Granite-3.3-8B-Base is a decoder-only language model with a 128K-token context window. It improves on Granite-3.1-8B-Base by adding support for Fill-in-the-Middle (FIM) through specialized tokens, which lets the model generate content conditioned on both a prefix and a suffix. This makes it well suited to code completion tasks.
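To illustrate how a FIM prompt is typically assembled, here is a minimal sketch in Python. It assumes the common prefix-suffix-middle layout, and the sentinel strings `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` are placeholders rather than confirmed values for this model; the actual special tokens should be read from the model's tokenizer. The helper names `FimTokens`, `build_fim_prompt`, and `splice_completion` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FimTokens:
    # ASSUMPTION: placeholder sentinel strings. Check the tokenizer's
    # special tokens for the values this model actually expects.
    prefix: str = "<fim_prefix>"
    suffix: str = "<fim_suffix>"
    middle: str = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str, tokens: FimTokens = FimTokens()) -> str:
    """Lay out the prompt as: prefix sentinel, prefix text, suffix sentinel,
    suffix text, middle sentinel. The model is then expected to generate
    the missing middle after the final sentinel."""
    return f"{tokens.prefix}{prefix}{tokens.suffix}{suffix}{tokens.middle}"


def splice_completion(prefix: str, middle: str, suffix: str) -> str:
    """Reassemble the full text once the model has produced the middle."""
    return prefix + middle + suffix


# Code before and after the gap the model should fill.
prefix = "def add(a, b):\n    return "
suffix = "\n\nprint(add(1, 2))\n"

prompt = build_fim_prompt(prefix, suffix)

# A hand-written stand-in for the model's output, used only to show the splice.
completed = splice_completion(prefix, "a + b", suffix)
```

In practice, `prompt` would be tokenized and passed to the model's generation call, and the generated text would take the place of the hand-written `"a + b"` above.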

Benchmark results

Benchmark       Score   Tags           Source
AGIEval         49.3%   self-reported  llm-stats
AIME 2024       81.2%   self-reported  llm-stats
AlpacaEval 2.0  62.7%   self-reported  llm-stats
ARC-C           50.8%   self-reported  llm-stats
Arena Hard      57.6%   self-reported  llm-stats
AttaQ           88.5%   self-reported  llm-stats
BIG-Bench Hard  69.1%   self-reported  llm-stats
DROP            36.1%   self-reported  llm-stats
GSM8k           59.0%   self-reported  llm-stats
HellaSwag       80.1%   self-reported  llm-stats
HumanEval       89.7%   self-reported  llm-stats
HumanEval+      86.1%   self-reported  llm-stats
IFEval          74.8%   self-reported  llm-stats
MATH-500        69.0%   self-reported  llm-stats
MMLU            63.9%   self-reported  llm-stats
NQ              36.5%   self-reported  llm-stats
PopQA           26.2%   self-reported  llm-stats
TriviaQA        78.2%   self-reported  llm-stats
TruthfulQA      52.1%   self-reported  llm-stats
Winogrande      74.4%   self-reported  llm-stats