We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
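
To make the "deployable on a phone" claim concrete, here is a minimal sketch of running phi-3-mini locally through Hugging Face transformers. The model ID `microsoft/Phi-3-mini-4k-instruct`, the generation settings, and the prompt are illustrative assumptions, not details taken from the report.

```python
# Minimal sketch: loading and prompting phi-3-mini with Hugging Face transformers.
# Model ID and generation settings are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed published checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the memory footprint small
    device_map="auto",          # places weights on GPU if present, else CPU
)  # older transformers versions may additionally need trust_remote_code=True

messages = [{"role": "user", "content": "Explain why small language models can run on a phone."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For actual on-device use, a quantized build (for example a 4-bit variant served through a mobile runtime) would be the more realistic path than full half-precision weights, but the flow above shows the basic load-prompt-generate loop.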