We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
Recent articles
- Two publishers and three authors fail to understand what "vibe coding" means - 1st May 2025
- Understanding the recent criticism of the Chatbot Arena - 30th April 2025
- Qwen 3 offers a case study in how to effectively release a model - 29th April 2025