The largest model in the PaLM 2 family, PaLM 2-L, is significantly smaller than the largest PaLM model but uses more training compute. Our evaluation results show that PaLM 2 models significantly outperform PaLM on a variety of tasks, including natural language generation, translation, and reasoning. These results suggest that model scaling is not the only way to improve performance. Instead, performance can be unlocked by meticulous data selection and efficient architecture/objectives. Moreover, a smaller but higher quality model significantly improves inference efficiency, reduces serving cost, and enables the model’s downstream application for more applications and users.
— PaLM 2 Technical Report, PDF
Recent articles
- A selfish personal argument for releasing code as Open Source - 24th January 2025
- Anthropic's new Citations API - 24th January 2025
- Six short video demos of LLM and Datasette projects - 22nd January 2025