Think before you speak: Training Language Models With Pause Tokens. Another example of how much low-hanging fruit remains to be discovered in basic Large Language Model research: this team from Carnegie Mellon and Google Research notes that, since LLMs run their neural networks once for each token of input and output, inserting “pause” tokens that don’t output anything at all gives them extra opportunities to “think” about their output.
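To make the idea concrete, here's a rough sketch of the token bookkeeping involved. This is my own toy illustration, not the paper's code: the `<pause>` string, the helper names and the choice to append the pauses after the prompt are all assumptions, and no real model is involved. It only shows that each pause token adds one more position for the network to process, and that pause tokens are dropped from whatever the reader eventually sees.

```python
# Toy illustration only: no real model or tokenizer is used here.
PAUSE = "<pause>"  # hypothetical token string, assumed for this sketch


def add_pauses(prompt_tokens, num_pauses):
    """Return a copy of the prompt with num_pauses pause tokens appended."""
    return list(prompt_tokens) + [PAUSE] * num_pauses


def strip_pauses(tokens):
    """Drop pause tokens so they never appear in the visible text."""
    return [t for t in tokens if t != PAUSE]


prompt = ["What", "is", "2", "+", "2", "?"]
padded = add_pauses(prompt, 3)

# Each extra position is one more pass through the network before answering.
extra_positions = len(padded) - len(prompt)
print(padded)
print(extra_positions)
print(strip_pauses(padded) == prompt)
```

As the paper's title suggests, the model has to be trained with these tokens for the extra positions to be useful, which this sketch doesn't attempt to show.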