Wednesday, 4th October 2023
Think before you speak: Training Language Models With Pause Tokens. Another example of how much low hanging fruit remains to be discovered in basic Large Language Model research: this team from Carnegie Mellon and Google Research note that, since LLMs get to run their neural networks once for each token of input and output, inserting “pause” tokens that don’t output anything at all actually gives them extra opportunities to “think” about their output. # 4:23 pm
An Interactive Intro to CRDTs (via) Superb interactive essay by Jake Lazaroff, providing a very clear explanation of how the fundamental mechanisms behind CRDTs (Conflict-free Replicated Data Types) work. The interactive explanatory demos are very neatly designed and a lot of fun to play with. # 3:10 pm
Translating Latin demonology manuals with GPT-4 and Claude (via) UC Santa Cruz history professor Benjamin Breen puts LLMs to work on historical texts. They do an impressive job of translating flaky OCRd text from 1599 Latin and 1707 Portuguese.
“It’s not about getting the AI to replace you. Instead, it’s asking the AI to act as a kind of polymathic research assistant to supply you with leads.” # 1:49 am