Petals (via) The challenge with large language models in the same scale ballpark as GPT-3 is that they’re large—really large. Far too big to run on a single machine at home. Petals is a fascinating attempt to address that problem: it works a little bit like BitTorrent, in that each user of Petal runs a subset of the overall language model on their machine and participates in a larger network to run inference across potentially hundreds of distributed GPUs. I tried it just now in Google Colab and it worked exactly as advertised, after downloading an 8GB subset of the 352GB BLOOM-176B model.
Recent articles
- Datasette Enrichments: a new plugin framework for augmenting your data - 1st December 2023
- llamafile is the new best way to run a LLM on your own computer - 29th November 2023
- Prompt injection explained, November 2023 edition - 27th November 2023
- I'm on the Newsroom Robots podcast, with thoughts on the OpenAI board - 25th November 2023
- Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 22nd November 2023
- Deciphering clues in a news article to understand how it was reported - 22nd November 2023