Simon Willison’s Weblog

4 items tagged “vicuna”


Let’s be bear or bunny

The Machine Learning Compilation group (MLC) are my favourite team of AI researchers at the moment.

[... 599 words]

LLaVA: Large Language and Vision Assistant. Yet another multi-modal model combining a vision model (pre-trained CLIP ViT-L/14) and a LLaMA derivative model (Vicuna). The results I get from their demo are even more impressive than MiniGPT-4. Also includes a new training dataset, LLaVA-Instruct-150K, derived from GPT-4 and subject to the same warnings about the OpenAI terms of service. 19th April 2023, 1:14 am

MiniGPT-4. An incredible project with a poorly chosen name. A team from King Abdullah University of Science and Technology in Saudi Arabia combined Vicuna-13B (a model fine-tuned on top of Facebook’s LLaMA) with the BLIP-2 vision-language model to create a model that can conduct ChatGPT-style conversations around an uploaded image. The demo is very impressive, and the weights are available to download: 45MB for MiniGPT-4, but you’ll need the much larger Vicuna and LLaMA weights as well. 17th April 2023, 2:21 pm

Web LLM runs the vicuna-7b Large Language Model entirely in your browser, and it’s very impressive

A month ago I asked “Could you train a ChatGPT-beating model for $85,000 and run it in a browser?” $85,000 was a hypothetical training cost for LLaMA 7B plus Stanford Alpaca. “Run it in a browser” was based on the fact that Web Stable Diffusion runs a 1.9GB Stable Diffusion model in a browser, so maybe it’s not such a big leap to run a small Large Language Model there as well.

[... 2276 words]