Simon Willison’s Weblog


Posts tagged llms, mlc in Aug, 2023


WebLLM supports Llama 2 70B now. The WebLLM project from MLC uses WebGPU to run large language models entirely in the browser. They recently added support for Llama 2, including Llama 2 70B, the largest and most powerful model in that family.

To my astonishment, this worked! I used an M2 Mac with 64GB of RAM and Chrome Canary, and it downloaded many GBs of data... but it worked, and spat out tokens at a slow but respectable rate of 3.25 tokens/second.

# 30th August 2023, 2:41 pm / ai, webassembly, generative-ai, llama, llms, mlc, webgpu

llm-mlc (via) My latest plugin for LLM adds support for models that use the MLC Python library—which is the first library I’ve managed to get to run Llama 2 with GPU acceleration on my M2 Mac laptop.
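A minimal sketch of what using the plugin looks like, following LLM's standard plugin workflow. The model name below is illustrative, not necessarily the exact alias the plugin registers; the plugin also has its own setup steps for downloading MLC model weights, which are omitted here.

```shell
# Install the plugin into an existing LLM installation
# (llm install is LLM's standard plugin install command)
llm install llm-mlc

# Run a prompt against an MLC-compiled model.
# "Llama-2-7b-chat" is a hypothetical model alias for illustration;
# check `llm models` for the names actually registered on your machine.
llm -m Llama-2-7b-chat 'five great names for a pet pelican'
```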

# 12th August 2023, 5:33 am / plugins, projects, ai, generative-ai, llms, mlc, llm
