experimental-phi3-webgpu (via) Run Microsoft’s excellent Phi-3 model directly in your browser, using WebGPU so didn’t work in Firefox for me, just in Chrome.
It fetches around 2.1GB of data into the browser cache on first run, but then gave me decent quality responses to my prompts running at an impressive 21 tokens a second (M2, 64GB).
I think Phi-3 is the highest quality model of this size, so it’s a really good fit for running in a browser like this.
Recent articles
- My AI/LLM predictions for the next 1, 3 and 6 years, for Oxide and Friends - 10th January 2025
- Weeknotes: Starting 2025 a little slow - 4th January 2025
- I still don't think companies serve you ads based on spying through your microphone - 2nd January 2025