<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: stanford</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/stanford.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-09-24T22:15:03+00:00</updated><author><name>Simon Willison</name></author><entry><title>Quoting Stanford CS221 Autumn 2025</title><link href="https://simonwillison.net/2025/Sep/24/stanford/#atom-tag" rel="alternate"/><published>2025-09-24T22:15:03+00:00</published><updated>2025-09-24T22:15:03+00:00</updated><id>https://simonwillison.net/2025/Sep/24/stanford/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://stanford-cs221.github.io/autumn2025/assignments/hw1_foundations/index.html"&gt;&lt;p&gt;[2 points] &lt;strong&gt;Learn basic NumPy operations with an AI tutor!&lt;/strong&gt; Use an AI chatbot (e.g., ChatGPT, Claude, Gemini, or Stanford AI Playground) to teach yourself how to do basic vector and matrix operations in NumPy (import numpy as np). AI tutors have become exceptionally good at creating interactive tutorials, and this year in CS221, we're testing how they can help you learn fundamentals more interactively than traditional static exercises.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://stanford-cs221.github.io/autumn2025/assignments/hw1_foundations/index.html"&gt;Stanford CS221 Autumn 2025&lt;/a&gt;, Problem 1: Linear Algebra&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/computer-science"&gt;computer-science&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/education"&gt;education&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/numpy"&gt;numpy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="computer-science"/><category term="education"/><category term="python"/><category term="stanford"/><category term="ai"/><category term="numpy"/><category term="generative-ai"/><category term="llms"/></entry><entry><title>GPUs Go Brrr</title><link href="https://simonwillison.net/2024/May/13/gpus-go-brrr/#atom-tag" rel="alternate"/><published>2024-05-13T04:08:46+00:00</published><updated>2024-05-13T04:08:46+00:00</updated><id>https://simonwillison.net/2024/May/13/gpus-go-brrr/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://hazyresearch.stanford.edu/blog/2024-05-12-tk"&gt;GPUs Go Brrr&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Fascinating, detailed low-level notes on how to get the most out of NVIDIA's H100 GPUs (currently selling for around $40,000 apiece) from the research team at Stanford who created FlashAttention, among other things.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The swizzled memory layouts are flat-out incorrectly documented, which took considerable time for us to figure out.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=40337936"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nvidia"&gt;nvidia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpus"&gt;gpus&lt;/a&gt;&lt;/p&gt;



</summary><category term="stanford"/><category term="ai"/><category term="nvidia"/><category term="gpus"/></entry><entry><title>Train and run Stanford Alpaca on your own machine</title><link href="https://simonwillison.net/2023/Mar/16/train-and-run-stanford-alpaca-on-your-own-machine/#atom-tag" rel="alternate"/><published>2023-03-16T16:10:39+00:00</published><updated>2023-03-16T16:10:39+00:00</updated><id>https://simonwillison.net/2023/Mar/16/train-and-run-stanford-alpaca-on-your-own-machine/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://replicate.com/blog/replicate-alpaca"&gt;Train and run Stanford Alpaca on your own machine&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The team at Replicate managed to train their own copy of Stanford’s Alpaca—a fine-tuned version of LLaMA that can follow instructions like ChatGPT. Here they provide step-by-step instructions for recreating Alpaca yourself—running the training needs one or more A100s for a few hours, which you can rent through various cloud providers.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replicate"&gt;replicate&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fine-tuning"&gt;fine-tuning&lt;/a&gt;&lt;/p&gt;



</summary><category term="stanford"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="replicate"/><category term="fine-tuning"/></entry><entry><title>Stanford Alpaca, and the acceleration of on-device large language model development</title><link href="https://simonwillison.net/2023/Mar/13/alpaca/#atom-tag" rel="alternate"/><published>2023-03-13T19:19:09+00:00</published><updated>2023-03-13T19:19:09+00:00</updated><id>https://simonwillison.net/2023/Mar/13/alpaca/#atom-tag</id><summary type="html">
    &lt;p&gt;On Saturday 11th March I wrote about how &lt;a href="https://simonwillison.net/2023/Mar/11/llama/"&gt;Large language models are having their Stable Diffusion moment&lt;/a&gt;. Today is Monday. Let's look at what's happened in the past three days.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Later on Saturday: Artem Andreenko reports that &lt;code&gt;llama.cpp&lt;/code&gt; can &lt;a href="https://twitter.com/miolini/status/1634982361757790209"&gt;run the 4-bit quantized 7B LLaMA language model on a 4GB RaspberryPi&lt;/a&gt; - at 10 seconds per token, but still hugely impressive.&lt;/li&gt;
&lt;li&gt;Sunday 12th March: &lt;a href="https://twitter.com/cocktailpeanut"&gt;cocktailpeanut&lt;/a&gt; releases &lt;a href="https://cocktailpeanut.github.io/dalai/"&gt;Dalai&lt;/a&gt;, a "dead simple way to run LLaMA on your computer": &lt;code&gt;npx dalai llama&lt;/code&gt; and &lt;code&gt;npx dalai serve&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;13th March (today): Anish Thite reports &lt;code&gt;llama.cpp&lt;/code&gt; running &lt;a href="https://twitter.com/thiteanish/status/1635188333705043969"&gt;on a Pixel 6 phone&lt;/a&gt; (26 seconds per token). &lt;strong&gt;Update 14th March:&lt;/strong&gt; Now &lt;a href="https://twitter.com/ggerganov/status/1635605532726681600"&gt;1 second per token&lt;/a&gt; on an older Pixel 5!&lt;/li&gt;
&lt;li&gt;Also today: a team at Stanford released &lt;a href="https://crfm.stanford.edu/2023/03/13/alpaca.html"&gt;Alpaca: A Strong Open-Source Instruction-Following Model&lt;/a&gt; - fine-tuned from the LLaMA 7B model.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When I talked about a "Stable Diffusion moment" this is the kind of thing I meant: the moment this stuff is available for people to experiment with, things accelerate.&lt;/p&gt;
&lt;p&gt;I'm going to dive into Alpaca in detail.&lt;/p&gt;
&lt;h4&gt;Stanford's Alpaca&lt;/h4&gt;
&lt;p&gt;Here's the introduction to &lt;a href="https://crfm.stanford.edu/2023/03/13/alpaca.html"&gt;the Alpaca announcement&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We introduce Alpaca 7B, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. Alpaca behaves similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (&amp;lt;600$).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The biggest weakness in the LLaMA models released by Meta research last month is their lack of instruction-tuning.&lt;/p&gt;
&lt;p&gt;A language model is a sentence completion engine. You give it a sequence of words, "The first man on the moon was", and it completes that sentence, hopefully with useful content.&lt;/p&gt;
&lt;p&gt;One of the great innovations from OpenAI was their application of &lt;a href="https://openai.com/research/instruction-following"&gt;instruction tuning&lt;/a&gt; to GPT-3:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;To make our models safer, more helpful, and more aligned, we use an existing technique called reinforcement learning from human feedback (RLHF). On prompts submitted by our customers to the API, our labelers provide demonstrations of the desired model behavior, and rank several outputs from our models. We then use this data to fine-tune GPT-3.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Prior to this, you had to think very carefully about how to construct your prompts. Thanks to instruction tuning you can be a lot more, well, human in the way you interact with the model. "Write me a poem about pandas!" now works as a prompt, instead of "Here is a poem about pandas:".&lt;/p&gt;
&lt;p&gt;The LLaMA models had not been through this process. The &lt;a href="https://github.com/facebookresearch/llama/blob/57b0eb62de0636e75af471e49e2f1862d908d9d8/FAQ.md#2-generations-are-bad"&gt;LLaMA FAQ&lt;/a&gt; acknowledges this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Keep in mind these models are not finetuned for question answering. As such, they should be prompted so that the expected answer is the natural continuation of the prompt. [...] Overall, always keep in mind that models are very sensitive to prompts (particularly when they have not been finetuned).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is an enormous usability problem.&lt;/p&gt;
&lt;p&gt;One of my open questions about LLaMA was how difficult and expensive it would be to fine-tune it such that it could respond better to instructions.&lt;/p&gt;
&lt;p&gt;Thanks to the team at Stanford we now have an answer: 52,000 training samples and $100 of training compute! From their blog post:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Fine-tuning a 7B LLaMA model took 3 hours on 8 80GB A100s, which costs less than $100 on most cloud compute providers.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Something that stuns me about Alpaca is the quality they claim to be able to get from the 7B model - the smallest of the LLaMA models, and the one which has been seen running (albeit glacially slowly) on a RaspberryPi and a mobile phone! Here's one example from their announcement:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/alpaca-email.png" alt="Enter your instruction and press enter: Write an e-mail to congratulate new Stanford admits and mention that you are exited about meeting all of them in person Stanford-Alpaca-7B: An Open-Source Instruction-Following Language Model Dear Stanford Admits, Congratulations on your admission to Stanford! I am so excited to meet all of you in person. As a current student at Stanford, I am looking forward to welcoming you to the Stanford community and showing you all that the university has to offer. I wish you all the best in your future endeavors and I can't wait to see you on campus! Sincerely, Your Name" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I would be impressed to see this from the 65B (largest) LLaMA model - but getting this from 7B is spectacular.&lt;/p&gt;
&lt;h4&gt;Still not for commercial usage&lt;/h4&gt;
&lt;p&gt;I'll quote the Stanford announcement on this in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We emphasize that Alpaca is intended &lt;strong&gt;only for academic research&lt;/strong&gt; and any &lt;strong&gt;commercial use is prohibited&lt;/strong&gt;. There are three factors in this decision: First, Alpaca is based on LLaMA, which has a non-commercial &lt;a href="https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform"&gt;license&lt;/a&gt;, so we necessarily inherit this decision. Second, the instruction data is based OpenAI's text-davinci-003, whose &lt;a href="https://openai.com/policies/terms-of-use"&gt;terms of use&lt;/a&gt; prohibit developing models that compete with OpenAI. Finally, we have not designed adequate safety measures, so Alpaca is not ready to be deployed for general use.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So it's still not something we can use to build commercial offerings - but for personal research and tinkering it's yet another huge leap forwards.&lt;/p&gt;
&lt;h4 id="takeaways"&gt;What does this demonstrate?&lt;/h4&gt;
&lt;p&gt;The license of the LLaMA model doesn't bother me too much. What's exciting to me is what this all proves:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LLaMA itself shows that it's possible to train a GPT-3 class language model using openly available resources. The &lt;a href="https://arxiv.org/abs/2302.13971"&gt;LLaMA paper&lt;/a&gt; includes details of the training data, which is entirely from publicly available sources (which include CommonCrawl, GitHub, Wikipedia, ArXiv and StackExchange).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; shows that you can then use some tricks to run that language model on consumer hardware - apparently anything with 4GB or more of RAM is enough to at least get it to start spitting out tokens!&lt;/li&gt;
&lt;li&gt;Alpaca shows that you can apply fine-tuning with a feasible sized set of examples (52,000) and cost ($100) such that even the smallest of the LLaMA models - the 7B one, which can compress down to a 4GB file with 4-bit quantization - provides results that compare well to cutting edge &lt;code&gt;text-davinci-003&lt;/code&gt; in initial human evaluation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One thing that's worth noting: the Alpaca 7B comparison likely used the full-sized 13.48GB 16-bit floating point 7B model, not the 4GB 4-bit quantized model used by &lt;code&gt;llama.cpp&lt;/code&gt;. I've not yet seen a robust comparison of quality between the two.&lt;/p&gt;
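&lt;p&gt;To see why 4-bit quantization shrinks the file by roughly 4x, here's a toy sketch of the idea in Python. This is my own simplified per-tensor scheme for illustration; &lt;code&gt;llama.cpp&lt;/code&gt; actually quantizes weights in small blocks, each with its own scale factor, to keep the error down:&lt;/p&gt;

```python
import numpy as np

def quantize_4bit(weights):
    """Map float weights to small integers plus one scale factor.

    Toy per-tensor scheme: each weight becomes an integer in [-7, 7],
    which fits in 4 bits, instead of a 16-bit float.
    """
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return q.astype(np.float32) * scale

w = np.array([0.12, -0.5, 0.33, 0.07], dtype=np.float32)
q, scale = quantize_4bit(w)
w_approx = dequantize(q, scale)
# w_approx is close to w, but each weight now needs 4 bits instead of 16
```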
&lt;h4&gt;Exploring the Alpaca training data with Datasette Lite&lt;/h4&gt;
&lt;p&gt;The Alpaca team released the 52,000 fine-tuning instructions they used as &lt;a href="https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json"&gt;a 21.7MB JSON file&lt;/a&gt; in their GitHub repository.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;Datasette Lite&lt;/a&gt; tool has the ability to fetch JSON from GitHub and load it into an in-browser SQLite database. Here's the URL to do that:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?json=https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json"&gt;https://lite.datasette.io/?json=https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This will let you browse the 52,000 examples in your browser.&lt;/p&gt;
&lt;p&gt;But we can do a step better than that: here's a SQL query that runs LIKE queries to search through those examples, considering all three text columns:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt; instruction, input, output &lt;span class="pl-k"&gt;from&lt;/span&gt; alpaca_data
&lt;span class="pl-k"&gt;where&lt;/span&gt; instruction &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; input &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; output &lt;span class="pl-k"&gt;like&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; :search &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;%&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;order by&lt;/span&gt; random()&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'm using &lt;code&gt;order by random()&lt;/code&gt; because why not? It's more fun to explore that way.&lt;/p&gt;
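&lt;p&gt;The same trick works locally with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. A sketch using a couple of made-up rows in place of the real 52,000-record JSON file:&lt;/p&gt;

```python
import sqlite3

# Made-up rows with the same shape as the Alpaca JSON records
rows = [
    {"instruction": "Explain Occam's razor.", "input": "",
     "output": "Prefer the simplest explanation that fits the evidence."},
    {"instruction": "Generate a haiku using the following word:",
     "input": "summer", "output": "Summer tongue awakes"},
]

db = sqlite3.connect(":memory:")
db.execute("create table alpaca_data (instruction text, input text, output text)")
db.executemany(
    "insert into alpaca_data values (:instruction, :input, :output)", rows
)

# The same LIKE query, searching all three text columns at once
sql = """
select instruction, input, output from alpaca_data
where instruction || ' ' || input || ' ' || output like '%' || :search || '%'
order by random()
"""
matches = db.execute(sql, {"search": "occam"}).fetchall()
# SQLite's LIKE is case-insensitive for ASCII, so 'occam' matches "Occam's"
```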
&lt;p&gt;The following link will both load the JSON file and populate and execute that SQL query, plus allow you to change the search term using a form in your browser:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://lite.datasette.io/?json=https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json#/data?sql=select+instruction%2C+input%2C+output+from+alpaca_data%0Awhere+instruction+%7C%7C+%27+%27+%7C%7C+input+%7C%7C+%27+%27+%7C%7C+output+like+%27%25%27+%7C%7C+%3Asearch+%7C%7C+%27%25%27%0Aorder+by+random%28%29&amp;amp;search=occam"&gt;https://lite.datasette.io/?json=https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json#/data?sql=select+instruction%2C+input%2C+output+from+alpaca_data%0Awhere+instruction+%7C%7C+%27+%27+%7C%7C+input+%7C%7C+%27+%27+%7C%7C+output+like+%27%25%27+%7C%7C+%3Asearch+%7C%7C+%27%25%27%0Aorder+by+random%28%29&amp;amp;search=occam&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/alpaca-datasette-lite.jpg" alt="Screenshot of Datasette executing that SQL query, retruning three results that match 'occam'" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4&gt;What's next?&lt;/h4&gt;
&lt;p&gt;This week is likely to be wild. OpenAI are rumored to have a big announcement on Tuesday - possibly GPT-4? And I've heard rumors of announcements from both Anthropic and Google this week as well.&lt;/p&gt;
&lt;p&gt;I'm still more excited about seeing what happens next with LLaMA. Language models on personal devices are arriving so much faster than I thought they would.&lt;/p&gt;
&lt;h4 id="bonus-training-data"&gt;Bonus: The source of that training data? GPT-3!&lt;/h4&gt;
&lt;p&gt;Here's a fascinating detail: Those 52,000 samples they used to fine-tune the model? Those were the result of a prompt they ran against GPT-3 itself! Here's &lt;a href="https://github.com/tatsu-lab/stanford_alpaca/blob/da37bb2ecab37cae022dd07aa3ff861c446fb614/prompt.txt"&gt;the prompt&lt;/a&gt; they used:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;You are asked to come up with a set of 20 diverse task instructions. These task instructions will be given to a GPT model and we will evaluate the GPT model for completing the instructions.

Here are the requirements:
1. Try not to repeat the verb for each instruction to maximize diversity.
2. The language used for the instruction also should be diverse. For example, you should combine questions with imperative instrucitons.
3. The type of instructions should be diverse. The list should include diverse types of tasks like open-ended generation, classification, editing, etc.
2. A GPT language model should be able to complete the instruction. For example, do not ask the assistant to create any visual or audio output. For another example, do not ask the assistant to wake you up at 5pm or set a reminder because it cannot perform any action.
3. The instructions should be in English.
4. The instructions should be 1 to 2 sentences long. Either an imperative sentence or a question is permitted.
5. You should generate an appropriate input to the instruction. The input field should contain a specific example provided for the instruction. It should involve realistic data and should not contain simple placeholders. The input should provide substantial content to make the instruction challenging but should ideally not exceed 100 words.
6. Not all instructions require input. For example, when a instruction asks about some general information, "what is the highest peak in the world", it is not necssary to provide a specific context. In this case, we simply put "&amp;lt;noinput&amp;gt;" in the input field.
7. The output should be an appropriate response to the instruction and the input. Make sure the output is less than 100 words.

List of 20 tasks:
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then they include three random example instructions from &lt;a href="https://github.com/tatsu-lab/stanford_alpaca/blob/da37bb2ecab37cae022dd07aa3ff861c446fb614/seed_tasks.jsonl"&gt;a list of 175&lt;/a&gt; they had prepared by hand. The completed prompt sent to OpenAI would include the above instructions followed by something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;###
1. Instruction: Explain the following idiom to me, and try to give me some examples.
1. Input:
black sheep
1. Output:
Meaning: An outcast. Someone who doesn’t fit in with the rest of the crowd. They take pride in being different. Thinks for themselves and doesn’t care what no one else has to say. They tend to ride their own wave and are usually loners because no one understands them, but its okay because they like it that way.
Example: He’s the black sheep of the family.

###
2. Instruction: Generate a haiku using the following word:
2. Input:
summer
2. Output:
The chill, worming in
Shock, pleasure, bursting within
Summer tongue awakes

###
3. Instruction: Recommend a movie for me to watch during the weekend and explain the reason.
3. Input:
3. Output:
I would recommend the movie "The Shawshank Redemption" because it is an excellent movie that is both moving and inspiring. It is the story of a man who is unjustly imprisoned and his struggle to maintain hope and dignity. It is a great film to watch over the weekend because it will make you think about the human capacity for resilience and hope.

###
4. Instruction:
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;GPT-3 would then fill in the rest. You can try this &lt;a href="https://simonwillison.net/2022/Jun/5/play-with-gpt3/"&gt;in the GPT-3 Playground&lt;/a&gt; to see it in action (paste &lt;a href="https://gist.github.com/simonw/fed0bf6d8237920f2ee42e4fc82c260f"&gt;from here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/tatsu-lab/stanford_alpaca/blob/da37bb2ecab37cae022dd07aa3ff861c446fb614/generate_instruction.py"&gt;the Python script&lt;/a&gt; that assembles that all together.&lt;/p&gt;
&lt;p&gt;They spent $500 on OpenAI credits to assemble the 52,000 examples they used to fine-tune their model.&lt;/p&gt;
&lt;p&gt;As they note in their announcement, generating examples in this way is actually mentioned in the OpenAI &lt;a href="https://openai.com/policies/terms-of-use"&gt;terms of use&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You may not [...] (iii) use the Services to develop foundation models or other large scale models that compete with OpenAI&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There's a related concept to this called &lt;a href="https://ssg.aalto.fi/research/projects/mlsec/model-extraction/"&gt;Model Extraction&lt;/a&gt;, where people build new models that emulate the behaviour of others by firing large numbers of examples through the other model and training a new one based on the results.&lt;/p&gt;
&lt;p&gt;I don't think the way Alpaca was trained quite counts as a classic Model Extraction attack, but it certainly echoes one.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fine-tuning"&gt;fine-tuning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paper-review"&gt;paper-review&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="open-source"/><category term="stanford"/><category term="ai"/><category term="gpt-3"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="fine-tuning"/><category term="llama-cpp"/><category term="paper-review"/></entry><entry><title>Quoting Alpaca: A Strong Open-Source Instruction-Following Model</title><link href="https://simonwillison.net/2023/Mar/13/stanford-alpaca/#atom-tag" rel="alternate"/><published>2023-03-13T18:18:37+00:00</published><updated>2023-03-13T18:18:37+00:00</updated><id>https://simonwillison.net/2023/Mar/13/stanford-alpaca/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://crfm.stanford.edu/2023/03/13/alpaca.html"&gt;&lt;p&gt;We introduce Alpaca 7B, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. Alpaca behaves similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (&amp;lt;600$).&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://crfm.stanford.edu/2023/03/13/alpaca.html"&gt;Alpaca: A Strong Open-Source Instruction-Following Model&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fine-tuning"&gt;fine-tuning&lt;/a&gt;&lt;/p&gt;



</summary><category term="stanford"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="llms"/><category term="fine-tuning"/></entry><entry><title>Weeknotes: Working on my screenplay</title><link href="https://simonwillison.net/2020/May/14/weeknotes-working-my-screenplay/#atom-tag" rel="alternate"/><published>2020-05-14T04:53:46+00:00</published><updated>2020-05-14T04:53:46+00:00</updated><id>https://simonwillison.net/2020/May/14/weeknotes-working-my-screenplay/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm taking an Introduction to Screenwriting course with Adam Tobin at Stanford, and my partial screenplay is due this week. I'm pulling together some scenes that tell the story of the Russian 1917 February Revolution and the fall of the Tsar through the lens of the craftsmen working on the Tsar's last Fabergé egg. So I've not been spending much time on anything else.&lt;/p&gt;

&lt;p&gt;Some brief bullet points for this week's software projects:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Released version 0.1 of &lt;a href="https://github.com/simonw/datasette-media"&gt;datasette-media&lt;/a&gt;, a new plugin that allows Datasette to serve files from disk based on executing a SQL query to find the file to return. I'm building it to help make &lt;a href="https://github.com/dogsheep/photos-to-sqlite"&gt;photos-to-sqlite&lt;/a&gt; more immediately useful.&lt;/li&gt;&lt;li&gt;Released &lt;a href="https://datasette.readthedocs.io/en/stable/changelog.html#v0-42"&gt;Datasette 0.42&lt;/a&gt; with improved (and &lt;a href="https://datasette.readthedocs.io/en/stable/internals.html#database-execute"&gt;now documented&lt;/a&gt;) internal methods to allow plugins to execute read-only SQL queries. I needed these for &lt;code&gt;datasette-media&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Released &lt;a href="https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v2-9"&gt;sqlite-utils 2.9&lt;/a&gt; with new CLI commands &lt;code&gt;sqlite-utils drop-table&lt;/code&gt; and &lt;code&gt;sqlite-utils drop-view&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Released &lt;a href="https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v2-9-1"&gt;sqlite-utils 2.9.1&lt;/a&gt; with a tiny cosmetic improvement: the &lt;a href="https://pypi.org/project/sqlite-utils/"&gt;PyPI project page&lt;/a&gt; now shows project links! See &lt;a href="https://github.com/simonw/til/blob/master/pypi/project-links.md"&gt;this TIL&lt;/a&gt; for details.&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;I've also started adding changelog badges to various projects, showing the latest release version according to GitHub and linking to that project's changelog. &lt;a href="https://github.com/simonw/datasette/blob/master/README.md"&gt;Datasette&lt;/a&gt;, &lt;a href="https://github.com/dogsheep/photos-to-sqlite/blob/master/README.md"&gt;photos-to-sqlite&lt;/a&gt;, &lt;a href="https://github.com/simonw/sqlite-utils/blob/master/README.md"&gt;sqlite-utils&lt;/a&gt; all have these now.&lt;/p&gt;

&lt;h3&gt;TIL this week&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/build-official-docs.md"&gt;Build the official Python documentation locally&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/markdown/converting-to-markdown.md"&gt;Converting HTML and rich-text to Markdown&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/pypi/project-links.md"&gt;Adding project links to PyPI&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/screen-writing"&gt;screen-writing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/til"&gt;til&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="stanford"/><category term="screen-writing"/><category term="datasette"/><category term="weeknotes"/><category term="til"/></entry><entry><title>Weeknotes: Improv at Stanford, planning Datasette Cloud</title><link href="https://simonwillison.net/2020/Jan/14/stanford-planning-datasette-cloud/#atom-tag" rel="alternate"/><published>2020-01-14T00:22:18+00:00</published><updated>2020-01-14T00:22:18+00:00</updated><id>https://simonwillison.net/2020/Jan/14/stanford-planning-datasette-cloud/#atom-tag</id><summary type="html">
    &lt;p&gt;Last week was the first week of the quarter at Stanford - which is called "shopping week" here because students are expected to try different classes to see which ones they are going to stick with.&lt;/p&gt;

&lt;p&gt;I've settled on three classes this quarter: &lt;a href="https://explorecourses.stanford.edu/search?q=taps+103"&gt;Beginning Improvising&lt;/a&gt;, &lt;a href="https://explorecourses.stanford.edu/search?q=DESINST%20240"&gt;Designing Machine Learning&lt;/a&gt; and &lt;a href="https://explorecourses.stanford.edu/search?q=STRAMGT%20353"&gt;Entrepreneurship: Formation of New Ventures&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Beginning Improvising&lt;/strong&gt; is the Stanford improv theater course. It's a big time commitment: three two-hour sessions a week for ten weeks is nearly 60 hours of improv!&lt;/p&gt;

&lt;p&gt;It's already proving to be really interesting though: it turns out the course is a thinly disguised applied psychology course.&lt;/p&gt;

&lt;p&gt;Improv is about creating a creative space for other people to shine. The applications to professional teamwork are obvious and fascinating to me. I'll probably write more about this as the course continues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Designing Machine Learning&lt;/strong&gt; is a class at the Stanford d.School taught by &lt;a href="https://twitter.com/michellercarney"&gt;Michelle Carney&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/emilykathryn/"&gt;Emily Callaghan&lt;/a&gt;. It focuses on multidisciplinary applications of machine learning, mixing together students from many different disciplines around Stanford.&lt;/p&gt;

&lt;p&gt;I took a &lt;a href="https://simonwillison.net/2018/Oct/29/transfer-learning/"&gt;fast.ai deep learning course&lt;/a&gt; last year which gave me a basic understanding of the code side of neural networks, but I'm much more interested in figuring out applications, so this seems like a better fit than a more code-focused course.&lt;/p&gt;

&lt;p&gt;The class started out building some initial models using &lt;a href="https://teachablemachine.withgoogle.com/"&gt;Google's Teachable Machine tool&lt;/a&gt;, which is &lt;em&gt;fascinating&lt;/em&gt;. It lets you train transfer learning models for image, audio and pose recognition entirely in your browser - no data is transferred to Google's servers at all. You can then export those models and use them with a variety of different libraries - I've got them to work with both JavaScript and Python already.&lt;/p&gt;

&lt;p&gt;I'm taking &lt;strong&gt;Entrepreneurship: Formation of New Ventures&lt;/strong&gt; because of the rave reviews I heard from other JSK fellows who took it last quarter. It's a classic case-study business school class: each session features a guest speaker who is a successful entrepreneur, and the class discusses their case for the first two thirds of the session while they listen in - then finds out how well the discussion matched what actually happened.&lt;/p&gt;

&lt;h3&gt;Planning Datasette Cloud&lt;/h3&gt;

&lt;p&gt;Shopping week kept me pretty busy so I've not done much actual development over the past week, but I have started planning out and researching my next major project, which I'm currently calling &lt;em&gt;Datasette Cloud&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Datasette Cloud will be an invite-only hosted SaaS version of &lt;a href="https://datasette.readthedocs.io/"&gt;Datasette&lt;/a&gt;. It's designed to help get news organizations on board with the software without having to talk them through figuring out their own hosting, so I can help them solve real problems and learn more about how the ecosystem should evolve to support them.&lt;/p&gt;

&lt;p&gt;I'd love to be able to run this on serverless hosting platforms like Google Cloud Run or Heroku, but sadly those tools aren't an option for me due to a key problem: I'm trying to build a &lt;em&gt;stateful&lt;/em&gt; service (SQLite databases need to live on a local disk) in 2020.&lt;/p&gt;

&lt;p&gt;I posed this challenge &lt;a href="https://twitter.com/simonw/status/1182077259839991808"&gt;on Twitter&lt;/a&gt; back in October:&lt;/p&gt;

&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="en" dir="ltr"&gt;What&amp;#39;s the easiest way of running a stateful web application these days?&lt;br /&gt;&lt;br /&gt;Stateful as in it supports a process which can accept web requests and is allowed to write to a durable disk&lt;br /&gt;&lt;br /&gt;So not Heroku/Zeit Now/Cloud Run etc&lt;/p&gt;- Simon Willison (@simonw) &lt;a href="https://twitter.com/simonw/status/1182077259839991808?ref_src=twsrc%5Etfw"&gt;October 9, 2019&lt;/a&gt;&lt;/blockquote&gt;

&lt;p&gt;I've been exploring my options since then, and I think I've settled on a decidedly 2010-era way of doing this: I'm going to run my own instances! So I've been exploring hosting Datasette on both AWS Lightsail and Digital Ocean Droplets over the past few months.&lt;/p&gt;

&lt;p&gt;My current plan is to have each Datasette Cloud account run as a Datasette instance in its own Docker container, primarily to ensure filesystem isolation: different accounts must not be able to see each other's database files.&lt;/p&gt;

&lt;p&gt;I started &lt;a href="https://twitter.com/simonw/status/1216468790508015616"&gt;another discussion about this&lt;/a&gt; on Twitter and had several recommendations for &lt;a href="https://docs.traefik.io/"&gt;Traefik&lt;/a&gt; as a load balancer for assigning hostnames to different Docker containers, which is exactly what I need to do.&lt;/p&gt;

&lt;p&gt;So this afternoon I made my way through Digital Ocean's outstanding tutorial &lt;a href="https://www.digitalocean.com/community/tutorials/how-to-use-traefik-as-a-reverse-proxy-for-docker-containers-on-ubuntu-18-04"&gt;How To Use Traefik as a Reverse Proxy for Docker Containers on Ubuntu 18.04&lt;/a&gt; and I think I've convinced myself that this is a smart way forward.&lt;/p&gt;
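&lt;p&gt;The routing approach above can be sketched as a Compose file: Traefik watches the Docker socket and maps each hostname to its container, while each account's SQLite files live in a separate host directory for filesystem isolation. This is a minimal hypothetical sketch using the Traefik 1.x-style labels from that tutorial - the hostname, ports, network name and volume paths are illustrative placeholders, not the actual Datasette Cloud configuration.&lt;/p&gt;

```yaml
# One Datasette container per account, routed by hostname via Traefik.
# All names and paths below are hypothetical placeholders.
version: "3"

services:
  datasette-alice:
    image: datasetteproject/datasette
    command: datasette -p 8001 -h 0.0.0.0 /data/data.db
    volumes:
      # Each account gets its own host directory, so database
      # files from different accounts can never see each other
      - /srv/datasette/alice:/data
    labels:
      # Traefik 1.x-style labels, as used in the DigitalOcean
      # Ubuntu 18.04 tutorial mentioned above
      - traefik.backend=datasette-alice
      - traefik.frontend.rule=Host:alice.example.com
      - traefik.port=8001
    networks:
      - web

networks:
  web:
    external: true
```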

&lt;p&gt;So, mostly a research week but I've got a solid plan for my next steps.&lt;/p&gt;

&lt;h3&gt;This week's Niche Museums&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/90"&gt;Jelly Belly Factory&lt;/a&gt; in Fairfield, CA&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/91"&gt;Bevolo Gas Light Museum&lt;/a&gt; in New Orleans, LA&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/92"&gt;Museo de las Misiones de Baja California&lt;/a&gt; in Loreto&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/93"&gt;Fort Point&lt;/a&gt; in San Francisco, CA&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/94"&gt;Donner Memorial State Park Visitor Center&lt;/a&gt; in Nevada County, CA&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/95"&gt;Anja Community Reserve&lt;/a&gt; in Madagascar&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.niche-museums.com/browse/museums/96"&gt;Palace of Fine Arts&lt;/a&gt; in San Francisco, CA&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;I also finally got around to &lt;a href="https://www.niche-museums.com/map"&gt;implementing a map&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/docker"&gt;docker&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsk"&gt;jsk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digitalocean"&gt;digitalocean&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="stanford"/><category term="docker"/><category term="jsk"/><category term="weeknotes"/><category term="datasette-cloud"/><category term="digitalocean"/></entry><entry><title>Weeknotes: first week of Stanford classes</title><link href="https://simonwillison.net/2019/Sep/30/weeknotes-first-week-stanford/#atom-tag" rel="alternate"/><published>2019-09-30T16:28:12+00:00</published><updated>2019-09-30T16:28:12+00:00</updated><id>https://simonwillison.net/2019/Sep/30/weeknotes-first-week-stanford/#atom-tag</id><summary type="html">
    &lt;p&gt;One of the benefits of &lt;a href="https://simonwillison.net/2019/Sep/10/jsk-fellowship/"&gt;the JSK fellowship&lt;/a&gt; is that I can take classes and lectures at Stanford, on a somewhat ad-hoc basis (I don’t take exams or earn credits).&lt;/p&gt;
&lt;p&gt;With thousands of courses to choose from, figuring out how best to take advantage of this isn’t at all easy - especially since I want to spend a big portion of my time focusing on my fellowship project.&lt;/p&gt;
&lt;p&gt;This week was the first week of classes, which Stanford calls “shopping week” - because students are encouraged to try out lots of different things and literally walk out halfway through a lecture if they decide it’s not for them! Feels really rude to me, but apparently that’s how it works here.&lt;/p&gt;
&lt;p&gt;For this term I’ve settled on four classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Strategic Communications&lt;/strong&gt;, at the Stanford Graduate School of Business. This is an extremely highly regarded course on public speaking and effective written communication. As you might expect from a class on public speaking the lectures themselves have been case studies in how to communicate well. I’ve given dozens of conference talks and I’m already learning a huge amount from this that will help me perform better in the future.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Classical Guitar&lt;/strong&gt;. I’m taking this with three other fellows. It turns out my cheap acoustic guitar (bought on an impulse a couple of years ago from Amazon Prime Now) isn’t the correct instrument for this class (&lt;a href="https://en.wikipedia.org/wiki/Classical_guitar"&gt;Classical Guitars&lt;/a&gt; are nylon stringed and a different shape) but the instructor thinks it will be fine for the moment. Great opportunity to do something musical!&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Biostatistics&lt;/strong&gt;. I want to firm up my fundamental knowledge of statistics, and I figured learning it from the biology department would be much more interesting than the corresponding maths or computer science classes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Media Innovation&lt;/strong&gt;. This is a lunchtime series of guest lectures from different professionals in different parts of the media industry. As such it doesn’t have much homework (wow, Stanford courses have a lot of homework) which makes it a good fit for my schedule, and the variety of speakers look to be really informative.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Combined with the JSK afternoon sessions on Monday, Wednesday and Friday I’ll be on campus every weekday, which will hopefully help me build a schedule that incorporates plenty of useful conversations with people about my project, plus actual time to get some code written.&lt;/p&gt;
&lt;p&gt;… what with all the shopping for classes, I wrote almost no code at all this week!&lt;/p&gt;
&lt;p&gt;I did some experimentation with &lt;a href="http://www.structlog.org/"&gt;structlog&lt;/a&gt;: I have an unfinished module which can write structlog entries to a SQLite database using &lt;a href="https://sqlite-utils.readthedocs.io/"&gt;sqlite-utils&lt;/a&gt; (&lt;a href="https://gist.github.com/simonw/3498fadbc9d8aea3967bdb4cddbf48d8"&gt;here’s a Gist&lt;/a&gt;). I’ve also been messing around with Python threads in a Jupyter notebook as part of ongoing research into &lt;a href="https://github.com/simonw/datasette/issues/569"&gt;smarter connection pooling&lt;/a&gt; for Datasette. Aside from that I’ve been concentrating on figuring out Stanford.&lt;/p&gt;

&lt;h3&gt;Books&lt;/h3&gt;

&lt;p&gt;Stanford classes come with all sorts of required reading, but I’ve also made some progress on &lt;a href="https://abookapart.com/products/just-enough-research"&gt;Just Enough Research&lt;/a&gt; by Erika Hall (&lt;a href="https://simonwillison.net/2019/Sep/20/weeknotes-design-thinking-genome-sqlite/"&gt;mentioned last week&lt;/a&gt;). I’m about half way through and it’s fantastic - really fun to read and packed with useful tips on getting the most out of user interviews and associated techniques. Hopefully I’ll get to start putting it into practice next week!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/music"&gt;music&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/reading"&gt;reading&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/speaking"&gt;speaking&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsk"&gt;jsk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="music"/><category term="reading"/><category term="speaking"/><category term="stanford"/><category term="jsk"/><category term="weeknotes"/></entry><entry><title>JSK Journalism Fellowships names Class of 2019-2020 (and I'm in it!)</title><link href="https://simonwillison.net/2019/May/1/jsk-fellowship/#atom-tag" rel="alternate"/><published>2019-05-01T16:43:54+00:00</published><updated>2019-05-01T16:43:54+00:00</updated><id>https://simonwillison.net/2019/May/1/jsk-fellowship/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jsk.stanford.edu/news-notes/2019/jsk-journalism-fellowships-names-class-of-2019-2020/"&gt;JSK Journalism Fellowships names Class of 2019-2020 (and I&amp;#x27;m in it!)&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
In personal news... I’ve been accepted for a ten-month journalism fellowship at Stanford (starting September)! My work there will involve “Improving the impact of investigative stories by expanding the open-source ecosystem of tools that allows journalists to share the underlying data”.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1123624552867565569"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/journalism"&gt;journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsk"&gt;jsk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/personal-news"&gt;personal-news&lt;/a&gt;&lt;/p&gt;



</summary><category term="data-journalism"/><category term="journalism"/><category term="stanford"/><category term="datasette"/><category term="jsk"/><category term="personal-news"/></entry><entry><title>VectorMagic</title><link href="https://simonwillison.net/2007/Oct/28/vectormagic/#atom-tag" rel="alternate"/><published>2007-10-28T11:46:44+00:00</published><updated>2007-10-28T11:46:44+00:00</updated><id>https://simonwillison.net/2007/Oct/28/vectormagic/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://vectormagic.stanford.edu/"&gt;VectorMagic&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat online tool (with a Flex frontend) for tracing bitmap images into vectors, based on research at the Stanford AI lab.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/flash"&gt;flash&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/flex"&gt;flex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/graphics"&gt;graphics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/images"&gt;images&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/stanford"&gt;stanford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vectormagic"&gt;vectormagic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vectors"&gt;vectors&lt;/a&gt;&lt;/p&gt;



</summary><category term="flash"/><category term="flex"/><category term="graphics"/><category term="images"/><category term="stanford"/><category term="vectormagic"/><category term="vectors"/></entry></feed>