Simon Willison’s Weblog

Entries tagged open-source

Maybe Meta’s Llama claims to be open source because of the EU AI act

I encountered a theory a while ago that one of the reasons Meta insist on using the term “open source” for their Llama models despite the Llama license not actually conforming to the terms of the Open Source Definition is that the EU’s AI act includes special rules for open source models without requiring OSI compliance.

[... 852 words]

A selfish personal argument for releasing code as Open Source

I’m the guest for the most recent episode of the Real Python podcast with Christopher Bailey, talking about Using LLMs for Python Development. We covered a lot of other topics as well—most notably my relationship with Open Source development over the years.

[... 464 words]

Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac

There’s a whole lot of buzz around the new Qwen2.5-Coder Series of open source (Apache 2.0 licensed) LLM releases from Alibaba’s Qwen research team. On first impression it looks like the buzz is well deserved.

[... 697 words]

Interesting ideas in Observable Framework

Mike Bostock, Announcing: Observable Framework:

[... 2,123 words]

Talking about Open Source LLMs on Oxide and Friends

I recorded an episode of the Oxide and Friends podcast on Monday, talking with Bryan Cantrill and Adam Leventhal about Open Source LLMs.

[... 1,995 words]

Financial sustainability for open source projects at GitHub Universe

I presented a ten minute segment at GitHub Universe on Wednesday, ambitiously titled Financial sustainability for open source projects.

[... 2,485 words]

LLM now provides tools for working with embeddings

LLM is my Python library and command-line tool for working with language models. I just released LLM 0.9 with a new set of features that extend LLM to provide tools for working with embeddings.

[... 3,521 words]

Leaked Google document: “We Have No Moat, And Neither Does OpenAI”

SemiAnalysis published something of a bombshell leaked document this morning: Google “We Have No Moat, And Neither Does OpenAI”.

[... 1,073 words]

Thoughts on AI safety in this era of increasingly powerful open source LLMs

This morning, VentureBeat published a story by Sharon Goldman: With a wave of new LLMs, open source AI is having a moment — and a red-hot debate. It covers the explosion in activity around openly available Large Language Models such as LLaMA—a trend I’ve been tracking in my own series LLMs on personal devices—and talks about their implications with respect to AI safety.

[... 782 words]

Working in public

I participated in a panel discussion this week for path to Citus Con, a series of Discord audio events happening in the run-up to Citus Con 2023 later this month.

[... 546 words]

Stanford Alpaca, and the acceleration of on-device large language model development

On Saturday 11th March I wrote about how Large language models are having their Stable Diffusion moment. Today is Monday. Let’s look at what’s happened in the past three days.

[... 2,055 words]

Support open source that you use by paying the maintainers to talk to your team

I think I’ve come up with a novel hack for the challenge of getting your company to financially support the open source projects that it uses: reach out to the maintainers and offer them generous speaking fees for remote talks to your engineering team.

[... 645 words]

Writing better release notes

Release notes are an important part of the open source process. I’ve been thinking about these a lot recently, and I’ve assembled some thoughts on how to do a better job with them.

[... 918 words]

How to build, test and publish an open source Python library

At PyGotham this year I presented a ten minute workshop on how to package up a new open source Python library and publish it to the Python Package Index. Here is the video and accompanying notes, which should make sense even without watching the talk.

[... 2,055 words]

Open source projects: consider running office hours

Back in December I decided to try something new for my Datasette open source project: Datasette Office Hours. The idea is simple: anyone can book a 25 minute conversation with me on a Friday to talk about the project. I’m interested in talking to people who are using Datasette, or who are considering using it, or who just want to have a chat.

[... 786 words]

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

My JSK Fellowship: Building an open source ecosystem of tools for data journalism

I started a new chapter of my career last week: I began a year long fellowship with the John S. Knight Journalism Fellowships program at Stanford.

[... 876 words]

Datasette 0.28—and why master should always be releasable

It’s been quite a while since the last substantial release of Datasette. Datasette 0.27 came out all the way back in January.

[... 1,326 words]

sqlite-utils: a Python library and CLI tool for building SQLite databases

sqlite-utils is a combination Python library and command-line tool I’ve been building over the past six months which aims to make creating new SQLite databases as quick and easy as possible.

[... 1,237 words]

What’s the best way to keep track of changes to a project you’re not directly contributing to on GitHub?

This is what GitHub’s “watch” feature is for: https://help.github.com/articles...

[... 35 words]

Is there an open source (or freely accessible) database of geofence coordinates for common places, such as cities or national parks?

Take a look at Flickr’s openly licensed shapefiles:

[... 59 words]

What are the ways to view the examples without downloading the example files on GitHub?

If you can view the file on raw.github.com you can drop the first dot to view it on rawgithub.com—a free proxy service.

[... 107 words]

How could GitHub improve the password security of its users?

By doing exactly what they’re doing already: adding more sophisticated rate limiting, and preventing users from using common weak passwords.

[... 80 words]

Why doesn’t Quora open-source its search interface?

See my answer to Simon Willison’s answer to How come Quora hasn’t contributed any significant open source tools?

[... 32 words]

What are some good open source projects that VMware is directly part of?

To my knowledge they fund almost all of the development work on RabbitMQ, Redis and the Spring Java framework.

[... 38 words]

How come Quora hasn’t contributed any significant open source tools?

Releasing open source software is a lot of work. You need to extract it from your own proprietary systems, clean it up, document it, release it and then deal with support queries, incoming bug fixes, suggestions and feature requests.

[... 192 words]

For developers and Startup Founders, what software licenses would you prefer to use instead of open source options, if you had unlimited funding?

With unlimited funding... I’d still prefer to build my company on open source licensed components. It’s not about the price (heck, I’d use my unlimited funding to support the projects my company benefited from)—it’s about the freedom to understand exactly how everything works, fix stuff that doesn’t, and benefit from the community of others solving the same kinds of problems.

[... 91 words]

What are the most commonly used or most interesting open-source packages and software?

I’d say the open source browser engines, Gecko (Firefox) and WebKit (Safari, Chrome, iOS, Android) are probably some of the most important and widely used pieces of open source code these days.

[... 51 words]

To become a better developer: read more, or create/contribute to open source projects?

Contribute to an existing project, rather than starting one yourself. There are a bunch of benefits:

[... 231 words]

Is there a free/open-source software source code search engine?

If you want to search through actual code in open source projects, GitHub search is fantastic: https://github.com/search (for example, here’s a search for all Ruby code that mentions oauth: https://github.com/search?q=oaut...)

[... 71 words]