Simon Willison’s Weblog

13 items tagged “tom-macwright”

2024

A warning about tiktoken, BPE, and OpenAI models. Tom MacWright warns that OpenAI's tiktoken Python library has a surprising performance profile: it's superlinear with the length of input, meaning someone could potentially denial-of-service you by sending you a 100,000 character string if you're passing that directly to tiktoken.encode().

There's an open issue about this (now over a year old), so for safety today it's best to truncate on characters before attempting to count or truncate using tiktoken.
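
A minimal sketch of that advice, assuming the tokenizer is passed in as a callable. The `MAX_CHARS` cap and the `count_tokens_safely` name are hypothetical, chosen for illustration rather than taken from tiktoken or from Tom's post:

```python
from typing import Callable, Sequence

# Hypothetical cap: pick a value that comfortably exceeds your real token budget.
MAX_CHARS = 20_000


def count_tokens_safely(
    text: str,
    encode: Callable[[str], Sequence[int]],
    max_chars: int = MAX_CHARS,
) -> int:
    """Truncate on characters first, then give the bounded string to the encoder."""
    bounded = text[:max_chars]  # cheap slice; cost does not depend on the tokenizer
    return len(encode(bounded))
```

With tiktoken installed this could be called as `count_tokens_safely(user_input, tiktoken.get_encoding("cl100k_base").encode)`. The point is only that the encoder never sees more than `max_chars` characters of untrusted input.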

# 21st November 2024, 6:13 am / openai, tom-macwright, security, python

Building technology in startups is all about having the right level of tech debt. If you have none, you’re probably going too slow and not prioritizing product-market fit and the important business stuff. If you get too much, everything grinds to a halt. Plus, tech debt is a “know it when you see it” kind of thing, and I know that my definition of “a bunch of tech debt” is, to other people, “very little tech debt.”

Tom MacWright

# 3rd November 2024, 4:36 pm / technical-debt, tom-macwright

But [LLM assisted programming] does make me wonder whether the adoption of these tools will lead to a form of de-skilling. Not even that programmers will be less skilled, but that the job will drift from the perception and dynamics of a skilled trade to an unskilled trade, with the attendant change - decrease - in pay. Instead of hiring a team of engineers who try to write something of quality and try to load the mental model of what they're building into their heads, companies will just hire a lot of prompt engineers and, who knows, generate 5 versions of the application and A/B test them all across their users.

Tom MacWright

# 12th August 2024, 8:17 pm / ai-assisted-programming, generative-ai, ai, tom-macwright, llms

The first four Val Town runtimes (via) Val Town solves one of my favourite technical problems: how to run untrusted code in a safe sandbox. They're on their fourth iteration of this now, currently using a Node.js application that launches Deno sub-processes using the node-deno-vm npm package and runs code in those, taking advantage of the Deno sandboxing mechanism and terminating processes that take too long in order to protect against while(true) style attacks.
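
The timeout half of that design can be sketched in a few lines. This is a Python analogue using a child Python process, not Val Town's actual Node.js and Deno setup, and it provides no sandboxing on its own. The `run_with_timeout` helper is a hypothetical name:

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_seconds: float = 1.0) -> str:
    """Run code in a child Python process and stop it if it takes too long.

    Illustrates only the time limit; the child is NOT sandboxed.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return "terminated: timed out"
    return result.stdout
```

Passing `"while True: pass"` hits the timeout branch instead of hanging the parent process.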

# 8th February 2024, 6:38 pm / nodejs, deno, javascript, sandboxing, tom-macwright, val-town

2022

Playing with ActivityPub (via) Tom MacWright describes his attempts to build the simplest possible ActivityPub publication for a static site powered by Jekyll. He used Netlify functions to handle incoming subscriptions (storing them in PlanetScale via their Deno API library) and wrote a script which loops through and notifies all of his subscribers every time he publishes something new.

# 10th December 2022, 12:58 am / mastodon, activitypub, deno, tom-macwright

Working with the web platform is dealing with history, with the accumulated matter of quirksmode and good-enough standards. In exchange for the ability to deliver instantly-updating software directly to customers with no middlemen and no installation, you have to absorb a great deal of nearly-useless information that’s entirely about dodging meaningless traps.

Tom MacWright

# 4th March 2022, 4:11 pm / web, tom-macwright

lon lat lon lat lon. Tom MacWright’s definitive guide to the (latitude, longitude) vs. (longitude, latitude) debate. The answer is frustrating: both orders are used by significant software, so there’s no single answer that will satisfy everyone. I’ve recently been mostly won over to the longitude, latitude side, mainly because that’s a better fit for the non-geospatial x, y pattern.
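
One way to keep the order straight in code is to name the fields and convert once at the boundary. The sketch below is illustrative only: `LonLat`, `from_lat_lon` and `to_geojson_point` are hypothetical names. GeoJSON is used as the output because its positions are ordered longitude, then latitude:

```python
from typing import NamedTuple


class LonLat(NamedTuple):
    """Longitude first, mirroring the x, y convention."""

    lon: float  # x: east-west
    lat: float  # y: north-south


def from_lat_lon(lat: float, lon: float) -> LonLat:
    """Convert from a latitude-first source at the boundary."""
    return LonLat(lon=lon, lat=lat)


def to_geojson_point(position: LonLat) -> dict:
    """GeoJSON positions are [longitude, latitude]."""
    return {"type": "Point", "coordinates": [position.lon, position.lat]}
```

Named fields mean a swapped pair shows up as a wrong keyword rather than as a point in the wrong hemisphere.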

# 10th February 2022, 4:32 pm / geospatial, tom-macwright

GitHub Burndown (via) Neat Observable notebook by Tom MacWright—give it a GitHub access token and the name of a repo and it pulls the details of every issue and plots a burndown chart over time, showing how long issues stay open for. The code is worth spending some time with—the way it fetches data from the paginated JSON API is a really great example of using generators with Observable, and the chart itself is a lovely clear example of Observable Plot.
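
The generator pattern translates to other languages too. Here is a rough Python analogue, not the notebook's JavaScript: `fetch_page` is a hypothetical callable that returns one page of results as a list, and an empty list is assumed to mean there are no more pages:

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int], list]) -> Iterator[dict]:
    """Yield items one at a time from a paginated API, fetching pages lazily."""
    page = 1
    while True:
        items = fetch_page(page)
        if not items:  # assumption: an empty page marks the end
            return
        yield from items
        page += 1
```

The caller can loop over `iter_items(...)` as if it were one long list, and later pages are only requested if the loop gets that far.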

# 10th February 2022, 4:29 pm / observable, tom-macwright, github, observable-plot

2021

Serving map tiles from SQLite with MBTiles and datasette-tiles

Working on datasette-leaflet last week re-kindled my interest in using Datasette as a GIS (Geographic Information System) platform. SQLite already has strong GIS functionality in the form of SpatiaLite and datasette-cluster-map is currently the most downloaded plugin. Most importantly, maps are fun!

[... 1,334 words]

2020

And for what? Again - there is a swath of use cases which would be hard without React and which aren’t complicated enough to push beyond React’s limits. But there are also a lot of problems for which I can’t see any concrete benefit to using React. Those are things like blogs, shopping-cart-websites, mostly-CRUD-and-forms-websites. For these things, all of the fancy optimizations are optimizations to get you closer to the performance you would’ve gotten if you just hadn’t used so much technology.

Tom MacWright

# 11th May 2020, 12:03 am / react, tom-macwright

Things I learned about shapefiles building shapefile-to-sqlite

The latest in my series of x-to-sqlite tools is shapefile-to-sqlite. I learned a whole bunch of things about the ESRI shapefile format while building it.

[... 1,073 words]

2019

togeojson (via) Handy JavaScript library and command-line tool for converting KML and GPX to GeoJSON, by Tom MacWright
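
To give a feel for the kind of conversion involved, here is a small Python sketch that turns GPX 1.1 waypoints into a GeoJSON FeatureCollection. It is not togeojson's implementation or API. It handles only `wpt` elements and their names, and the `gpx_waypoints_to_geojson` function is a hypothetical name:

```python
import xml.etree.ElementTree as ET

# Namespace used by GPX 1.1 documents.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def gpx_waypoints_to_geojson(gpx_text: str) -> dict:
    """Convert GPX waypoints to GeoJSON Point features (waypoints only)."""
    root = ET.fromstring(gpx_text)
    features = []
    for wpt in root.findall("gpx:wpt", GPX_NS):
        name = wpt.findtext("gpx:name", default="", namespaces=GPX_NS)
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GPX stores lat/lon attributes; GeoJSON wants [lon, lat].
                    "coordinates": [float(wpt.get("lon")), float(wpt.get("lat"))],
                },
                "properties": {"name": name},
            }
        )
    return {"type": "FeatureCollection", "features": features}
```

Tracks, routes and KML are all left out here, and they are where a real converter does most of its work.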

# 18th January 2019, 11:50 pm / geo, kml, geojson, tom-macwright

2018

Observable Beta (via) Observable just released their beta, and it’s quite something. It’s by Mike Bostock (d3), Jeremy Ashkenas (Backbone, CoffeeScript) and Tom MacWright (Mapbox Studio). The easiest way to describe it is Jupyter notebooks for JavaScript supporting reactive programming—so code is evaluated as you type and you can add interactive widgets (like sliders and canvas views) to construct explorable visualizations on the fly.

# 31st January 2018, 4:46 pm / jupyter, d3, javascript, observable, jeremy-ashkenas, mike-bostock, tom-macwright