Simon Willison’s Weblog

Subscribe

31 items tagged “c”

2024

llm.c (via) Andrej Karpathy implements LLM training—initially for GPT-2, other architectures to follow—in just over 1,000 lines of C on top of CUDA. Includes a tutorial about implementing LayerNorm by porting an implementation from Python. # 9th April 2024, 3:24 pm

Hello World (via) Lennon McLean dives deep down the rabbit hole of what happens when you execute the binary compiled from “Hello world” in C on a Linux system, digging into the details of ELF executables, objdump disassembly, the C standard library, stack frames, null-terminated strings and taking a detour through musl because it’s easier to read than Glibc. # 9th April 2024, 1:06 am

Building and testing C extensions for SQLite with ChatGPT Code Interpreter

I wrote yesterday about how I used Claude and ChatGPT Code Interpreter for simple ad-hoc side quests—in that case, for converting a shapefile to GeoJSON and merging it into a single polygon.

[... 4612 words]

Beej’s Guide to Networking Concepts (via) Beej’s Guide to Network Programming is a legendary tutorial on network programming in C, continually authored and updated by Brian “Beej” Hall since 1995.

This is NOT that. Beej’s Guide to Networking Concepts is brand new—started in March 2023—and illustrates a whole bunch of networking concepts using Python instead of C.

From the forward: “Is it Beej’s Guide to Network Programming in Python? Well, kinda, actually. The C book is more about how C’s (well, Unix’s) network API works. And this book is more about the concepts underlying it, using Python as a vehicle.” # 30th January 2024, 10:08 pm

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.

You (probably) don’t need to learn C.

Ned Batchelder # 24th January 2024, 6:25 pm

2023

jo (via) Neat little C utility (available via brew/apt-get install etc) for conveniently outputting JSON from a shell: “jo -p name=jo n=17 parser=false” will output a JSON object with string, integer and boolean values, and you can nest it to create nested objects. Looks very handy. # 8th October 2023, 5:20 am

TG: Polygon indexing (via) TG is a brand new geospatial library by Josh Baker, author of the Tile38 in-memory spatial server (kind of a geospatial Redis). TG is written in pure C and delivered as a single C file, reminiscent of the SQLite amalgamation.

TG looks really interesting. It implements almost the exact subset of geospatial functionality that I find most useful: point-in-polygon, intersect, WKT, WKB, and GeoJSON—all with no additional dependencies.

The most interesting thing about it is the way it handles indexing. In this documentation Josh describes two approaches he uses to speeding up point-in-polygon and intersection using a novel approach that goes beyond the usual RTree implementation.

I think this could make the basis of a really useful SQLite extension—a lighter-weight alternative to SpatiaLite. # 23rd September 2023, 4:32 am

Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs (via) This tutorial by Rafał Cieślak from 2013 filled in a bunch of gaps in my knowledge about how C works on Linux. # 8th September 2023, 10:05 pm

2022

redbean (via) “redbean makes it possible to share web applications that run offline as a single-file αcτµαlly pδrταblε εxεcµταblε zip archive which contains your assets. All you need to do is download the redbean.com program below, change the filename to .zip, add your content in a zip editing tool, and then change the extension back to .com”.

redbean is implemented as a single C file with a dazzling array of clever tricks—most impressively, the single executable works on Linux, macOS, Windows and various BSDs!

It embeds Lua, and in June last year added SQLite too—so self-contained distributable web applications built with Redbean can now use Lua and SQLite for dynamic scripting. Performance sounds incredible: “redbean can serve 1 million+ gzip encoded responses per second on a cheap personal computer”. # 17th February 2022, 6:01 am

Running C unit tests with pytest (via) Brilliant, detailed tutorial by Gabriele Tornetta on testing C code using pytest, which also doubles up as a ctypes tutorial. There’s a lot of depth here—in addition to exercising C code through ctypes, Gabriele shows how to run each test in a separate process so that segmentation faults don’t fail the entire suite, then adds code to run the compiler as part of the pytest run, and then shows how to use gdb trickery to generate more useful stack traces. # 12th February 2022, 5:14 pm

Mypyc (via) Spotted this in the Black release notes: “Black is now compiled with mypyc for an overall 2x speed-up”. Mypyc is a tool that compiles Python modules (written in a subset of Python) to C extensions—similar to Cython but using just Python syntax, taking advantage of type annotations to perform type checking and type inference. It’s part of the mypy type checking project, which has been using it since 2019 to gain a 4x performance improvement over regular Python. # 30th January 2022, 1:31 am

2021

How to look at the stack with gdb. Useful short tutorial on gdb from first principles. # 24th May 2021, 6:23 pm

cosmopolitan libc (via) “Cosmopolitan makes C a build-once run-anywhere language, similar to Java, except it doesn’t require interpreters or virtual machines be installed beforehand. [...] Instead, it reconfigures stock GCC to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + BIOS with the best possible performance and the tiniest footprint imaginable.” This is a spectacular piece of engineering. # 27th February 2021, 6:02 am

2020

Unravelling `not` in Python (via) Part of a series where Brett Cannon looks at how fundamental Python syntactic sugar works, including a clearly explained dive into the underlying op codes and C implementation. # 27th November 2020, 5:59 pm

CG-SQL (via) This is the toolkit the Facebook Messenger team wrote to bring stored procedures to SQLite. It implements a custom version of the T-SQL language which it uses to generate C code that can then be compiled into a SQLite module. # 22nd October 2020, 6:25 pm

Pikchr. Interesting new project from SQLite creator D. Richard Hipp. Pikchr is a new mini language for describing visual diagrams, designed to be embedded in Markdown documentation. It’s already enabled for the SQLite forum. Implementation is a no-dependencies C library and output is SVG. # 21st October 2020, 4:02 pm

A Compiler Writing Journey (via) Warren Toomey has been writing a self-compiling compiler for a subset of C, and extensively documenting every step of the journey here on GitHub. The result is an extremely high quality free textbook on compiler construction. # 8th January 2020, 3:33 am

2019

Calling C functions from BigQuery with web assembly (via) Google BigQuery lets you define custom SQL functions in JavaScript, and it turns out they expose the WebAssembly.instantiate family of APIs. Which means you can write your UDD in C or Rust, compile it to WebAssembly and run it as part of your query! # 27th October 2019, 5:55 am

2017

A Regular Expression Matcher: Code by Rob Pike, Exegesis by Brian Kernighan (via) Delightfully clear and succinct 30-line C implementation of a regular expression matcher that supports $, ^, . and * operations. # 5th December 2017, 6:36 pm

C is a bit like Latin these days. We no longer write everything in it, but knowing it affords deeper knowledge of more-recent languages.

Norman Wilson # 8th October 2017, 4:03 pm

2012

Do people still write and compile programs from the command line, instead of an IDE? Why or why not?

Being an expert with command line tools gives you super powers.

[... 94 words]

2010

Building a GeoIP server with ZeroMQ. ZeroMQ makes it trivially easy to write a network service in raw C that makes functionality from a C library (in this case the MaxMind GeoIP library) available to clients written in many different client languages. # 9th November 2010, 9:36 am

Mongrel2 is “Self-Hosting”. Zed Shaw’s Mongrel2 is shaping up to be a really interesting project. “A web server simply written in C that loves all languages equally”, the two most interesting new ideas are the ability to handle HTTP, Flash Sockets and WebSockets all on the same port (thanks to an extension to the Mongrel HTTP parser that can identify all three protocols) and the ability to hook Mongrel2 up to the backend servers using either TCP/IP or ZeroMQ. I’m guessing this means Mongrel2 could hold an HTTP request open, fire off some messages and wait for various backends to send messages back to construct the response, making async processing just as easy as a regular blocking request/response cycle. # 17th June 2010, 8:11 pm

Redis weekly update #3—Pub/Sub and more. Redis is now a publish/subscribe server—and it ended up only taking 150 lines of C code since Redis internals were already based on that paradigm. # 30th March 2010, 3:15 pm

2009

I think that what’s particularly hard with C is not the details about pointers, automatic memory management, and so forth, but the fact that C is at the same time so low level and so flexible. So basically if you want to create a large project in C you have to build a number of intermediate layers (otherwise the code will be a complete mess full of bugs and 10 times bigger than required). This continue design exercise of creating additional layers is the hard part about C. You have to get very good at understanding when to write a function or not, when to create a layer of abstraction, and when it’s worth to generalize or when it is an overkill.

Salvatore Sanfilippo # 18th December 2009, 3:50 pm

Mac OS X 10.6 Snow Leopard: the Ars Technica review. The essential review: 23 pages of information-dense but readable goodness. Pretty much everything I know about Mac OS X internals I learnt from reading John Siracusa’s reviews—this one is particularly juice when it gets to Grand Central Dispatch and blocks (aka closures) in C and Objective-C. # 1st September 2009, 7:05 pm

The compiler only pays attention to the semicolons and braces while ignoring the line breaks and indentation, but humans usually only pay attention to the line breaks and indentation while ignoring the semicolons and braces. This gives the code the opportunity to lie about what it’s really doing. Consequently we need to take extra care when writing in C, Java, C++, C# etc.

Elliotte Rusty Harold # 2nd January 2009, 10:26 am

2008

Running C and Python Code on The Web. Adobe are working on a toolchain to compile C code to target the Tamarin VM in Flash. This will allow existing C code (from CPython to Quake) to execute in a safe sandbox in the browser. # 4th July 2008, 8:26 am

2004

Ned Batchelder: Showing C header structure. Using Python to maked other languages less painful # 4th February 2004, 1:19 am

2003

Jonathan Caves: Adventures in Visual C (via) Who’d have thought Microsoft would have all the best in-house bloggers? # 23rd November 2003, 11:59 pm