Code generation vs data driven programming
Via Ned Batchelder, this interview with pragmatic Dave Thomas on code generation closely reflects my own nascent thoughts on the issue:
CGN: What do you think the future is for code generation?
Dave: I think that in the long term the larger code generation efforts, the “application generators,” will become a thing of the past. They are there because the underlying technologies and architectures don’t yet support programming at a high level. But I’m betting that languages such as Java and C++ will in the long term be seen as a curious branch in the evolution of computing. I’m hoping that somewhere out there some bright spark is coming up with a way of letting us write applications expressively and dynamically. Once this happens, the need for these kinds of code generators will diminish.
For example, I rarely (if ever) write a code generator that generates Ruby code: there’s just no need, as Ruby is dynamic enough to let me do what I want without leaving the language.
In the shorter term, though, I think code generators of all kinds will continue to contribute significantly to the industry. Java and C# are both such stifling languages that you need to be able to use code generators to make them effective.
We considered using code generators for our current major project at work, and picked up Jack Herrington’s book on the subject. Reading through it, it became clear that many of the problems that code generators solve can be tackled instead using data driven programming techniques made possible by dynamic languages. Since we had already settled on Python as our implementation language, the need for code generation became far less apparent, and we ended up avoiding it entirely, with the exception of a command line tool for passively generating basic templates for our admin interface.
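As a rough illustration of the data driven approach (a sketch, not code from the project described above): instead of generating one validation routine per record type, the rules live in a plain data structure and a single generic function interprets them at runtime. The `ARTICLE_FIELDS` table and the `validate` function are hypothetical names invented for this example.

```python
# Hypothetical field specification: (name, expected type, required?).
# The field names here are illustrative assumptions only.
ARTICLE_FIELDS = [
    ("title", str, True),
    ("body", str, True),
    ("views", int, False),
]


def validate(record, fields):
    """Return a list of error messages for record, checked against fields."""
    errors = []
    for name, expected_type, required in fields:
        if name not in record:
            # Only complain about absent fields that are marked required.
            if required:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(record[name], expected_type):
            errors.append(f"{name} should be {expected_type.__name__}")
    return errors


if __name__ == "__main__":
    print(validate({"title": "Hello", "body": "World"}, ARTICLE_FIELDS))
    print(validate({"title": 1}, ARTICLE_FIELDS))
```

Supporting a new record type then means writing another table, not generating and maintaining another block of near-identical code.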
If I ever have to work with a less expressive language I’ll certainly consider using a code generator (probably written in Python) to abstract away some of the tedious repetition. As it is, Python’s rich data structures and clean support for introspection provide an excellent alternative.
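To make the introspection point concrete, here is a minimal sketch using the standard library's `inspect` module. It derives a description of a class's fields from its constructor signature at runtime, the kind of information a code generator would otherwise bake into generated source. The `Article` class and the `describe_fields` helper are hypothetical, made up for this example.

```python
import inspect


class Article:
    # Hypothetical model class; the attribute names are assumptions.
    def __init__(self, title, body, views=0):
        self.title = title
        self.body = body
        self.views = views


def describe_fields(cls):
    """List constructor parameters of cls and whether each one is required."""
    signature = inspect.signature(cls.__init__)
    fields = []
    for name, param in signature.parameters.items():
        if name == "self":
            continue
        # A parameter with no default value must be supplied by the caller.
        required = param.default is inspect.Parameter.empty
        fields.append({"name": name, "required": required})
    return fields


if __name__ == "__main__":
    for field in describe_fields(Article):
        print(field)
```

A generic form or admin view could be driven from a description like this without writing any per-class code.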