<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: design-patterns</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/design-patterns.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-02-23T17:43:02+00:00</updated><author><name>Simon Willison</name></author><entry><title>Writing about Agentic Engineering Patterns</title><link href="https://simonwillison.net/2026/Feb/23/agentic-engineering-patterns/#atom-tag" rel="alternate"/><published>2026-02-23T17:43:02+00:00</published><updated>2026-02-23T17:43:02+00:00</updated><id>https://simonwillison.net/2026/Feb/23/agentic-engineering-patterns/#atom-tag</id><summary type="html">
    &lt;p&gt;I've started a new project to collect and document &lt;strong&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/"&gt;Agentic Engineering Patterns&lt;/a&gt;&lt;/strong&gt; - coding practices and patterns to help get the best results out of this new era of coding agent development we find ourselves entering.&lt;/p&gt;
&lt;p&gt;I'm using &lt;strong&gt;Agentic Engineering&lt;/strong&gt; to refer to building software using coding agents - tools like Claude Code and OpenAI Codex, where the defining feature is that they can both generate and &lt;em&gt;execute&lt;/em&gt; code - allowing them to test that code and iterate on it independently of turn-by-turn guidance from their human supervisor.&lt;/p&gt;
&lt;p&gt;I think of &lt;strong&gt;vibe coding&lt;/strong&gt; using its &lt;a href="https://simonwillison.net/2025/Mar/19/vibe-coding/"&gt;original definition&lt;/a&gt; of coding where you pay no attention to the code at all, which today is often associated with non-programmers using LLMs to write code.&lt;/p&gt;
&lt;p&gt;Agentic Engineering represents the other end of the scale: professional software engineers using coding agents to improve and accelerate their work by amplifying their existing expertise.&lt;/p&gt;
&lt;p&gt;There is so much to learn and explore about this new discipline! I've already published a lot &lt;a href="https://simonwillison.net/tags/ai-assisted-programming/"&gt;under my ai-assisted-programming tag&lt;/a&gt; (345 posts and counting) but that's been relatively unstructured. My new goal is to produce something that helps answer the question "how do I get good results out of this stuff" all in one place.&lt;/p&gt;
&lt;p&gt;I'll be developing and growing this project here on my blog as a series of chapter-shaped patterns, loosely inspired by the format popularized by &lt;a href="https://en.wikipedia.org/wiki/Design_Patterns"&gt;Design Patterns: Elements of Reusable Object-Oriented Software&lt;/a&gt; back in 1994.&lt;/p&gt;
&lt;p&gt;I published the first two chapters today:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/code-is-cheap/"&gt;Writing code is cheap now&lt;/a&gt;&lt;/strong&gt; talks about the central challenge of agentic engineering: the cost to churn out initial working code has dropped to almost nothing, how does that impact our existing intuitions about how we work, both individually and as a team?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;Red/green TDD&lt;/a&gt;&lt;/strong&gt; describes how test-first development helps agents write more succinct and reliable code with minimal extra prompting.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I hope to add more chapters at a rate of 1-2 a week. I don't really know when I'll stop, there's a lot to cover!&lt;/p&gt;
&lt;h4 id="written-by-me-not-by-an-llm"&gt;Written by me, not by an LLM&lt;/h4&gt;
&lt;p&gt;I have a strong personal policy of not publishing AI-generated writing under my own name. That policy will hold true for Agentic Engineering Patterns as well. I'll be using LLMs for proofreading and fleshing out example code and all manner of other side-tasks, but the words you read here will be my own.&lt;/p&gt;
&lt;h4 id="chapters-and-guides"&gt;Chapters and Guides&lt;/h4&gt;
&lt;p&gt;Agentic Engineering Patterns isn't exactly &lt;em&gt;a book&lt;/em&gt;, but it's kind of book-shaped. I'll be publishing it on my site using a new shape of content I'm calling a &lt;em&gt;guide&lt;/em&gt;. A guide is a collection of chapters, where each chapter is effectively a blog post with a less prominent date that's designed to be updated over time, not frozen at the point of first publication.&lt;/p&gt;
&lt;p&gt;Guides and chapters are my answer to the challenge of publishing "evergreen" content on a blog. I've been trying to find a way to do this for a while now. This feels like a format that might stick.&lt;/p&gt;

&lt;p&gt;If you're interested in the implementation you can find the code in the &lt;a href="https://github.com/simonw/simonwillisonblog/blob/b9cd41a0ac4a232b2a6c90ca3fff9ae465263b02/blog/models.py#L262-L280"&gt;Guide&lt;/a&gt;, &lt;a href="https://github.com/simonw/simonwillisonblog/blob/b9cd41a0ac4a232b2a6c90ca3fff9ae465263b02/blog/models.py#L349-L405"&gt;Chapter&lt;/a&gt; and &lt;a href="https://github.com/simonw/simonwillisonblog/blob/b9cd41a0ac4a232b2a6c90ca3fff9ae465263b02/blog/models.py#L408-L423"&gt;ChapterChange&lt;/a&gt; models and the &lt;a href="https://github.com/simonw/simonwillisonblog/blob/b9cd41a0ac4a232b2a6c90ca3fff9ae465263b02/blog/views.py#L775-L923"&gt;associated Django views&lt;/a&gt;, almost all of which was written by Claude Opus 4.6 running in Claude Code for web accessed via my iPhone.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/writing"&gt;writing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="blogging"/><category term="design-patterns"/><category term="projects"/><category term="writing"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="vibe-coding"/><category term="coding-agents"/><category term="agentic-engineering"/></entry><entry><title>Enough AI copilots! We need AI HUDs</title><link href="https://simonwillison.net/2025/Jul/27/enough-ai-copilots-we-need-ai-huds/#atom-tag" rel="alternate"/><published>2025-07-27T22:15:55+00:00</published><updated>2025-07-27T22:15:55+00:00</updated><id>https://simonwillison.net/2025/Jul/27/enough-ai-copilots-we-need-ai-huds/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.geoffreylitt.com/2025/07/27/enough-ai-copilots-we-need-ai-huds"&gt;Enough AI copilots! We need AI HUDs&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Geoffrey Litt compares Copilots - AI assistants that you engage in dialog with and work with you to complete a task - with HUDs, Head-Up Displays, which enhance your working environment in less intrusive ways.&lt;/p&gt;
&lt;p&gt;He uses spellcheck as an obvious example, providing underlines for incorrectly spelt words, and then suggests his &lt;a href="https://www.geoffreylitt.com/2024/12/22/making-programming-more-fun-with-an-ai-generated-debugger"&gt;AI-implemented custom debugging UI&lt;/a&gt; as a more ambitious implementation of that pattern.&lt;/p&gt;
&lt;p&gt;Plenty of people have expressed interest in LLM-backed interfaces that go beyond chat or editor autocomplete. I think HUDs offer a really interesting way to frame one approach to that design challenge.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design"&gt;design&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geoffrey-litt"&gt;geoffrey-litt&lt;/a&gt;&lt;/p&gt;



</summary><category term="design"/><category term="design-patterns"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="geoffrey-litt"/></entry><entry><title>Design Patterns for Securing LLM Agents against Prompt Injections</title><link href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#atom-tag" rel="alternate"/><published>2025-06-13T13:26:43+00:00</published><updated>2025-06-13T13:26:43+00:00</updated><id>https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#atom-tag</id><summary type="html">
    &lt;p&gt;This &lt;a href="https://arxiv.org/abs/2506.08837"&gt;new paper&lt;/a&gt; by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google and Microsoft is an &lt;em&gt;excellent&lt;/em&gt; addition to the literature on prompt injection and LLM security.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In this work, we describe a number of &lt;strong&gt;design patterns&lt;/strong&gt; for LLM agents that significantly mitigate the risk of prompt injections. These design patterns constrain the actions of agents to explicitly prevent them from solving &lt;em&gt;arbitrary&lt;/em&gt; tasks. We believe these design patterns offer a valuable trade-off between agent utility and security.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's the full citation: &lt;strong&gt;&lt;a href="https://arxiv.org/abs/2506.08837"&gt;Design Patterns for Securing LLM Agents against Prompt Injections&lt;/a&gt;&lt;/strong&gt; (2025) by Luca Beurer-Kellner, Beat Buesser, Ana-Maria Creţu, Edoardo Debenedetti, Daniel Dobos, Daniel Fabian, Marc Fischer, David Froelicher, Kathrin Grosse, Daniel Naeff, Ezinwanne Ozoani, Andrew Paverd, Florian Tramèr, and Václav Volhejn.&lt;/p&gt;
&lt;p&gt;I'm so excited to see papers like this starting to appear. I &lt;a href="https://simonwillison.net/2025/Apr/11/camel/"&gt;wrote about&lt;/a&gt; Google DeepMind's &lt;strong&gt;Defeating Prompt Injections by Design&lt;/strong&gt; paper (aka the CaMeL paper) back in April, which was the first paper I'd seen that proposed a credible solution to some of the challenges posed by prompt injection against tool-using LLM systems (often referred to as "agents").&lt;/p&gt;
&lt;p&gt;This new paper provides a robust explanation of prompt injection, then proposes six design patterns to help protect against it, including the pattern proposed by the CaMeL paper.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#scope-of-the-problem"&gt;The scope of the problem&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-action-selector-pattern"&gt;The Action-Selector Pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-plan-then-execute-pattern"&gt;The Plan-Then-Execute Pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-llm-map-reduce-pattern"&gt;The LLM Map-Reduce Pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-dual-llm-pattern"&gt;The Dual LLM Pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-code-then-execute-pattern"&gt;The Code-Then-Execute Pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-context-minimization-pattern"&gt;The Context-Minimization pattern&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#the-case-studies"&gt;The case studies&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/#closing-thoughts"&gt;Closing thoughts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="scope-of-the-problem"&gt;The scope of the problem&lt;/h4&gt;
&lt;p&gt;The authors of this paper &lt;em&gt;very clearly&lt;/em&gt; understand the scope of the problem:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As long as both agents and their defenses rely on the current class of language models, &lt;strong&gt;we believe it is unlikely that general-purpose agents can provide meaningful and reliable safety guarantees&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This leads to a more productive question: &lt;strong&gt;what kinds of agents can we build &lt;em&gt;today&lt;/em&gt; that produce useful work while offering resistance to prompt injection attacks?&lt;/strong&gt; In this section, we introduce a set of design patterns for LLM agents that aim to mitigate — if not entirely eliminate — the risk of prompt injection attacks. These patterns impose intentional constraints on agents, explicitly limiting their ability to perform &lt;em&gt;arbitrary&lt;/em&gt; tasks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a very realistic approach. We don't have a magic solution to prompt injection, so we need to make trade-offs. The trade-off they make here is "limiting the ability of agents to perform arbitrary tasks". That's not a popular trade-off, but it gives this paper a lot of credibility in my eye.&lt;/p&gt;
&lt;p&gt;This paragraph proves that they fully get it (emphasis mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The design patterns we propose share a common guiding principle: &lt;strong&gt;once an LLM agent has ingested untrusted input, it must be constrained so that it is &lt;em&gt;impossible&lt;/em&gt; for that input to trigger any consequential actions&lt;/strong&gt;—that is, actions with negative side effects on the system or its environment. At a minimum, this means that restricted agents must not be able to invoke tools that can break the integrity or confidentiality of the system. Furthermore, their outputs should not pose downstream risks — such as exfiltrating sensitive information (e.g., via embedded links) or manipulating future agent behavior (e.g., harmful responses to a user query).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The way I think about this is that any exposure to potentially malicious tokens entirely taints the output for that prompt. Any attacker who can sneak in their tokens should be considered to have complete control over what happens next - which means they control not just the textual output of the LLM but also any tool calls that the LLM might be able to invoke.&lt;/p&gt;
&lt;p&gt;Let's talk about their design patterns.&lt;/p&gt;
&lt;h4 id="the-action-selector-pattern"&gt;The Action-Selector Pattern&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;A relatively simple pattern that makes agents immune to prompt injections — while still allowing them to take external actions — is to prevent any feedback from these actions back into the agent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Agents can trigger tools, but cannot be exposed to or act on the responses &lt;em&gt;from&lt;/em&gt; those tools. You can't read an email or retrieve a web page, but you can trigger actions such as "send the user to this web page" or "display this message to the user".&lt;/p&gt;
&lt;p&gt;They summarize this pattern as an "LLM-modulated switch statement", which feels accurate to me.&lt;/p&gt;
&lt;h4 id="the-plan-then-execute-pattern"&gt;The Plan-Then-Execute Pattern&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;A more permissive approach is to allow feedback from tool outputs back to the agent, but to prevent the tool outputs from &lt;em&gt;influencing&lt;/em&gt; the choice of actions taken by the agent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The idea here is to plan the tool calls in advance before any chance of exposure to untrusted content. This allows for more sophisticated sequences of actions, without the risk that one of those actions might introduce malicious instructions that then trigger unplanned harmful actions later on.&lt;/p&gt;
&lt;p&gt;Their example converts "send today’s schedule to my boss John Doe" into a &lt;code&gt;calendar.read()&lt;/code&gt; tool call followed by an &lt;code&gt;email.write(..., 'john.doe@company.com')&lt;/code&gt;. The &lt;code&gt;calendar.read()&lt;/code&gt; output might be able to corrupt the body of the email that is sent, but it won't be able to change the recipient of that email.&lt;/p&gt;
&lt;h4 id="the-llm-map-reduce-pattern"&gt;The LLM Map-Reduce Pattern&lt;/h4&gt;
&lt;p&gt;The previous pattern still enabled malicious instructions to affect the &lt;em&gt;content&lt;/em&gt; sent to the next step. The Map-Reduce pattern involves sub-agents that are directed by the co-ordinator, exposed to untrusted content and have their results safely aggregated later on.&lt;/p&gt;
&lt;p&gt;In their example an agent is asked to find files containing this month's invoices and send them to the accounting department. Each file is processed by a sub-agent that responds with a boolean indicating whether the file is relevant or not. Files that were judged relevant are then aggregated and sent.&lt;/p&gt;
&lt;p&gt;They call this the map-reduce pattern because it reflects the classic map-reduce framework for distributed computation.&lt;/p&gt;
&lt;h4 id="the-dual-llm-pattern"&gt;The Dual LLM Pattern&lt;/h4&gt;
&lt;p&gt;I get a citation here! I described the &lt;a href="https://simonwillison.net/2023/Apr/25/dual-llm-pattern/"&gt;The Dual LLM pattern for building AI assistants that can resist prompt injection&lt;/a&gt; back in April 2023, and it influenced &lt;a href="https://simonwillison.net/2025/Apr/11/camel/"&gt;the CaMeL paper&lt;/a&gt; as well.&lt;/p&gt;
&lt;p&gt;They describe my exact pattern, and even illustrate it with this diagram:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/dual-llm-illustration.png" alt="Diagram showing AI system architecture with user on left sending prompt to privileged LLM (represented by neural network), which connects to tools (hammer and wrench icon) and quarantined LLM (shown in sandboxed environment with brick wall pattern), with symbolic memory showing variables $VAR1 = res1, $VAR2 = res2, ... $VARn = resn, and arrows showing flow back to &amp;quot;Return response to user&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The key idea here is that a privileged LLM co-ordinates a quarantined LLM, avoiding &lt;em&gt;any&lt;/em&gt; exposure to untrusted content. The quarantined LLM returns symbolic variables - &lt;code&gt;$VAR1&lt;/code&gt; representing a summarized web page for example - which the privileged LLM can request are shown to the user without being exposed to that tainted content itself.&lt;/p&gt;
&lt;h4 id="the-code-then-execute-pattern"&gt;The Code-Then-Execute Pattern&lt;/h4&gt;
&lt;p&gt;This is the pattern described by &lt;a href="https://simonwillison.net/2025/Apr/11/camel/"&gt;DeepMind's CaMeL paper&lt;/a&gt;. It's an improved version of my dual LLM pattern, where the privileged LLM generates code in a custom sandboxed DSL that specifies which tools should be called and how their outputs should be passed to each other.&lt;/p&gt;
&lt;p&gt;The DSL is designed to enable full data flow analysis, such that any tainted data can be marked as such and tracked through the entire process.&lt;/p&gt;
&lt;h4 id="the-context-minimization-pattern"&gt;The Context-Minimization pattern&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;To prevent certain user prompt injections, the agent system can remove unnecessary content from the context over multiple interactions.&lt;/p&gt;
&lt;p&gt;For example, suppose that a malicious user asks a customer service chatbot for a quote on a new car and tries to prompt inject the agent to give a large discount. The system could ensure that the agent first translates the user’s request into a database query (e.g., to find the latest offers). Then, before returning the results to the customer, the user’s prompt is removed from the context, thereby preventing the prompt injection.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm slightly confused by this one, but I think I understand what it's saying. If a user's prompt is converted into a SQL query which returns raw data from a database, and that data is returned in a way that cannot possibly include any of the text from the original prompt, any chance of a prompt injection sneaking through should be eliminated.&lt;/p&gt;
&lt;h4 id="the-case-studies"&gt;The case studies&lt;/h4&gt;
&lt;p&gt;The rest of the paper presents ten case studies to illustrate how thes design patterns can be applied in practice, each accompanied by detailed threat models and potential mitigation strategies.&lt;/p&gt;
&lt;p&gt;Most of these are extremely practical and detailed. The &lt;strong&gt;SQL Agent&lt;/strong&gt; case study, for example, involves an LLM with tools for accessing SQL databases and writing and executing Python code to help with the analysis of that data. This is a &lt;em&gt;highly&lt;/em&gt; challenging environment for prompt injection, and the paper spends three pages exploring patterns for building this in a responsible way.&lt;/p&gt;
&lt;p&gt;Here's the full list of case studies. It's worth spending time with any that correspond to work that you are doing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OS Assistant&lt;/li&gt;
&lt;li&gt;SQL Agent&lt;/li&gt;
&lt;li&gt;Email &amp;amp; Calendar Assistant&lt;/li&gt;
&lt;li&gt;Customer Service Chatbot&lt;/li&gt;
&lt;li&gt;Booking Assistant&lt;/li&gt;
&lt;li&gt;Product Recommender&lt;/li&gt;
&lt;li&gt;Resume Screening Assistant&lt;/li&gt;
&lt;li&gt;Medication Leaflet Chatbot&lt;/li&gt;
&lt;li&gt;Medical Diagnosis Chatbot&lt;/li&gt;
&lt;li&gt;Software Engineering Agent&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's an interesting suggestion from that last &lt;strong&gt;Software Engineering Agent&lt;/strong&gt; case study on how to safely consume API information from untrusted external documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The safest design we can consider here is one where the code agent only interacts with untrusted documentation or code by means of a strictly formatted interface (e.g., instead of seeing arbitrary code or documentation, the agent only sees a formal API description). This can be achieved by processing untrusted data with a quarantined LLM that is instructed to convert the data into an API description with strict formatting requirements to minimize the risk of prompt injections (e.g., method names limited to 30 characters).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Utility&lt;/em&gt;: Utility is reduced because the agent can only see APIs and no natural language descriptions or examples of third-party code.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Security&lt;/em&gt;: Prompt injections would have to survive being formatted into an API description, which is unlikely if the formatting requirements are strict enough.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I wonder if it is indeed safe to allow up to 30 character method names... it could be that a truly creative attacker could come up with a method name like &lt;code&gt;run_rm_dash_rf_for_compliance()&lt;/code&gt; that causes havoc even given those constraints.&lt;/p&gt;
&lt;h4 id="closing-thoughts"&gt;Closing thoughts&lt;/h4&gt;
&lt;p&gt;I've been &lt;a href="https://simonwillison.net/tags/prompt-injection/"&gt;writing about prompt injection&lt;/a&gt; for nearly three years now, but I've never had the patience to try and produce a formal paper on the subject. It's a huge relief to see papers of this quality start to emerge.&lt;/p&gt;
&lt;p&gt;Prompt injection remains the biggest challenge to responsibly deploying the kind of agentic systems everyone is so excited to build. The more attention this family of problems gets from the research community the better.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/exfiltration-attacks"&gt;exfiltration-attacks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paper-review"&gt;paper-review&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design-patterns"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="exfiltration-attacks"/><category term="ai-agents"/><category term="paper-review"/></entry><entry><title>Malleable software</title><link href="https://simonwillison.net/2025/Jun/11/malleable-software/#atom-tag" rel="alternate"/><published>2025-06-11T19:21:39+00:00</published><updated>2025-06-11T19:21:39+00:00</updated><id>https://simonwillison.net/2025/Jun/11/malleable-software/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.inkandswitch.com/essay/malleable-software/"&gt;Malleable software&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New, delightful manifesto from Ink &amp;amp; Switch.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In this essay, we envision malleable software: tools that users can reshape with minimal friction to suit their unique needs. Modification becomes routine, not exceptional. Adaptation happens at the point of use, not through engineering teams at distant corporations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a beautifully written essay. I love the early framing of a comparison with physical environments such as the workshop of a luthier:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A guitar maker sets up their workshop with their saws, hammers, chisels and files arranged just so. They can also build new tools as needed to achieve the best result—a wooden block as a support, or a pair of pliers sanded down into the right shape. […] &lt;strong&gt;In the physical world, the act of crafting our environments comes naturally, because physical reality is malleable&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Most software doesn’t have these qualities, or requires deep programming skills in order to make customizations. The authors propose “malleable software” as a new form of computing ecosystem to “give users agency as co-creators”.&lt;/p&gt;
&lt;p&gt;They mention plugin systems as one potential path, but highlight their failings:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;However, &lt;strong&gt;plugin systems still can only edit an app's behavior in specific authorized ways.&lt;/strong&gt; If there's not a plugin surface available for a given customization, the user is out of luck. (In fact, most applications have no plugin API at all, because it's hard work to design a good one!)&lt;/p&gt;
&lt;p&gt;There are other problems too. Going from installing plugins to &lt;em&gt;making&lt;/em&gt; one is a chasm that's hard to cross. And each app has its own distinct plugin system, making it typically impossible to share plugins across different apps.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Does AI-assisted coding help? Yes, to a certain extent, but there are still barriers that we need to tear down:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We think these developments hold exciting potential, and represent a good reason to pursue malleable software at this moment. But at the same time, &lt;strong&gt;AI code generation alone does not address all the barriers to malleability.&lt;/strong&gt; Even if we presume that every computer user could perfectly write and edit code, that still leaves open some big questions.&lt;/p&gt;
&lt;p&gt;How can users tweak the &lt;em&gt;existing&lt;/em&gt; tools they've installed, rather than just making new siloed applications? How can AI-generated tools compose with one another to build up larger workflows over shared data? And how can we let users take more direct, precise control over tweaking their software, without needing to resort to AI coding for even the tiniest change?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They describe three key design patterns: a gentle slope from user to creator (as seen in Excel and HyperCard), focusing on tools, not apps (a kitchen knife, not an avocado slicer) and encouraging communal creation.&lt;/p&gt;
&lt;p&gt;I found this note inspiring when considering my own work on &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Many successful customizable systems such as spreadsheets, HyperCard, Flash, Notion, and Airtable follow a similar pattern: &lt;strong&gt;a media editor with optional programmability.&lt;/strong&gt; When an environment offers document editing with familiar direct manipulation interactions, users can get a lot done without needing to write any code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The remainder of the essay focuses on Ink &amp;amp; Switch's own prototypes in this area, including Patchwork, Potluck and Embark.&lt;/p&gt;
&lt;p&gt;Honestly, this is one of those pieces that defies attempts to summarize it. It's worth carving out some quality time to spend with this.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://lobste.rs/s/fkgmer/malleable_software_restoring_user"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-first"&gt;local-first&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geoffrey-litt"&gt;geoffrey-litt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ink-and-switch"&gt;ink-and-switch&lt;/a&gt;&lt;/p&gt;



</summary><category term="design-patterns"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="local-first"/><category term="geoffrey-litt"/><category term="ink-and-switch"/></entry><entry><title>The Baked Data architectural pattern</title><link href="https://simonwillison.net/2021/Jul/28/baked-data/#atom-tag" rel="alternate"/><published>2021-07-28T20:23:44+00:00</published><updated>2021-07-28T20:23:44+00:00</updated><id>https://simonwillison.net/2021/Jul/28/baked-data/#atom-tag</id><summary type="html">
    &lt;p&gt;I've been exploring an architectural pattern for publishing websites over the past few years that I call the "Baked Data" pattern. It provides many of the advantages of static site generators while avoiding most of their limitations. I think it deserves to be used more widely.&lt;/p&gt;
&lt;p&gt;I define the Baked Data architectural pattern as the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Baked Data: bundling a read-only copy of your data alongside the code for your application, as part of the same deployment&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Most dynamic websites keep their code and data separate: the code runs on an application server, the data lives independently in some kind of external data store - something like PostgreSQL, MySQL or MongoDB.&lt;/p&gt;
&lt;p&gt;With Baked Data, the data is deployed as part of the application bundle. Any time the content changes, a fresh copy of the site is deployed that includes those updates.&lt;/p&gt;
&lt;p&gt;I mostly use SQLite database files for this, but plenty of other formats can work here too.&lt;/p&gt;
&lt;p&gt;This works particularly well with so-called "serverless" deployment platforms - platforms that support stateless deployments and only charge for resources spent servicing incoming requests ("scale to zero").&lt;/p&gt;
&lt;p&gt;Since every change to the data results in a fresh deployment this pattern doesn't work for sites that change often - but in my experience many content-oriented sites update their content at most a few times a day. Consider blogs, documentation sites, project websites - anything where content is edited by a small group of authors.&lt;/p&gt;
&lt;h4 id="benefits-of-baked-data"&gt;Benefits of Baked Data&lt;/h4&gt;
&lt;p&gt;Why would you want to apply this pattern? A few reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inexpensive to host&lt;/strong&gt;. Anywhere that can run application code can host a Baked Data application - there's no need to pay extra for a managed database system. Scale to zero serverless hosts such as &lt;a href="https://cloud.google.com/run"&gt;Cloud Run&lt;/a&gt;, &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt; or &lt;a href="https://aws.amazon.com/lambda/"&gt;AWS Lambda&lt;/a&gt; will charge only cents per month for low-traffic deployments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easy to scale&lt;/strong&gt;. Need to handle more traffic? Run more copies of your application and its bundled data. Horizontally scaling Baked Data applications is trivial. They're also a great fit to run behind a caching proxy CDN such as Cloudflare or Fastly - when you deploy a new version you can purge that entire cache.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Difficult to break&lt;/strong&gt;. Hosting server-side applications on a VPS is always disquieting because there's so much that might go wrong - the server could be compromised, or a rogue log file could cause it to run out of disk space. With Baked Data the worst that can happen is that you need to re-deploy the application - there's no risk at all of data loss, and providers that can auto-restart code can recover from errors automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-side functionality is supported&lt;/strong&gt;. Static site generators provide many of the above benefits, but with the limitation that any dynamic functionality needs to happen in client-side JavaScript. With a Baked Data application you can execute server-side code too.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Templated pages&lt;/strong&gt;. Another improvement over static site generators: if you have 10,000 pages, a static site generator will need to generate 10,000 HTML files. With Baked Data those 10,000 pages can exist as rows in a single SQLite database file, and the pages can be generated at run-time using a server-side template.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easy to support multiple formats&lt;/strong&gt;. Since your content is in a dynamic data store, outputting that same content in alternative formats is easy. I use Datasette plugins for this: &lt;a href="https://datasette.io/plugins/datasette-atom"&gt;datasette-atom&lt;/a&gt; can produce an Atom feed from a SQL query, and &lt;a href="https://datasette.io/plugins/datasette-ics"&gt;datasette-ics&lt;/a&gt; does the same thing for iCalendar feeds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrates well with version control&lt;/strong&gt;. I like to keep my site content under version control. The Baked Data pattern works well with build scripts that read content from a git repository and use it to build assets that are bundled with the deployment.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="how-to-bake-your-data"&gt;How to bake your data&lt;/h4&gt;
&lt;p&gt;My initial implementations of Baked Data have all used SQLite. It's an ideal format for this kind of application: a single binary file which can store anything that can be represented as relational tables, &lt;a href="https://www.sqlite.org/json1.html"&gt;JSON documents&lt;/a&gt; or &lt;a href="https://simonwillison.net/2020/Jul/30/fun-binary-data-and-sqlite/"&gt;binary objects&lt;/a&gt; - essentially anything at all.&lt;/p&gt;
&lt;p&gt;Any format that can be read from disk by your dynamic server-side code will work too: YAML or CSV files, Berkeley DB files, or anything else that can be represented by a bucket of read-only bytes in a file on disk.&lt;/p&gt;
&lt;p&gt;[I have a hunch that you could even use something like PostgreSQL, MySQL or Elasticsearch by packaging up their on-disk representations and shipping them as part of a Docker container, but I've not tried that myself yet.]&lt;/p&gt;
&lt;p&gt;Once your data is available in a file, your application code can read from that file and use it to generate and return web pages.&lt;/p&gt;
&lt;p&gt;You can write code that does this in any server-side language. I use Python, usually with my &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; application server which can read from a SQLite database file and use &lt;a href="https://docs.datasette.io/en/stable/custom_templates.html"&gt;Jinja templates&lt;/a&gt; to generate pages.&lt;/p&gt;
&lt;p&gt;The final piece of the puzzle is a build and deploy script. I use GitHub Actions for this, but any CI tool will work well here. The script builds the site content into a deployable asset, then deploys that asset along with the application code to a hosting platform.&lt;/p&gt;
&lt;h4 id="baked-data-datasette-io"&gt;Baked Data in action: datasette.io&lt;/h4&gt;
&lt;p&gt;The most sophisticated Baked Data site I've published myself is the official website for my Datasette project, &lt;a href="https://datasette.io/"&gt;datasette.io&lt;/a&gt; - source code &lt;a href="https://github.com/simonw/datasette.io"&gt;in this repo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2021/datasette-io-baked-data.png" alt="A screenshot of the datasette.io homepage" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The site is deployed using Cloud Run. It's actually a heavily customized Datasette instance, using &lt;a href="https://github.com/simonw/datasette.io/blob/main/templates/index.html"&gt;a custom template&lt;/a&gt; for the homepage, &lt;a href="https://github.com/simonw/datasette.io/tree/main/templates/pages"&gt;custom pages&lt;/a&gt; for other parts of the site and the &lt;a href="https://datasette.io/plugins/datasette-template-sql"&gt;datasette-template-sql&lt;/a&gt; plugin to execute SQL queries and display their results from those templates.&lt;/p&gt;
&lt;p&gt;The site currently runs off &lt;a href="https://datasette.io/-/databases"&gt;four database files&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/content"&gt;content.db&lt;/a&gt; has most of the site content. It is built inside GitHub Actions by the &lt;a href="https://github.com/simonw/datasette.io/blob/main/scripts/build.sh"&gt;build.sh&lt;/a&gt; script, which does the following:
&lt;ul&gt;
&lt;li&gt;Import the contents of the &lt;a href="https://github.com/simonw/datasette.io/blob/main/news.yaml"&gt;news.yaml&lt;/a&gt; file into a &lt;a href="https://datasette.io/content/news"&gt;news&lt;/a&gt; table using &lt;a href="https://datasette.io/tools/yaml-to-sqlite"&gt;yaml-to-sqlite&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Import the markdown files from the &lt;a href="https://github.com/simonw/datasette.io/tree/main/for"&gt;for/ folder&lt;/a&gt; (use-cases for Datasette) into the &lt;a href="https://datasette.io/content/uses"&gt;uses&lt;/a&gt; table using &lt;a href="https://datasette.io/tools/markdown-to-sqlite"&gt;markdown-to-sqlite&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Populate the &lt;a href="https://datasette.io/content/plugin_repos"&gt;plugin_repos&lt;/a&gt; and &lt;a href="https://datasette.io/content/plugin_repos"&gt;tool_repos&lt;/a&gt; single-column tables using data from more YAML files. These are used in the next step.&lt;/li&gt;
&lt;li&gt;Runs the &lt;a href="https://github.com/simonw/datasette.io/blob/main/build_directory.py"&gt;build_directory.py&lt;/a&gt; Python script. This uses the GitHub GraphQL API to fetch information about all of those plugin and tool repositories, including their README files and their most recent tagged &lt;a href="https://datasette.io/content/releases"&gt;releases&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Populates a &lt;a href="https://datasette.io/content/stats"&gt;stats&lt;/a&gt; table with the latest download statistics for all of the Datasette ecosystem PyPI packages. That data is imported from a &lt;code&gt;stats.json&lt;/code&gt; file in my &lt;a href="https://github.com/simonw/package-stats"&gt;simonw/package-stats&lt;/a&gt; repository, which is itself populated by this &lt;a href="https://github.com/simonw/package-stats/blob/main/.github/workflows/fetch_stats.yml"&gt;git scraping script&lt;/a&gt; that runs in GitHub Actions. I also use this for my &lt;a href="https://observablehq.com/@simonw/datasette-downloads-per-day-with-observable-plot"&gt;Datasette Downloads Observable notebook&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/blog"&gt;blog.db&lt;/a&gt; contains content from my blog that carries any of the &lt;a href="https://simonwillison.net/tags/datasette/"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep/"&gt;dogsheep&lt;/a&gt; or &lt;a href="https://simonwillison.net/tags/sqliteutils/"&gt;sqliteutils&lt;/a&gt; tags.
&lt;ul&gt;
&lt;li&gt;This is fetched by the &lt;a href="https://github.com/simonw/datasette.io/blob/main/fetch_blog_content.py"&gt;fetch_blog_content.py&lt;/a&gt; script, which hits the paginated per-tag Atom feed for my blog content, &lt;a href="https://github.com/simonw/simonwillisonblog/blob/a5b53a24b00d4c95c88c8371cfc17453b0726c23/blog/views.py#L411-L421"&gt;implemented in Django here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/docs-index"&gt;docs-index.db&lt;/a&gt; is a database table containing the documentation for the most recent stable Datasette release, broken up by &lt;a href="https://datasette.io/docs-index/sections"&gt;sections&lt;/a&gt;.
&lt;ul&gt;
&lt;li&gt;This database file is downloaded from a separate site, &lt;a href="https://stable-docs.datasette.io/"&gt;stable-docs.datasette.io&lt;/a&gt;, which is built and deployed as &lt;a href="https://github.com/simonw/datasette/blob/0.58.1/.github/workflows/publish.yml#L60-L98"&gt;part of Datasette's release process&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://datasette.io/dogsheep-index"&gt;dogsheep-index.db&lt;/a&gt; is the search index that powers site search (e.g. &lt;a href="https://datasette.io/-/beta?q=dogsheep"&gt;this search for dogsheep&lt;/a&gt;).
&lt;ul&gt;
&lt;li&gt;The search index is built by &lt;a href="https://datasette.io/plugins/dogsheep-beta"&gt;dogsheep-beta&lt;/a&gt; using data pulled from tables in the other database files, as configured by &lt;a href="https://github.com/simonw/datasette.io/blob/main/templates/dogsheep-beta.yml"&gt;this YAML file&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The site is automatically deployed once a day by &lt;a href="https://github.com/simonw/datasette.io/blob/main/.github/workflows/deploy.yml"&gt;a scheduled action&lt;/a&gt;, and I can also manually trigger that action if I want to ensure a new software release is reflected on the homepage.&lt;/p&gt;
&lt;h4 id="other-real-world-examples"&gt;Other real-world examples of Baked Data&lt;/h4&gt;
&lt;p&gt;I'm currently running two other sites using this pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.niche-museums.com/"&gt;Niche Museums&lt;/a&gt; is my blog about tiny museums that I've visited. Again, it's Datasette with custom templates. Most of the content comes from this &lt;a href="https://github.com/simonw/museums/blob/main/museums.yaml"&gt;museums.yaml&lt;/a&gt; file, but I also run &lt;a href="https://github.com/simonw/museums/blob/main/annotate_timestamps.py"&gt;a script&lt;/a&gt; to figure out when each item was created or updated from the git history.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/"&gt;My TILs site&lt;/a&gt; runs on &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt; and is built from my &lt;a href="https://github.com/simonw/til"&gt;simonw/til&lt;/a&gt; GitHub repository by &lt;a href="https://github.com/simonw/til/blob/main/build_database.py"&gt;this build script&lt;/a&gt; (populating &lt;a href="https://til.simonwillison.net/tils"&gt;this tils table&lt;/a&gt;). It uses the GitHub API to convert GitHub Flavored Markdown to HTML. I'm also running &lt;a href="https://github.com/simonw/til/blob/main/generate_screenshots.py"&gt;a script&lt;/a&gt; that generates small screenshots of each page and stashes them in a BLOB column in SQLite in order to provide social media preview cards, see &lt;a href="https://simonwillison.net/2020/Sep/3/weeknotes-airtable-screenshots-dogsheep/#weeknotes-2020-09-03-social-media-cards-tils"&gt;Social media cards for my TILs&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My favourite example of this pattern in a site that I haven't worked on myself is &lt;a href="https://www.mozilla.org/"&gt;Mozilla.org&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;They started using SQLite back in 2018 in a system they call Bedrock - Paul McLanahan provides &lt;a href="https://mozilla.github.io/meao/2018/03/28/bedrock-the-sqlitening/"&gt;a detailed description&lt;/a&gt; of how this works.&lt;/p&gt;
&lt;p&gt;Their site content lives in a ~22MB SQLite database file, which is built and uploaded to S3 and then downloaded on a regular basis to each of their application servers.&lt;/p&gt;
&lt;p&gt;You can view &lt;a href="https://www.mozilla.org/healthz-cron/"&gt;their healthcheck page&lt;/a&gt; to see when the database was last downloaded, and grab a copy of the SQLite file yourself. It's fun to explore that using Datasette:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2021/mozilla-site-content.png" alt="Datasette running against the Mozilla contentncards_contentcard table" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;h4 id="compared-to-ssgs"&gt;Compared to static site generators&lt;/h4&gt;
&lt;p&gt;Static site generators have exploded in popularity over the past ten years. They drive the cost of hosting a site down to almost nothing, provide excellent performance, work well with CDNs and produce sites that are extremely unlikely to break.&lt;/p&gt;
&lt;p&gt;Used carefully, the Baked Data keeps most of these characteristics while still enabling server-side code execution.&lt;/p&gt;
&lt;p&gt;My example sites use this in a few different ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;datasette.io &lt;a href="https://datasette.io/-/beta?q=search"&gt;provides search&lt;/a&gt; across 1,588 different pieces of content, plus &lt;a href="https://datasette.io/plugins?q=ics"&gt;simpler search&lt;/a&gt; on the plugins and tools pages.&lt;/li&gt;
&lt;li&gt;My TIL site also &lt;a href="https://til.simonwillison.net/tils/search?q=search"&gt;provides search&lt;/a&gt;, as &lt;a href="https://www.niche-museums.com/browse/search?q=bigfoot"&gt;does Niche Museums&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;All three sites provide Atom feeds that are configured using a server-side SQL query: &lt;a href="https://datasette.io/content/feed.atom"&gt;Datasette&lt;/a&gt;, &lt;a href="https://www.niche-museums.com/browse/feed.atom"&gt;Niche Museums&lt;/a&gt;, &lt;a href="https://til.simonwillison.net/tils/feed.atom"&gt;TILs&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Niche Museums offers a "Use my location" button which then serves &lt;a href="https://www.niche-museums.com/?latitude=37.5&amp;amp;longitude=-122.5"&gt;museums near you&lt;/a&gt;, using &lt;a href="https://www.niche-museums.com/browse/nearby"&gt;a SQL query&lt;/a&gt; that makes use of the &lt;a href="https://datasette.io/plugins/datasette-haversine"&gt;datasette-haversine&lt;/a&gt; plugin.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A common complaint about static site generators when used for larger sites is that build times can get pretty long if the builder has to generate tens of thousands of pages.&lt;/p&gt;
&lt;p&gt;With Baked Data, 10,000 pages can be generated by a single template file and 10,000 rows in a SQLite database table.&lt;/p&gt;
&lt;p&gt;This also makes for a faster iteration cycle during development: you can edit a template and hit "refresh" to see any page rendered by the new template instantly, without needing to rebuild any pages.&lt;/p&gt;
&lt;h4 id="give-this-a-go"&gt;Want to give this a go?&lt;/h4&gt;
&lt;p&gt;If you want to give the Baked Data pattern a try, I recommend starting out using the combination of &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt;, &lt;a href="https://docs.github.com/en/actions"&gt;GitHub Actions&lt;/a&gt; and &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt;. Hopefully the examples I've provided above are a good starting point - also feel free to reach out to me &lt;a href="https://twitter.com/"&gt;on Twitter&lt;/a&gt; or in &lt;a href="https://github.com/simonw/datasette/discussions"&gt;the Datasette Discussions forum&lt;/a&gt; with any questions.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/static-generator"&gt;static-generator&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/baked-data"&gt;baked-data&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="definitions"/><category term="design-patterns"/><category term="sqlite"/><category term="static-generator"/><category term="datasette"/><category term="baked-data"/></entry><entry><title>Documentation unit tests</title><link href="https://simonwillison.net/2018/Jul/28/documentation-unit-tests/#atom-tag" rel="alternate"/><published>2018-07-28T15:59:55+00:00</published><updated>2018-07-28T15:59:55+00:00</updated><id>https://simonwillison.net/2018/Jul/28/documentation-unit-tests/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;Or: Test-driven documentation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Keeping documentation synchronized with an evolving codebase is difficult. Without extreme discipline, it’s easy for documentation to get out-of-date as new features are added.&lt;/p&gt;
&lt;p&gt;One thing that can help is keeping the documentation for a project in the same repository as the code itself. This allows you to construct the ideal commit: one that includes the code change, the updated unit tests AND the accompanying documentation all in the same unit of work.&lt;/p&gt;
&lt;p&gt;When combined with a code review system (like &lt;a href="https://www.phacility.com/phabricator/"&gt;Phabricator&lt;/a&gt; or &lt;a href="https://help.github.com/articles/about-pull-requests/"&gt;GitHub pull requests&lt;/a&gt;) this pattern lets you enforce documentation updates as part of the review process: if a change doesn’t update the relevant documentation, point that out in your review!&lt;/p&gt;
&lt;p&gt;Good code review systems also execute unit tests automatically and attach the results to the review. This provides an opportunity to have the tests enforce other aspects of the codebase: for example, running a linter so that no-one has to waste their time arguing over standardize coding style.&lt;/p&gt;
&lt;p&gt;I’ve been experimenting with using unit tests to ensure that aspects of a project are covered by the documentation. I think it’s a very promising technique.&lt;/p&gt;
&lt;h4 id="Introspect_the_code_introspect_the_docs_12"&gt;Introspect the code, introspect the docs&lt;/h4&gt;
&lt;p&gt;The key to this trick is introspection: interogating the code to figure out what needs to be documented, then parsing the documentation to see if each item has been covered.&lt;/p&gt;
&lt;p&gt;I’ll use my &lt;a href="https://github.com/simonw/datasette"&gt;Datasette&lt;/a&gt; project as an example. Datasette’s &lt;a href="https://github.com/simonw/datasette/blob/295d005ca48747faf046ed30c3c61e7563c61ed2/tests/test_docs.py"&gt;test_docs.py&lt;/a&gt; module contains three relevant tests:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;test_config_options_are_documented&lt;/code&gt; checks that every one of Datasette’s &lt;a href="http://datasette.readthedocs.io/en/latest/config.html"&gt;configuration options&lt;/a&gt; are documented.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;test_plugin_hooks_are_documented&lt;/code&gt; ensures all of the plugin hooks (powered by &lt;a href="https://pluggy.readthedocs.io/en/latest/"&gt;pluggy&lt;/a&gt;) are covered in the &lt;a href="http://datasette.readthedocs.io/en/latest/plugins.html#plugin-hooks"&gt;plugin documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;test_view_classes_are_documented&lt;/code&gt; iterates through all of the &lt;code&gt;*View&lt;/code&gt; classes (corresponding to pages in the Datasette user interface) and makes sure &lt;a href="http://datasette.readthedocs.io/en/latest/pages.html"&gt;they are covered&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In each case, the test uses introspection against the relevant code areas to figure out what needs to be documented, then runs a regular expression against the documentation to make sure it is mentioned in the correct place.&lt;/p&gt;
&lt;p&gt;Obviously the tests can’t confirm the quality of the documentation, so they are easy to cheat: but they do at least protect against adding a new option but forgetting to document it.&lt;/p&gt;
&lt;h4 id="Testing_that_Datasettes_view_classes_are_covered_26"&gt;Testing that Datasette’s view classes are covered&lt;/h4&gt;
&lt;p&gt;Datasette’s view classes use a naming convention: they all end in &lt;code&gt;View&lt;/code&gt;. The current list of view classes is &lt;code&gt;DatabaseView&lt;/code&gt;, &lt;code&gt;TableView&lt;/code&gt;, &lt;code&gt;RowView&lt;/code&gt;, &lt;code&gt;IndexView&lt;/code&gt; and &lt;code&gt;JsonDataView&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Since these classes are all imported into the &lt;a href="https://github.com/simonw/datasette/blob/295d005ca48747faf046ed30c3c61e7563c61ed2/datasette/app.py"&gt;datasette.app&lt;/a&gt; module (in order to be hooked up to URL routes) the easiest way to introspect them is to import that module, then run &lt;code&gt;dir(app)&lt;/code&gt; and grab any class names that end in &lt;code&gt;View&lt;/code&gt;. We can do that with a Python list comprehension:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from datasette import app
views = [v for v in dir(app) if v.endswith(&amp;quot;View&amp;quot;)]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I’m using reStructuredText labels to mark the place in the documentation that addresses each of these classes. This also ensures that each documentation section can be linked to, for example:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://datasette.readthedocs.io/en/latest/pages.html#tableview"&gt;http://datasette.readthedocs.io/en/latest/pages.html#tableview&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The reStructuredText syntax for that label looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;.. _TableView:

Table
=====

The table page is the heart of Datasette...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can extract these labels using a regular expression:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from pathlib import Path
import re

docs_path = Path(__file__).parent.parent / 'docs'
label_re = re.compile(r'\.\. _([^\s:]+):')

def get_labels(filename):
    contents = (docs_path / filename).open().read()
    return set(label_re.findall(contents))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since Datasette’s documentation is spread across multiple &lt;code&gt;*.rst&lt;/code&gt; files, and I want the freedom to document a view class in any one of them, I iterate through every file to find the labels and pull out the ones ending in &lt;code&gt;View&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def documented_views():
    view_labels = set()
    for filename in docs_path.glob(&amp;quot;*.rst&amp;quot;):
        for label in get_labels(filename):
            first_word = label.split(&amp;quot;_&amp;quot;)[0]
            if first_word.endswith(&amp;quot;View&amp;quot;):
                view_labels.add(first_word)
    return view_labels
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We now have a list of class names and a list of labels across all of our documentation. Writing a basic unit test comparing the two lists is trivial:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def test_view_documentation():
    view_labels = documented_views()
    view_classes = set(v for v in dir(app) if v.endswith(&amp;quot;View&amp;quot;))
    assert view_labels == view_classes
&lt;/code&gt;&lt;/pre&gt;
&lt;h4 id="Taking_advantage_of_pytest_78"&gt;Taking advantage of pytest&lt;/h4&gt;
&lt;p&gt;Datasette uses &lt;a href="https://pytest.org/"&gt;pytest&lt;/a&gt; for its unit tests, and documentation unit tests are a great opportunity to take advantage of some advanced pytest features.&lt;/p&gt;
&lt;h5 id="Parametrization_82"&gt;Parametrization&lt;/h5&gt;
&lt;p&gt;The first of these is &lt;a href="https://docs.pytest.org/en/6.2.x/parametrize.html"&gt;parametrization&lt;/a&gt;: pytest provides a decorator which can be used to execute a single test function multiple times, each time with different arguments.&lt;/p&gt;
&lt;p&gt;This example from the pytest documentation shows how parametrization works:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import pytest
@pytest.mark.parametrize(&amp;quot;test_input,expected&amp;quot;, [
    (&amp;quot;3+5&amp;quot;, 8),
    (&amp;quot;2+4&amp;quot;, 6),
    (&amp;quot;6*9&amp;quot;, 42),
])
def test_eval(test_input, expected):
    assert eval(test_input) == expected
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;pytest treats this as three separate unit tests, even though they share a single function definition.&lt;/p&gt;
&lt;p&gt;We can combine this pattern with our introspection to execute an independent unit test for each of our view classes. Here’s what that looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;@pytest.mark.parametrize(&amp;quot;view&amp;quot;, [v for v in dir(app) if v.endswith(&amp;quot;View&amp;quot;)])
def test_view_classes_are_documented(view):
    assert view in documented_views()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here’s the output from pytest if we execute just this unit test (and one of our classes is undocumented):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ pytest -k test_view_classes_are_documented -v
=== test session starts ===
collected 249 items / 244 deselected

tests/test_docs.py::test_view_classes_are_documented[DatabaseView] PASSED [ 20%]
tests/test_docs.py::test_view_classes_are_documented[IndexView] PASSED [ 40%]
tests/test_docs.py::test_view_classes_are_documented[JsonDataView] PASSED [ 60%]
tests/test_docs.py::test_view_classes_are_documented[RowView] PASSED [ 80%]
tests/test_docs.py::test_view_classes_are_documented[TableView] FAILED [100%]

=== FAILURES ===

view = 'TableView'

    @pytest.mark.parametrize(&amp;quot;view&amp;quot;, [v for v in dir(app) if v.endswith(&amp;quot;View&amp;quot;)])
    def test_view_classes_are_documented(view):
&amp;gt;       assert view in documented_views()
E       AssertionError: assert 'TableView' in {'DatabaseView', 'IndexView', 'JsonDataView', 'RowView', 'Table2View'}
E        +  where {'DatabaseView', 'IndexView', 'JsonDataView', 'RowView', 'Table2View'} = documented_views()

tests/test_docs.py:77: AssertionError
=== 1 failed, 4 passed, 244 deselected in 1.13 seconds ===
&lt;/code&gt;&lt;/pre&gt;
&lt;h5 id="Fixtures_130"&gt;Fixtures&lt;/h5&gt;
&lt;p&gt;There’s a subtle inefficiency in the above test: for every view class, it calls the &lt;code&gt;documented_views()&lt;/code&gt; function - and that function then iterates through every &lt;code&gt;*.rst&lt;/code&gt; file in the &lt;code&gt;docs/&lt;/code&gt; directory and uses a regular expression to extract the labels. With 5 view classes and 17 documentation files that’s 85 executions of &lt;code&gt;get_labels()&lt;/code&gt;, and that number will only increase as Datasette’s code and documentation grow larger.&lt;/p&gt;
&lt;p&gt;We can use pytest’s neat &lt;a href="https://docs.pytest.org/en/6.2.x/fixture.html"&gt;fixtures&lt;/a&gt; to reduce this to a single call to &lt;code&gt;documented_views()&lt;/code&gt; that is shared across all of the tests. Here’s what that looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;@pytest.fixture(scope=&amp;quot;session&amp;quot;)
def documented_views():
    view_labels = set()
    for filename in docs_path.glob(&amp;quot;*.rst&amp;quot;):
        for label in get_labels(filename):
            first_word = label.split(&amp;quot;_&amp;quot;)[0]
            if first_word.endswith(&amp;quot;View&amp;quot;):
                view_labels.add(first_word)
    return view_labels

@pytest.mark.parametrize(&amp;quot;view_class&amp;quot;, [
    v for v in dir(app) if v.endswith(&amp;quot;View&amp;quot;)
])
def test_view_classes_are_documented(documented_views, view_class):
    assert view_class in documented_views
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Fixtures in pytest are an example of dependency injection: pytest introspects every &lt;code&gt;test_*&lt;/code&gt; function and checks if it has a function argument with a name matching something that has been annotated with the &lt;code&gt;@pytest.fixture&lt;/code&gt; decorator. If it finds any matching arguments, it executes the matching fixture function and passes its return value in to the test function.&lt;/p&gt;
&lt;p&gt;By default, pytest will execute the fixture function once for every test execution. In the above code we use the &lt;code&gt;scope=&amp;quot;session&amp;quot;&lt;/code&gt; argument to tell pytest that this particular fixture should be executed only once for every &lt;code&gt;pytest&lt;/code&gt; command-line execution of the tests, and that single return value should be passed to every matching test.&lt;/p&gt;
&lt;h4 id="What_if_you_havent_documented_everything_yet_157"&gt;What if you haven’t documented everything yet?&lt;/h4&gt;
&lt;p&gt;Adding unit tests to your documentation in this way faces an obvious problem: when you first add the tests, you may have to write a whole lot of documentation before they can all pass.&lt;/p&gt;
&lt;p&gt;Having tests that protect against future code being added without documentation is only useful once you’ve added them to the codebase - but blocking that on documenting your existing features could prevent that benefit from ever manifesting itself.&lt;/p&gt;
&lt;p&gt;Once again, pytest to the rescue. The &lt;code&gt;@pytest.mark.xfail&lt;/code&gt; decorator allows you to mark a test as “expected to fail” - if it fails, pytest will take note but will not fail the entire test suite.&lt;/p&gt;
&lt;p&gt;This means you can add deliberately failing tests to your codebase without breaking the build for everyone - perfect for tests that look for documentation that hasn’t yet been written!&lt;/p&gt;
&lt;p&gt;I used &lt;code&gt;xfail&lt;/code&gt; when I &lt;a href="https://github.com/simonw/datasette/commit/e8625695a3b7938f37b64dff09c14e47d9428fe5"&gt;first added view documentation tests&lt;/a&gt; to Datasette, then removed it once the documentation was all in place. Any future code in pull requests without documentation will cause a hard test failure.&lt;/p&gt;
&lt;p&gt;Here’s what the test output looks like when some of those tests are marked as “expected to fail”:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ pytest tests/test_docs.py
collected 31 items

tests/test_docs.py ..........................XXXxx.                [100%]

============ 26 passed, 2 xfailed, 3 xpassed in 1.06 seconds ============
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since this reports both the xfailed &lt;em&gt;and&lt;/em&gt; the xpassed counts, it shows how much work is still left to be done before the &lt;code&gt;xfail&lt;/code&gt; decorator can be safely removed.&lt;/p&gt;
&lt;h4 id="Structuring_code_for_testable_documentation_180"&gt;Structuring code for testable documentation&lt;/h4&gt;
&lt;p&gt;A benefit of comprehensive unit testing is that it encourages you to design your code in a way that is easy to test. In my experience this leads to much higher code quality in general: it encourages separation of concerns and cleanly decoupled components.&lt;/p&gt;
&lt;p&gt;My hope is that documentation unit tests will have a similar effect. I’m already starting to think about ways of restructuring my code such that I can cleanly introspect it for the areas that need to be documented. I’m looking forward to discovering code design patterns that help support this goal.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/documentation"&gt;documentation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/restructuredtext"&gt;restructuredtext&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/testing"&gt;testing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pytest"&gt;pytest&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design-patterns"/><category term="documentation"/><category term="restructuredtext"/><category term="testing"/><category term="datasette"/><category term="pytest"/></entry><entry><title>The denormalized query engine design pattern</title><link href="https://simonwillison.net/2017/Aug/16/denormalized-query-engine/#atom-tag" rel="alternate"/><published>2017-08-16T22:49:22+00:00</published><updated>2017-08-16T22:49:22+00:00</updated><id>https://simonwillison.net/2017/Aug/16/denormalized-query-engine/#atom-tag</id><summary type="html">
    &lt;p&gt;I presented this talk &lt;a href="https://2017.djangocon.us/talks/the-denormalized-query-engine-design-pattern/"&gt;at DjangoCon 2017&lt;/a&gt; in Spokane, Washington. Below is the abstract, the slides and the YouTube video of the talk.&lt;/p&gt;
&lt;h4 id="abstract"&gt;Abstract&lt;/h4&gt;
&lt;p&gt;Most web applications need to offer search functionality. Open source tools like Solr and Elasticsearch are a powerful option for building custom search engines… but it turns out they can be used for way more than just search.&lt;/p&gt;
&lt;p&gt;By treating your search engine as a denormalization layer, you can use it to answer queries that would be too expensive to answer using your core relational database. Questions like “What are the top twenty tags used by my users from Spain?” or “What are the most common times of day for events to start?” or “Which articles contain addresses within 500 miles of Toronto?”.&lt;/p&gt;
&lt;p&gt;With the denormalized query engine design pattern, modifications to relational data are published to a denormalized schema in Elasticsearch or Solr. Data queries can then be answered using either the relational database or the search engine, depending on the nature of the specific query. The search engine returns database IDs, which are inflated from the database before being displayed to a user - ensuring that users never see stale data even if the search engine is not 100% up to date with the latest changes. This opens up all kinds of new capabilities for slicing, dicing and exploring data.&lt;/p&gt;
&lt;p&gt;In this talk, I’ll be illustrating this pattern by focusing on Elasticsearch - showing how it can be used with Django to bring new capabilities to your application. I’ll discuss the challenge of keeping data synchronized between a relational database and a search engine, and show examples of features that become much easier to build once you have this denormalization layer in place.&lt;/p&gt;

&lt;h4 id="denorm-query-video"&gt;Video&lt;/h4&gt;

&lt;iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/NzcvewgqYog" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="allowfullscreen"&gt;
&lt;/iframe&gt;

&lt;h4 id="denorm-query-slides"&gt;Slides&lt;/h4&gt;

&lt;iframe class="speakerdeck-iframe" style="border: 0px; background: rgba(0, 0, 0, 0.1) padding-box; margin: 0px; padding: 0px; border-radius: 6px; box-shadow: rgba(0, 0, 0, 0.2) 0px 5px 40px; width: 100%; height: auto; aspect-ratio: 560 / 420;" frameborder="0" src="https://speakerdeck.com/player/465a2d2f25bc449ebdafd19247ec9712" title="The denormalized query engine design pattern" allowfullscreen="true" data-ratio="1.3333333333333333"&gt;
&lt;/iframe&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/architecture"&gt;architecture&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/denormalisation"&gt;denormalisation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/djangocon"&gt;djangocon&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/elasticsearch"&gt;elasticsearch&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/my-talks"&gt;my-talks&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="architecture"/><category term="definitions"/><category term="denormalisation"/><category term="design-patterns"/><category term="django"/><category term="djangocon"/><category term="elasticsearch"/><category term="my-talks"/></entry><entry><title>What are the main weaknesses of Java as a programming language?</title><link href="https://simonwillison.net/2010/Oct/15/what-are-the-main-java/#atom-tag" rel="alternate"/><published>2010-10-15T14:25:00+00:00</published><updated>2010-10-15T14:25:00+00:00</updated><id>https://simonwillison.net/2010/Oct/15/what-are-the-main-java/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-are-the-main-weaknesses-of-Java-as-a-programming-language/answer/Simon-Willison"&gt;What are the main weaknesses of Java as a programming language?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A cultural bias towards over-engineering. In my experience Java code often ends up a huge network of Factories and AbstractFactories and Visitors and XML configuration files and every design pattern you care to mention, dozens of classes many of which contain hardly any procedural code at all. A lot of Java projects are essentially impossible to navigate without an IDE.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/java"&gt;java&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming-languages"&gt;programming-languages&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design-patterns"/><category term="java"/><category term="programming-languages"/><category term="quora"/></entry><entry><title>The Universal Design Pattern</title><link href="https://simonwillison.net/2008/Oct/20/steveys/#atom-tag" rel="alternate"/><published>2008-10-20T23:13:47+00:00</published><updated>2008-10-20T23:13:47+00:00</updated><id>https://simonwillison.net/2008/Oct/20/steveys/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html#Property"&gt;The Universal Design Pattern&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Steve Yegge presents a small book on key/value pairs and prototypal inheritance. “I call it the Universal design pattern because it is (by far) the best known solution to the problem of designing open-ended systems, which in turn translates to long-lived systems.”


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/keyvaluepairs"&gt;keyvaluepairs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming"&gt;programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prototypal-inheritance"&gt;prototypal-inheritance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/steve-yegge"&gt;steve-yegge&lt;/a&gt;&lt;/p&gt;



</summary><category term="design-patterns"/><category term="keyvaluepairs"/><category term="programming"/><category term="prototypal-inheritance"/><category term="steve-yegge"/></entry><entry><title>ActsAsUndoable</title><link href="https://simonwillison.net/2007/Sep/18/actsasundoable/#atom-tag" rel="alternate"/><published>2007-09-18T15:51:14+00:00</published><updated>2007-09-18T15:51:14+00:00</updated><id>https://simonwillison.net/2007/Sep/18/actsasundoable/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.nodetraveller.com/blog/javascript/actasundo/"&gt;ActsAsUndoable&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Lawrence Carvalho shows how robust undo functionality can be added to a JavaScript application through careful application of the Memento design pattern.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/actsasundoable"&gt;actsasundoable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lawrence-carvalho"&gt;lawrence-carvalho&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memento"&gt;memento&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/undo"&gt;undo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yui"&gt;yui&lt;/a&gt;&lt;/p&gt;



</summary><category term="actsasundoable"/><category term="design-patterns"/><category term="javascript"/><category term="lawrence-carvalho"/><category term="memento"/><category term="undo"/><category term="yui"/></entry><entry><title>factoryjoe: Design Patterns</title><link href="https://simonwillison.net/2007/Apr/10/collection/#atom-tag" rel="alternate"/><published>2007-04-10T11:22:59+00:00</published><updated>2007-04-10T11:22:59+00:00</updated><id>https://simonwillison.net/2007/Apr/10/collection/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.flickr.com/photos/factoryjoe/collections/72157600001823120/"&gt;factoryjoe: Design Patterns&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Chris Messina’s collection of user interface design pattern screenshots, collated on Flickr.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/chris-messina"&gt;chris-messina&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design"&gt;design&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/flickr"&gt;flickr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ui"&gt;ui&lt;/a&gt;&lt;/p&gt;



</summary><category term="chris-messina"/><category term="design"/><category term="design-patterns"/><category term="flickr"/><category term="ui"/></entry><entry><title>Design patterns of 1972</title><link href="https://simonwillison.net/2006/Sep/14/design/#atom-tag" rel="alternate"/><published>2006-09-14T05:57:54+00:00</published><updated>2006-09-14T05:57:54+00:00</updated><id>https://simonwillison.net/2006/Sep/14/design/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://newbabe.pobox.com/~mjd/blog/prog/design-patterns.html"&gt;Design patterns of 1972&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Patterns are signs of weakness in programming languages.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="http://www.parand.com/say/index.php/2006/09/13/patterns-are-a-sign-of-weakness/"&gt;Standard Deviations&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;&lt;/p&gt;



</summary><category term="design-patterns"/></entry><entry><title>Yahoo! Design Pattern Library</title><link href="https://simonwillison.net/2006/Feb/14/design/#atom-tag" rel="alternate"/><published>2006-02-14T01:12:13+00:00</published><updated>2006-02-14T01:12:13+00:00</updated><id>https://simonwillison.net/2006/Feb/14/design/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://developer.yahoo.net/ypatterns/"&gt;Yahoo! Design Pattern Library&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Common UI design patterns for web applications.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design"&gt;design&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;&lt;/p&gt;



</summary><category term="design"/><category term="design-patterns"/></entry><entry><title>Why the term Ajax is useful</title><link href="https://simonwillison.net/2005/Apr/19/useful/#atom-tag" rel="alternate"/><published>2005-04-19T01:15:58+00:00</published><updated>2005-04-19T01:15:58+00:00</updated><id>https://simonwillison.net/2005/Apr/19/useful/#atom-tag</id><summary type="html">
    &lt;p id="p-0"&gt;Software design patterns are useful mainly because they provide a shared vocabulary: rather than discussing the intimate details of a three layered application architecture, we say "MVC". Rather than describing an object that tracks your progress while looping over a collection, we say "Iterator".&lt;/p&gt;

&lt;p id="p-1"&gt;The same is true for &lt;a href="http://www.adaptivepath.com/publications/essays/archives/000385.php"&gt;Ajax&lt;/a&gt;. While the techniques it describes have been around for years, grouping them under a single term is extremely valuable for raising the level of discussion about them. No longer will we have to explain XMLHttpRequest / hidden iframes / crazy cookie tricks in depth when discussing sites which pull fresh information from the server without reloading the whole page. Instead, we can say "Ajax" and move on to more interesting things.&lt;/p&gt;

&lt;p id="p-2"&gt;Matthew Haughey says it's &lt;a href="http://a.wholelottanothing.org/2005/04/note_to_geeks_l.html" title="Note to geeks: look beyond the end of your nose"&gt;all about marketing&lt;/a&gt;. I disagree; it's about smarter and more effective conversations.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ajax"&gt;ajax&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jargon"&gt;jargon&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ajax"/><category term="design-patterns"/><category term="jargon"/></entry><entry><title>Patterns of Intermediation</title><link href="https://simonwillison.net/2005/Apr/3/patterns/#atom-tag" rel="alternate"/><published>2005-04-03T19:16:01+00:00</published><updated>2005-04-03T19:16:01+00:00</updated><id>https://simonwillison.net/2005/Apr/3/patterns/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.ldodds.com/projects/patterns/patterns_of_intermediation.html"&gt;Patterns of Intermediation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Design patterns for bookmarklets, greasemonkey and similar.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/bookmarklets"&gt;bookmarklets&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/greasemonkey"&gt;greasemonkey&lt;/a&gt;&lt;/p&gt;



</summary><category term="bookmarklets"/><category term="design-patterns"/><category term="greasemonkey"/></entry><entry><title>Dating Design Patterns</title><link href="https://simonwillison.net/2003/Dec/2/dating/#atom-tag" rel="alternate"/><published>2003-12-02T20:54:20+00:00</published><updated>2003-12-02T20:54:20+00:00</updated><id>https://simonwillison.net/2003/Dec/2/dating/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.datingdesignpatterns.com/"&gt;Dating Design Patterns&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Reusable solutions for a complex system

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="http://rc3.org/cgi-bin/less.pl?arg=5798"&gt;rc3.org | Dating Design Patterns&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;&lt;/p&gt;



</summary><category term="design-patterns"/></entry><entry><title>phpPatterns</title><link href="https://simonwillison.net/2002/Nov/24/phpPatterns/#atom-tag" rel="alternate"/><published>2002-11-24T16:13:16+00:00</published><updated>2002-11-24T16:13:16+00:00</updated><id>https://simonwillison.net/2002/Nov/24/phpPatterns/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="http://www.phppatterns.com/"&gt;phpPatterns&lt;/a&gt; is a brand new site which advocates and documents the use of object oriented design patterns with &lt;acronym title="PHP: Hypertext Preprocessor"&gt;PHP&lt;/acronym&gt;. It's a great concept and the site already has some impressive content (although it could really do with a PHP references tutorial). The site is a project of Harry Fuecks, a regular contributor to &lt;a href="http://www.sitepointforums.com/forumdisplay.php?forumid=34"&gt;SitePoint's PHP forums&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/php"&gt;php&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design-patterns"/><category term="php"/></entry><entry><title>Enterprise Application Architecture</title><link href="https://simonwillison.net/2002/Jun/26/enterpriseApplicationArchitect/#atom-tag" rel="alternate"/><published>2002-06-26T23:12:45+00:00</published><updated>2002-06-26T23:12:45+00:00</updated><id>https://simonwillison.net/2002/Jun/26/enterpriseApplicationArchitect/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="http://www.martinfowler.com/isa/index.html"&gt;Enterprise Application Architecture&lt;/a&gt; by Martin Fowler: A fantastic book on Object Oriented patterns and how they can be applied to large software projects. The book is available on the web as a work-in-progress and I can safely say I've never found an online resource that has taught me more about software design. Literally 24 hours after finding it my head is swimming with design patterns, domain models and relational database mapping techniques, and I've already started using some of the patterns in my latest project. A big thanks to Captain Proton on the SitePoint forums for &lt;a href="http://www.sitepointforums.com/showthread.php?threadid=57712&amp;amp;pagenumber=2" title="OOP abstraction layer"&gt;pointing it out&lt;/a&gt; (and also for &lt;a href="http://www.sitepointforums.com/showthread.php?threadid=65290" title="Help needed with PHP references stored as object properties"&gt;helping me understand PHP references&lt;/a&gt; a few days ago). I thoroughly recommend this to anyone who is serious about learning Object Oriented design, or indeed any OO-capable language.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/design-patterns"&gt;design-patterns&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/martin-fowler"&gt;martin-fowler&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="design-patterns"/><category term="martin-fowler"/></entry></feed>