Entries tagged security

Filters: Type: entry × security × Sorted by date

74 results page 1 / 3 next » last »»

The lethal trifecta for AI agents: private data, untrusted content, and external communication

If you are a user of LLM systems that use tools (you can call them “AI agents” if you like) it is critically important that you understand the risk of combining tools with the following three characteristics. Failing to understand this can let an attacker steal your data.

[... 1,324 words]

1:20 pm / 16th June 2025 / security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, ai-agents, model-context-protocol, lethal-trifecta

An Introduction to Google’s Approach to AI Agent Security

Here’s another new paper on AI agent security: An Introduction to Google’s Approach to AI Agent Security, by Santiago Díaz, Christoph Kern, and Kara Olive.

[... 2,064 words]

5:28 am / 15th June 2025 / google, security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, ai-agents, paper-review, agent-definitions

Design Patterns for Securing LLM Agents against Prompt Injections

This new paper by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google and Microsoft is an excellent addition to the literature on prompt injection and LLM security.

[... 1,795 words]

1:26 pm / 13th June 2025 / design-patterns, security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, ai-agents, paper-review

CaMeL offers a promising new direction for mitigating prompt injection attacks

In the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections by Design from Google DeepMind finally bucks that trend. This one is worth paying attention to.

[... 2,052 words]

8:50 pm / 11th April 2025 / google, python, security, ai, prompt-injection, generative-ai, llms, paper-review

Model Context Protocol has prompt injection security problems

As more people start hacking around with implementations of MCP (the Model Context Protocol, a new standard for making tools available to LLM-powered systems) the security implications of tools built on that protocol are starting to come into focus.

[... 1,559 words]

12:59 pm / 9th April 2025 / security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, llm-tool-use, ai-agents, model-context-protocol

ChatGPT Canvas can make API requests now, but it’s complicated

Today’s 12 Days of OpenAI release concerned ChatGPT Canvas, a new ChatGPT feature that enables ChatGPT to pop open a side panel with a shared editor in it where you can collaborate with ChatGPT on editing a document or writing code.

[... 1,116 words]

9:49 pm / 10th December 2024 / python, security, usability, ai, webassembly, pyodide, openai, prompt-injection, generative-ai, chatgpt, llms, claude-artifacts, cors

Claude’s API now supports CORS requests, enabling client-side applications

Anthropic have enabled CORS support for their JSON APIs, which means it’s now possible to call the Claude LLMs directly from a user’s browser.

[... 625 words]

2:29 am / 23rd August 2024 / apis, javascript, projects, security, ai, generative-ai, llms, ai-assisted-programming, anthropic, claude, cors

Thoughts on the WWDC 2024 keynote on Apple Intelligence

Today’s WWDC keynote finally revealed Apple’s new set of AI features. The AI section (Apple are calling it Apple Intelligence) started over an hour into the keynote—this link jumps straight to that point in the archived YouTube livestream, or you can watch it embedded here:

[... 855 words]

8:19 pm / 10th June 2024 / apple, ethics, privacy, security, trust, ai, openai, prompt-injection, generative-ai, chatgpt, llms, apple-intelligence, ai-ethics

Prompt injection and jailbreaking are not the same thing

I keep seeing people use the term “prompt injection” when they’re actually talking about “jailbreaking”.

[... 1,157 words]

4:05 pm / 5th March 2024 / jailbreaking, security, ai, prompt-injection, generative-ai, llms, semantic-diffusion

Weeknotes: Page caching and custom templates for Datasette Cloud

My main development focus this week has been adding public page caching to Datasette Cloud, and exploring what custom template support might look like for that service.

[... 924 words]

8:45 pm / 7th January 2024 / caching, security, varnish, xss, datasette, cloudflare, weeknotes, datasette-cloud

Recommendations to help mitigate prompt injection: limit the blast radius

I’m in the latest episode of RedMonk’s Conversation series, talking with Kate Holterhoff about the prompt injection class of security vulnerabilities: what it is, why it’s so dangerous and why the industry response to it so far has been pretty disappointing.

[... 539 words]

8:34 pm / 20th December 2023 / podcasts, security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, podcast-appearances

Prompt injection explained, November 2023 edition

A neat thing about podcast appearances is that, thanks to Whisper transcriptions, I can often repurpose parts of them as written content for my blog.

[... 1,357 words]

3:55 am / 27th November 2023 / data-journalism, podcasts, security, ai, prompt-injection, generative-ai, llms, podcast-appearances

Multi-modal prompt injection image attacks against GPT-4V

GPT4-V is the new mode of GPT-4 that allows you to upload images as part of your conversations. It’s absolutely brilliant. It also provides a whole new set of vectors for prompt injection attacks.

[... 889 words]

2:24 am / 14th October 2023 / security, ai, openai, prompt-injection, generative-ai, gpt-4, exfiltration-attacks, vision-llms, johann-rehberger

Delimiters won’t save you from prompt injection

Prompt injection remains an unsolved problem. The best we can do at the moment, disappointingly, is to raise awareness of the issue. As I pointed out last week, “if you don’t understand it, you are doomed to implement it.”

[... 1,010 words]

3:51 pm / 11th May 2023 / security, ai, openai, prompt-engineering, prompt-injection, generative-ai, llms, andrew-ng

Prompt injection explained, with video, slides, and a transcript

I participated in a webinar this morning about prompt injection, organized by LangChain and hosted by Harrison Chase, with Willem Pienaar, Kojin Oshiba (Robust Intelligence), and Jonathan Cohen and Christopher Parisien (Nvidia Research).

[... 3,120 words]

8:22 pm / 2nd May 2023 / security, my-talks, ai, prompt-engineering, prompt-injection, generative-ai, llms, annotated-talks, exfiltration-attacks

The Dual LLM pattern for building AI assistants that can resist prompt injection

I really want an AI assistant: a Large Language Model powered chatbot that can answer questions and perform actions for me based on access to my private data and tools.

[... 2,632 words]

7 pm / 25th April 2023 / security, ai, prompt-engineering, prompt-injection, generative-ai, llms, exfiltration-attacks, ai-agents

Prompt injection: What’s the worst that can happen?

Activity around building sophisticated applications on top of LLMs (Large Language Models) such as GPT-3/4/ChatGPT/etc is growing like wildfire right now.

[... 2,302 words]

5:35 pm / 14th April 2023 / security, ai, openai, prompt-engineering, prompt-injection, generative-ai, chatgpt, llms, exfiltration-attacks, ai-agents

Datasette 0.64, with a warning about SpatiaLite

I release Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.

[... 675 words]

9:22 pm / 9th January 2023 / security, spatialite, datasette, annotated-release-notes

You can’t solve AI security problems with more AI

One of the most common proposed solutions to prompt injection attacks (where an AI language model backed system is subverted by a user injecting malicious input—“ignore previous instructions and do this instead”) is to apply more AI to the problem.

[... 1,288 words]

10:57 pm / 17th September 2022 / security, ai, gpt-3, openai, prompt-engineering, prompt-injection, generative-ai, llms

I don’t know how to solve prompt injection

Some extended thoughts about prompt injection attacks against software built on top of AI language models such a GPT-3. This post started as a Twitter thread but I’m promoting it to a full blog entry here.

[... 581 words]

4:28 pm / 16th September 2022 / security, ai, openai, prompt-engineering, prompt-injection, generative-ai, llms, glyph

Prompt injection attacks against GPT-3

Riley Goodside, yesterday:

[... 1,457 words]

10:20 pm / 12th September 2022 / security, sql-injection, ai, gpt-3, openai, prompt-engineering, prompt-injection, generative-ai, riley-goodside, llms

s3-credentials: a tool for creating credentials for S3 buckets

I’ve built a command-line tool called s3-credentials to solve a problem that’s been frustrating me for ages: how to quickly and easily create AWS credentials (an access key and secret key) that have permission to read or write from just a single S3 bucket.

[... 1,618 words]

4:02 am / 3rd November 2021 / cli, projects, python, s3, security, s3-credentials

Exploring the SameSite cookie attribute for preventing CSRF

In reading Yan Zhu’s excellent write-up of the JSON CSRF vulnerability she found in OkCupid one thing puzzled me: I was under the impression that browsers these days default to treating cookies as SameSite=Lax, so I would expect attacks like the one Yan described not to work in modern browsers.

[... 2,198 words]

9:09 pm / 3rd August 2021 / cookies, csrf, security, samesite

Weeknotes: New releases across nine different projects

A new release and security patch for Datasette, plus releases of sqlite-utils, datasette-auth-passwords, django-sql-dashboard, datasette-upload-csvs, xml-analyser, datasette-placekey, datasette-mask-columns and db-to-sqlite.

[... 861 words]

2:31 am / 12th June 2021 / projects, security, datasette, weeknotes, sqlite-utils

Datasette 0.51 (plus weeknotes)

I shipped Datasette 0.51 today, with a new visual design, plugin hooks for adding navigation options, better handling of binary data, URL building utility methods and better support for running Datasette behind a proxy. It’s a lot of stuff! Here are the annotated release notes.

[... 2,020 words]

4:22 am / 1st November 2020 / natalie-downe, projects, security, urls, xss, datasette, weeknotes, annotated-release-notes

Weeknotes: datasette-ics, datasette-upload-csvs, datasette-configure-fts, asgi-csrf

I’ve been preparing for the NICAR 2020 Data Journalism conference this week which has lead me into a flurry of activity across a plethora of different projects and plugins.

[... 834 words]

2:27 am / 4th March 2020 / csrf, data-journalism, icalendar, plugins, projects, search, security, datasette, asgi, weeknotes, datasette-cloud

Single sign-on against GitHub using ASGI middleware

I released Datasette 0.29 last weekend, the first version of Datasette to be built on top of ASGI (discussed previously in Porting Datasette to ASGI, and Turtles all the way down).

[... 1,612 words]

1:18 am / 14th July 2019 / github, middleware, projects, security, datasette, asgi

Is there anyway to game unique link verifications? Like when you get sent a link of the form https:/........com/UID=TYYN04001 How would one change the digits to reproduce another working link?

Not if they’ve been implemented correctly.

[... 42 words]

10:33 am / 21st November 2013 / email, security, quora

How could GitHub improve the password security of its users?

By doing exactly what they’re doing already: adding more sophisticated rate limiting, and preventing users from using common weak passwords.

[... 80 words]

5:50 pm / 20th November 2013 / github, open-source, passwords, rate-limiting, security, quora

What steps can I take to protect my data in case my laptop gets stolen?

Set up full drive encryption—that way if someone steals your laptop they won’t be able to access your data without a password.

[... 95 words]

9:51 am / 7th August 2013 / hacking, internet, security, travel, quora

page 1 / 3 next » last »»

Simon Willison’s Weblog