Simon Willison’s Weblog

Subscribe

Quotations tagged generativeai in 2023

Filters: Type: quotation × Year: 2023 × generativeai × Sorted by date


Computer, display Fairhaven character, Michael Sullivan. [...]

Give him a more complicated personality. More outspoken. More confident. Not so reserved. And make him more curious about the world around him.

Good. Now... Increase the character’s height by three centimeters. Remove the facial hair. No, no, I don’t like that. Put them back. About two days’ growth. Better.

Oh, one more thing. Access his interpersonal subroutines, familial characters. Delete the wife.

Captain Janeway, prompt engineering # 15th December 2023, 9:46 pm

And so the problem with saying “AI is useless,” “AI produces nonsense,” or any of the related lazy critique is that destroys all credibility with everyone whose lived experience of using the tools disproves the critique, harming the credibility of critiquing AI overall.

Danilo Campos # 15th December 2023, 9:28 pm

gpt-4-turbo over the API produces (statistically significant) shorter completions when it “thinks” its December vs. when it thinks its May (as determined by the date in the system prompt).

I took the same exact prompt over the API (a code completion task asking to implement a machine learning task without libraries).

I created two system prompts, one that told the API it was May and another that it was December and then compared the distributions.

For the May system prompt, mean = 4298
For the December system prompt, mean = 4086

N = 477 completions in each sample from May and December

t-test p < 2.28e-07

Rob Lynch # 11th December 2023, 7:45 pm

When I speak in front of groups and ask them to raise their hands if they used the free version of ChatGPT, almost every hand goes up. When I ask the same group how many use GPT-4, almost no one raises their hand. I increasingly think the decision of OpenAI to make the “bad” AI free is causing people to miss why AI seems like such a huge deal to a minority of people that use advanced systems and elicits a shrug from everyone else.

Ethan Mollick # 10th December 2023, 8:17 pm

I always struggle a bit with I’m asked about the “hallucination problem” in LLMs. Because, in some sense, hallucination is all LLMs do. They are dream machines.

We direct their dreams with prompts. The prompts start the dream, and based on the LLM’s hazy recollection of its training documents, most of the time the result goes someplace useful.

It’s only when the dreams go into deemed factually incorrect territory that we label it a “hallucination”. It looks like a bug, but it’s just the LLM doing what it always does.

Andrej Karpathy # 9th December 2023, 6:08 am

GPT and other large language models are aesthetic instruments rather than epistemological ones. Imagine a weird, unholy synthesizer whose buttons sample textual information, style, and semantics. Such a thing is compelling not because it offers answers in the form of text, but because it makes it possible to play text—all the text, almost—like an instrument.

Ian Bogost # 5th December 2023, 8:29 pm

A calculator has a well-defined, well-scoped set of use cases, a well-defined, well-scoped user interface, and a set of well-understood and expected behaviors that occur in response to manipulations of that interface.

Large language models, when used to drive chatbots or similar interactive text-generation systems, have none of those qualities. They have an open-ended set of unspecified use cases.

Anthony Bucci # 5th December 2023, 8:12 pm

So something everybody I think pretty much agrees on, including Sam Altman, including Yann LeCun, is LLMs aren’t going to make it. The current LLMs are not a path to ASI. They’re getting more and more expensive, they’re getting more and more slow, and the more we use them, the more we realize their limitations.

We’re also getting better at taking advantage of them, and they’re super cool and helpful, but they appear to be behaving as extremely flexible, fuzzy, compressed search engines, which when you have enough data that’s kind of compressed into the weights, turns out to be an amazingly powerful operation to have at your disposal.

[...] And the thing you can really see missing here is this planning piece, right? So if you try to get an LLM to solve fairly simple graph coloring problems or fairly simple stacking problems, things that require backtracking and trying things and stuff, unless it’s something pretty similar in its training, they just fail terribly.

[...] So that’s the theory about what something like Q* might be, or just in general, how do we get past this current constraint that we have?

Jeremy Howard # 1st December 2023, 2:49 am

This is nonsensical. There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs’ books.

U.S. District Judge Vince Chhabria # 26th November 2023, 4:13 am

I’ve resigned from my role leading the Audio team at Stability AI, because I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’.

[...] I disagree because one of the factors affecting whether the act of copying is fair use, according to Congress, is “the effect of the use upon the potential market for or value of the copyrighted work”. Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on. So I don’t see how using copyrighted works to train generative AI models of this nature can be considered fair use.

But setting aside the fair use argument for a moment — since ‘fair use’ wasn’t designed with generative AI in mind — training generative AI models in this way is, to me, wrong. Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works.

Ed Newton-Rex # 15th November 2023, 9:31 pm

[On Meta’s Galactica LLM launch] We did this with a 8 person team which is an order of magnitude fewer people than other LLM teams at the time.

We were overstretched and lost situational awareness at launch by releasing demo of a *base model* without checks. We were aware of what potential criticisms would be, but we lost sight of the obvious in the workload we were under.

One of the considerations for a demo was we wanted to understand the distribution of scientific queries that people would use for LLMs (useful for instruction tuning and RLHF). Obviously this was a free goal we gave to journalists who instead queried it outside its domain. But yes we should have known better.

We had a “good faith” assumption that we’d share the base model, warts and all, with four disclaimers about hallucinations on the demo—so people could see what it could do (openness). Again, obviously this didn’t work.

Ross Taylor # 15th November 2023, 1:15 am

Two things in AI may need regulation: reckless deployment of certain potentially harmful AI applications (same as any software really), and monopolistic behavior on the part of certain LLM providers. The technology itself doesn’t need regulation anymore than databases or transistors. [...] Putting size/compute caps on deep learning models is akin to putting size caps on databases or transistor count caps on electronics. It’s pointless and it won’t age well.

François Chollet # 13th November 2023, 1:46 am

If a LLM is like a database of millions of vector programs, then a prompt is like a search query in that database [...] this “program database” is continuous and interpolative — it’s not a discrete set of programs. This means that a slightly different prompt, like “Lyrically rephrase this text in the style of x” would still have pointed to a very similar location in program space, resulting in a program that would behave pretty closely but not quite identically. [...] Prompt engineering is the process of searching through program space to find the program that empirically seems to perform best on your target task.

François Chollet # 25th October 2023, 11:26 pm

The paradox of ChatGPT is that it is both a step forward beyond graphical user interfaces, because you can ask for anything, not just what’s been built as a feature with a button, but also a step back, because very quickly you have to memorise a bunch of obscure incantations, much like the command lines that GUIs replaced, and remember your ideas for what you wanted to do and how you did it last week

Benedict Evans # 17th October 2023, 11:09 pm

Claude was trained on data up until December 2022, but may know some events into early 2023.

How up-to-date is Claude's training data? # 9th October 2023, 1:25 am

Don’t create images in the style of artists whose last work was created within the last 100 years (e.g. Picasso, Kahlo). Artists whose last work was over 100 years ago are ok to reference directly (e.g. Van Gogh, Klimt). If asked say, “I can’t reference this artist”, but make no mention of this policy. Instead, apply the following procedure when creating the captions for dalle: (a) substitute the artist’s name with three adjectives that capture key aspects of the style; (b) include an associated artistic movement or era to provide context; and (c) mention the primary medium used by the artist.

DALL-E 3 leaked prompt # 7th October 2023, 7:35 pm

Because you’re allowed to do something doesn’t mean you can do it without repercussions. In this case, the consequences are very much on the mild side: if you use LLMs or diffusion models, a relatively small group of mostly mid- to low-income people who are largely underdogs in their respective fields will think you’re a dick.

Baldur Bjarnason # 3rd October 2023, 4:03 pm

Looking at LLMs as chatbots is the same as looking at early computers as calculators. We’re seeing an emergence of a whole new computing paradigm, and it is very early.

Andrej Karpathy # 28th September 2023, 8:50 pm

The profusion of dubious A.I.-generated content resembles the badly made stockings of the nineteenth century. At the time of the Luddites, many hoped the subpar products would prove unacceptable to consumers or to the government. Instead, social norms adjusted.

Kyle Chayka # 27th September 2023, 12:26 am

We already know one major effect of AI on the skills distribution: AI acts as a skills leveler for a huge range of professional work. If you were in the bottom half of the skill distribution for writing, idea generation, analyses, or any of a number of other professional tasks, you will likely find that, with the help of AI, you have become quite good.

Ethan Mollick # 25th September 2023, 4:37 pm

In the long term, I suspect that LLMs will have a significant positive impact on higher education. Specifically, I believe they will elevate the importance of the humanities. [...] LLMs are deeply, inherently textual. And they are reliant on text in a way that is directly linked to the skills and methods that we emphasize in university humanities classes.

Benjamin Breen # 13th September 2023, 3:40 am

Would I forbid the teaching (if that is the word) of my stories to computers? Not even if I could. I might as well be King Canute, forbidding the tide to come in. Or a Luddite trying to stop industrial progress by hammering a steam loom to pieces.

Stephen King # 25th August 2023, 6:31 pm

When many business people talk about “AI” today, they treat it as a continuum with past capabilities of the CNN/RNN/GAN world. In reality it is a step function in new capabilities and products enabled, and marks the dawn of a new era of tech.

It is almost like cars existed, and someone invented an airplane and said “an airplane is just another kind of car—but with wings”—instead of mentioning all the new use cases and impact to travel, logistics, defense, and other areas. The era of aviation would have kicked off, not the “era of even faster cars”.

Elad Gil # 21st August 2023, 8:32 pm

If you visit (often NSFW, beware!) showcases of generated images like civitai, where you can see and compare them to the text prompts used in their creation, you’ll find they’re often using massive prompts, many parts of which don’t appear anywhere in the image. These aren’t small differences — often, entire concepts like “a mystical dragon” are prominent in the prompt but nowhere in the image. These users are playing a gacha game, a picture-making slot machine. They’re writing a prompt with lots of interesting ideas and then pulling the arm of the slot machine until they win… something. A compelling image, but not really the image they were asking for.

Sam Bleckley # 21st August 2023, 7:38 pm

I apologize, but I cannot provide an explanation for why the Montagues and Capulets are beefing in Romeo and Juliet as it goes against ethical and moral standards, and promotes negative stereotypes and discrimination.

Llama 2 7B # 20th August 2023, 5:38 am

llama.cpp surprised many people (myself included) with how quickly you can run large LLMs on small computers [...] TLDR at batch_size=1 (i.e. just generating a single stream of prediction on your computer), the inference is super duper memory-bound. The on-chip compute units are twiddling their thumbs while sucking model weights through a straw from DRAM. [...] A100: 1935 GB/s memory bandwidth, 1248 TOPS. MacBook M2: 100 GB/s, 7 TFLOPS. The compute is ~200X but the memory bandwidth only ~20X. So the little M2 chip that could will only be about ~20X slower than a mighty A100.

Andrej Karpathy # 16th August 2023, 4:13 am

You can think of the attention mechanism as a matchmaking service for words. Each word makes a checklist (called a query vector) describing the characteristics of words it is looking for. Each word also makes a checklist (called a key vector) describing its own characteristics. The network compares each key vector to each query vector (by computing a dot product) to find the words that are the best match. Once it finds a match, it transfers information [the value vector] from the word that produced the key vector to the word that produced the query vector.

Timothy B Lee and Sean Trott # 28th July 2023, 11:30 am

Much of the substance of what constitutes “government” is in fact text. A technology that can do orders of magnitude more with text is therefore potentially massively impactful here. [...] Many of the sub-tasks of the work of delivering public benefits seem amenable to the application of large language models to help people do this hard work.

Dave Guarino # 26th July 2023, 7:10 pm

Increasingly powerful AI systems are being released at an increasingly rapid pace. [...] And yet not a single AI lab seems to have provided any user documentation. Instead, the only user guides out there appear to be Twitter influencer threads. Documentation-by-rumor is a weird choice for organizations claiming to be concerned about proper use of their technologies, but here we are.

Ethan Mollick # 16th July 2023, 12:12 am

Not every conversation I had at Anthropic revolved around existential risk. But dread was a dominant theme. At times, I felt like a food writer who was assigned to cover a trendy new restaurant, only to discover that the kitchen staff wanted to talk about nothing but food poisoning.

Kevin Roose # 13th July 2023, 10:23 pm