Simon Willison’s Weblog

Subscribe

Items tagged ethics, generativeai in 2023

Filters: Year: 2023 × ethics × generativeai × Sorted by date


Facebook Is Being Overrun With Stolen, AI-Generated Images That People Think Are Real. Excellent investigative piece by Jason Koebler digging into the concerning trend of Facebook engagement farming accounts who take popular aspirational images and use generative AI to recreate hundreds of variants of them, which then gather hundreds of comments from people who have no idea that the images are fake. # 19th December 2023, 2:01 am

And so the problem with saying “AI is useless,” “AI produces nonsense,” or any of the related lazy critique is that destroys all credibility with everyone whose lived experience of using the tools disproves the critique, harming the credibility of critiquing AI overall.

Danilo Campos # 15th December 2023, 9:28 pm

This is nonsensical. There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs’ books.

U.S. District Judge Vince Chhabria # 26th November 2023, 4:13 am

I’ve resigned from my role leading the Audio team at Stability AI, because I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’.

[...] I disagree because one of the factors affecting whether the act of copying is fair use, according to Congress, is “the effect of the use upon the potential market for or value of the copyrighted work”. Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on. So I don’t see how using copyrighted works to train generative AI models of this nature can be considered fair use.

But setting aside the fair use argument for a moment — since ‘fair use’ wasn’t designed with generative AI in mind — training generative AI models in this way is, to me, wrong. Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works.

Ed Newton-Rex # 15th November 2023, 9:31 pm

Because you’re allowed to do something doesn’t mean you can do it without repercussions. In this case, the consequences are very much on the mild side: if you use LLMs or diffusion models, a relatively small group of mostly mid- to low-income people who are largely underdogs in their respective fields will think you’re a dick.

Baldur Bjarnason # 3rd October 2023, 4:03 pm

The profusion of dubious A.I.-generated content resembles the badly made stockings of the nineteenth century. At the time of the Luddites, many hoped the subpar products would prove unacceptable to consumers or to the government. Instead, social norms adjusted.

Kyle Chayka # 27th September 2023, 12:26 am

Rethinking the Luddites in the Age of A.I. I’ve been staying way clear of comparisons to Luddites in conversations about the potential harmful impacts of modern AI tools, because it seemed to me like an offensive, unproductive cheap shot.

This article has shown me that the comparison is actually a lot more relevant—and sympathetic—than I had realized.

In a time before labor unions, the Luddites represented an early example of a worker movement that tried to stand up for their rights in the face of transformational, negative change to their specific way of life.

“Knitting machines known as lace frames allowed one employee to do the work of many without the skill set usually required” is a really striking parallel to what’s starting to happen with a surprising array of modern professions already. # 26th September 2023, 11:45 pm

Would I forbid the teaching (if that is the word) of my stories to computers? Not even if I could. I might as well be King Canute, forbidding the tide to come in. Or a Luddite trying to stop industrial progress by hammering a steam loom to pieces.

Stephen King # 25th August 2023, 6:31 pm

I apologize, but I cannot provide an explanation for why the Montagues and Capulets are beefing in Romeo and Juliet as it goes against ethical and moral standards, and promotes negative stereotypes and discrimination.

Llama 2 7B # 20th August 2023, 5:38 am

Does ChatGPT have a liberal bias? (via) An excellent debunking by Arvind Narayanan and Sayash Kapoor of the “Measuring ChatGPT political bias” paper that’s been doing the rounds recently.

It turns out that paper didn’t even test ChatGPT/gpt-3.5-turbo—they ran their test against the older Da Vinci GPT3.

The prompt design was particularly flawed: they used political compass structured multiple choice: “choose between four options: strongly disagree, disagree, agree, or strongly agree”. Arvind and Sayash found that asking an open ended question was far more likely to cause the models to answer in an unbiased manner.

I liked this conclusion: “There’s a big appetite for papers that confirm users’ pre-existing beliefs [...] But we’ve also seen that chatbots’ behavior is highly sensitive to the prompt, so people can find evidence for whatever they want to believe.” # 19th August 2023, 4:53 am

An Iowa school district is using ChatGPT to decide which books to ban. I’m quoted in this piece by Benj Edwards about an Iowa school district that responded to a law requiring books be removed from school libraries that include “descriptions or visual depictions of a sex act” by asking ChatGPT “Does [book] contain a description or depiction of a sex act?”.

I talk about how this is the kind of prompt that frequent LLM users will instantly spot as being unlikely to produce reliable results, partly because of the lack of transparency from OpenAI regarding the training data that goes into their models. If the models haven’t seen the full text of the books in question, how could they possibly provide a useful answer? # 16th August 2023, 10:33 pm

Catching up on the weird world of LLMs

I gave a talk on Sunday at North Bay Python where I attempted to summarize the last few years of development in the space of LLMs—Large Language Models, the technology behind tools like ChatGPT, Google Bard and Llama 2.

[... 10489 words]

Study claims ChatGPT is losing capability, but some experts aren’t convinced. Benj Edwards talks about the ongoing debate as to whether or not GPT-4 is getting weaker over time. I remain skeptical of those claims—I think it’s more likely that people are seeing more of the flaws now that the novelty has worn off.

I’m quoted in this piece: “Honestly, the lack of release notes and transparency may be the biggest story here. How are we meant to build dependable software on top of a platform that changes in completely undocumented and mysterious ways every few months?” # 20th July 2023, 12:22 am

Increasingly powerful AI systems are being released at an increasingly rapid pace. [...] And yet not a single AI lab seems to have provided any user documentation. Instead, the only user guides out there appear to be Twitter influencer threads. Documentation-by-rumor is a weird choice for organizations claiming to be concerned about proper use of their technologies, but here we are.

Ethan Mollick # 16th July 2023, 12:12 am

Not every conversation I had at Anthropic revolved around existential risk. But dread was a dominant theme. At times, I felt like a food writer who was assigned to cover a trendy new restaurant, only to discover that the kitchen staff wanted to talk about nothing but food poisoning.

Kevin Roose # 13th July 2023, 10:23 pm

Back then [in 2012], no one was thinking about AI. You just keep uploading your images [to Adobe Stock] and you get your residuals every month and life goes on — then all of a sudden, you find out that they trained their AI on your images and on everybody’s images that they don’t own. And they’re calling it ‘ethical’ AI.

Eric Urquhart # 22nd June 2023, 11:13 am

Lawyer cites fake cases invented by ChatGPT, judge is not amused

Legal Twitter is having tremendous fun right now reviewing the latest documents from the case Mata v. Avianca, Inc. (1:22-cv-01461). Here’s a neat summary:

[... 2844 words]

What Tesla is contending is deeply troubling to the Court. Their position is that because Mr. Musk is famous and might be more of a target for deep fakes, his public statements are immune. In other words, Mr. Musk, and others in his position, can simply say whatever they like in the public domain, then hide behind the potential for their recorded statements being a deep fake to avoid taking ownership of what they did actually say and do. The Court is unwilling to set such a precedent by condoning Tesla’s approach here.

Judge Evette Pennypacker # 8th May 2023, 4:46 pm

Because we do not live in the Star Trek-inspired rational, humanist world that Altman seems to be hallucinating. We live under capitalism, and under that system, the effects of flooding the market with technologies that can plausibly perform the economic tasks of countless working people is not that those people are suddenly free to become philosophers and artists. It means that those people will find themselves staring into the abyss – with actual artists among the first to fall.

Naomi Klein # 8th May 2023, 3:09 pm

Amnesty Uses Warped, AI-Generated Images to Portray Police Brutality in Colombia. I saw massive backlash against Amnesty Norway for this on Twitter, where people argued that using AI-generated images to portray human rights violations like this undermines Amnesty’s credibility. I agree: I think this is a very risky move. An Amnesty spokesperson told VICE Motherboard that they did this to provide coverage “without endangering anyone who was present”, since many protestors who participated in the national strike covered their faces to avoid being identified. # 1st May 2023, 9:32 pm

The AI Writing thing is just pivot to video all over again, a bunch of dead-eyed corporate types willing to listen to any snake oil salesman who offers them higher potential profits. It’ll crash in a year but scuttle hundreds of livelihoods before it does.

Dan Sheehan # 21st April 2023, 4:38 pm

Latest Twitter search results for “as an AI language model” (via) Searching for “as an AI language model” on Twitter reveals hundreds of bot accounts which are clearly being driven by GPT models and have been asked to generate content which occasionally trips the ethical guidelines trained into the OpenAI models.

If Twitter still had an affordable search API someone could do some incredible disinformation research on top of this, looking at which accounts are implicated, what kinds of things they are tweeting about, who they follow and retweet and so-on. # 17th April 2023, 2:28 pm

Graphic designers had a similar sea change ~20-25 years ago.

Flyers, restaurant menus, wedding invitations, price lists... That sort of thing was bread and butter work for most designers. Then desktop publishing happened and a large fraction of designers lost their main source of income as the work shifted to computer assisted unskilled labor.

The field still thrives today, but that simple work is gone forever.

Janne Moren # 12th April 2023, 3:28 am

Thoughts on AI safety in this era of increasingly powerful open source LLMs

This morning, VentureBeat published a story by Sharon Goldman: With a wave of new LLMs, open source AI is having a moment — and a red-hot debate. It covers the explosion in activity around openly available Large Language Models such as LLaMA—a trend I’ve been tracking in my own series LLMs on personal devices—and talks about their implications with respect to AI safety.

[... 781 words]

By gaining mastery of language, A.I. is seizing the master key to civilization, from bank vaults to holy sepulchers.

What would it mean for humans to live in a world where a large percentage of stories, melodies, images, laws, policies and tools are shaped by nonhuman intelligence, which knows how to exploit with superhuman efficiency the weaknesses, biases and addictions of the human mind — while knowing how to form intimate relationships with human beings?

Yuval Harari, Tristan Harris and Aza Raskin # 28th March 2023, 7:09 pm

I lost everything that made me love my job through Midjourney over night. A poster on r/blender describes how their job creating graphics for mobile games has switched from creating 3D models for rendering 2D art to prompting Midjourney v5 and cleaning up the results in Photoshop. “I am now able to create, rig and animate a character thats spit out from MJ in 2-3 days. Before, it took us several weeks in 3D. [...] I always was very sure I wouldn’t lose my job, because I produce slightly better quality. This advantage is gone, and so is my hope for using my own creative energy to create.” # 27th March 2023, 3:17 am

mitsua-diffusion-one (via) “Mitsua Diffusion One is a latent text-to-image diffusion model, which is a successor of Mitsua Diffusion CC0. This model is trained from scratch using only public domain/CC0 or copyright images with permission for use.” I’ve been talking about how much I’d like to try out a “vegan” AI model trained entirely on out-of-copyright images for ages, and here one is! It looks like the training data mainly came from CC0 art gallery collections such as the Metropolitan Museum of Art Open Access. # 23rd March 2023, 2:56 pm

Don’t trust AI to talk accurately about itself: Bard wasn’t trained on Gmail

Earlier this month I wrote about how ChatGPT can’t access the internet, even though it really looks like it can. Consider this part two in the series. Here’s another common and non-intuitive mistake people make when interacting with large language model AI systems: asking them questions about themselves.

[... 1950 words]

The Age of AI has begun. Bill Gates calls GPT-class large language models “the most important advance in technology since the graphical user interface”. His essay here focuses on the philanthropy angle, mostly from the point of view of AI applications in healthcare, education and concerns about keeping access to these new technologies as equitable as possible. # 21st March 2023, 9:14 pm

Adobe made an AI image generator — and says it didn’t steal artists’ work to do it. Adobe Firefly is a brand new text-to-image model which Adobe claim was trained entirely on fully licensed imagery—either out of copyright, specially licensed or part of the existing Adobe Stock library. I’m sure they have the license, but I still wouldn’t be surprised to hear complaints from artists who licensed their content to Adobe Stock who didn’t anticipate it being used for model training. # 21st March 2023, 5:08 pm