Simon Willison’s Weblog

Subscribe

Quotations tagged ai

Filters: Type: quotation × ai × Sorted by date


Watching in real time as “slop” becomes a term of art. the way that “spam” became the term for unwanted emails, “slop” is going in the dictionary as the term for unwanted AI generated content

@deepfates # 7th May 2024, 3:59 pm

I believe these things:
1. If you use generative tools to produce or modify your images, you have abandoned photointegrity.
2. That’s not always wrong. Sometimes you need an image of a space battle or a Triceratops family or whatever.
3. What is always wrong is using this stuff without disclosing it.

Tim Bray # 4th May 2024, 4:26 pm

I used to have this singular focus on students writing code that they submit, and then I run test cases on the code to determine what their grade is. This is such a narrow view of what it means to be a software engineer, and I just felt that with generative AI, I’ve managed to overcome that restrictive view.

It’s an opportunity for me to assess their learning process of the whole software development [life cycle]—not just code. And I feel like my courses have opened up more and they’re much broader than they used to be. I can make students work on larger and more advanced projects.

Daniel Zingaro # 3rd May 2024, 6:17 pm

AI is the most anthropomorphized technology in history, starting with the name—intelligence—and plenty of other words thrown around the field: learning, neural, vision, attention, bias, hallucination. These references only make sense to us because they are hallmarks of being human. [...]

There is something kind of pathological going on here. One of the most exciting advances in computer science ever achieved, with so many promising uses, and we can’t think beyond the most obvious, least useful application? What, because we want to see ourselves in this technology? [...]

Anthropomorphizing AI not only misleads, but suggests we are on equal footing with, even subservient to, this technology, and there’s nothing we can do about it.

Zach Seward # 2nd May 2024, 7:44 pm

We collaborate with open-source and commercial model providers to bring their unreleased models to community for preview testing.

Model providers can test their unreleased models anonymously, meaning the models’ names will be anonymized. A model is considered unreleased if its weights are neither open, nor available via a public API or service.

LMSYS # 30th April 2024, 8:35 pm

The creator of a model can not ensure that a model is never used to do something harmful – any more so that the developer of a web browser, calculator, or word processor could. Placing liability on the creators of general purpose tools like these mean that, in practice, such tools can not be created at all, except by big businesses with well funded legal teams.

[...]

Instead of regulating the development of AI models, the focus should be on regulating their applications, particularly those that pose high risks to public safety and security. Regulate the use of AI in high-risk areas such as healthcare, criminal justice, and critical infrastructure, where the potential for harm is greatest, would ensure accountability for harmful use, whilst allowing for the continued advancement of AI technology.

Jeremy Howard # 29th April 2024, 4:04 pm

I’ve worked out why I don’t get much value out of LLMs. The hardest and most time-consuming parts of my job involve distinguishing between ideas that are correct, and ideas that are plausible-sounding but wrong. Current AI is great at the latter type of ideas, and I don’t need more of those.

Martin Kleppmann # 27th April 2024, 7:31 pm

It’s very fast to build something that’s 90% of a solution. The problem is that the last 10% of building something is usually the hard part which really matters, and with a black box at the center of the product, it feels much more difficult to me to nail that remaining 10%. With vibecheck, most of the time the results to my queries are great; some percentage of the time they aren’t. Closing that gap with gen AI feels much more fickle to me than a normal engineering problem. It could be that I’m unfamiliar with it, but I also wonder if some classes of generative AI based products are just doomed to mediocrity as a result.

Moxie Marlinspike # 26th April 2024, 9:40 pm

When I said “Send a text message to Julian Chokkattu,” who’s a friend and fellow AI Pin reviewer over at Wired, I thought I’d be asked what I wanted to tell him. Instead, the device simply said OK and told me it sent the words “Hey Julian, just checking in. How’s your day going?” to Chokkattu. I’ve never said anything like that to him in our years of friendship, but I guess technically the AI Pin did do what I asked.

Cherlynn Low # 24th April 2024, 3:07 pm

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Phi-3 Technical Report # 23rd April 2024, 3 am

I have a child who is also 2e and has been part of the NYC G&T program. We’ve had a positive experience with the citywide program, specifically with the program at The Anderson School.

Meta AI bot, answering a question on a forum # 18th April 2024, 3:34 am

In mid-March, we added this line to our system prompt to prevent Claude from thinking it can open URLs:

“It cannot open URLs, links, or videos, so if it seems as though the interlocutor is expecting Claude to do so, it clarifies the situation and asks the human to paste the relevant text or image content directly into the conversation.”

Alex Albert (Anthropic) # 18th April 2024, 12:22 am

But the reality is that you can’t build a hundred-billion-dollar industry around a technology that’s kind of useful, mostly in mundane ways, and that boasts perhaps small increases in productivity if and only if the people who use it fully understand its limitations.

Molly White # 17th April 2024, 7:53 pm

The saddest part about it, though, is that the garbage books don’t actually make that much money either. It’s even possible to lose money generating your low-quality ebook to sell on Kindle for $0.99. The way people make money these days is by teaching students the process of making a garbage ebook. It’s grift and garbage all the way down — and the people who ultimately lose out are the readers and writers who love books.

Constance Grady # 16th April 2024, 11:31 pm

[On complaints about Claude 3 reduction in quality since launch] The model is stored in a static file and loaded, continuously, across 10s of thousands of identical servers each of which serve each instance of the Claude model. The model file never changes and is immutable once loaded; every shard is loading the same model file running exactly the same software. We haven’t changed the temperature either. We don’t see anywhere where drift could happen. The files are exactly the same as at launch and loaded each time from a frozen pristine copy.

Jason D. Clinton, Anthropic # 15th April 2024, 1:27 am

The language issues are indicative of the bigger problem facing the AI Pin, ChatGPT, and frankly, every other AI product out there: you can’t see how it works, so it’s impossible to figure out how to use it. [...] our phones are constant feedback machines — colored buttons telling us what to tap, instant activity every time we touch or pinch or scroll. You can see your options and what happens when you pick one. With AI, you don’t get any of that. Using the AI Pin feels like wishing on a star: you just close your eyes and hope for the best. Most of the time, nothing happens.

David Pierce # 12th April 2024, 12:39 pm

[on GitHub Copilot] It’s like insisting to walk when you can take a bike. It gets the hard things wrong but all the easy things right, very helpful and much faster. You have to learn what it can and can’t do.

Andrej Karpathy # 11th April 2024, 1:27 am

The challenge [with RAG] is that most corner-cutting solutions look like they’re working on small datasets while letting you pretend that things like search relevance don’t matter, while in reality relevance significantly impacts quality of responses when you move beyond prototyping (whether they’re literally search relevance or are better tuned SQL queries to retrieve more appropriate rows). This creates a false expectation of how the prototype will translate into a production capability, with all the predictable consequences: underestimating timelines, poor production behavior/performance, etc.

Will Larson # 10th April 2024, 11:09 pm

LLMs are like a trained circus bear that can make you porridge in your kitchen. It’s a miracle that it’s able to do it at all, but watch out because no matter how well they can act like a human on some tasks, they’re still a wild animal. They might ransack your kitchen, and they could kill you, accidentally or intentionally!

Alex Komoroske # 2nd April 2024, 3:19 pm

No one wants to build a product on a model that makes things up. The core problem is that GenAI models are not information retrieval systems. They are synthesizing systems, with no ability to discern from the data it’s trained on unless significant guardrails are put in place.

Rumman Chowdhury # 31st March 2024, 9:20 pm

At this point, I’m confident saying that 75% of what generative-AI text and image platforms can do is useless at best and, at worst, actively harmful. Which means that if AI companies want to onboard the millions of people they need as customers to fund themselves and bring about the great AI revolution, they’ll have to perpetually outrun the millions of pathetic losers hoping to use this tech to make a quick buck. Which is something crypto has never been able to do.

In fact, we may have already reached a point where AI images have become synonymous with scams and fraud.

Ryan Broderick # 21st March 2024, 9:49 pm

It’s hard to overstate the value of LLM support when coding for fun in an unfamiliar language. [...] This example is totally trivial in hindsight, but might have taken me a couple mins to figure out otherwise. This is a bigger deal than it seems! Papercuts add up fast and prevent flow. (A lot of being a senior engineer is just being proficient enough to avoid papercuts).

Geoffrey Litt # 18th March 2024, 6:16 pm

One year since GPT-4 release. Hope you all enjoyed some time to relax; it’ll have been the slowest 12 months of AI progress for quite some time to come.

Leopold Aschenbrenner, OpenAI # 16th March 2024, 3:23 pm

The talk track I’ve been using is that LLMs are easy to take to market, but hard to keep in the market long-term. All the hard stuff comes when you move past the demo and get exposure to real users.

And that’s where you find that all the nice little things you got neatly working fall apart. And you need to prompt differently, do different retrieval, consider fine-tuning, redesign interaction, etc. People will treat this stuff differently from “normal” products, creating unique challenges.

Phillip Carter # 13th March 2024, 3:02 pm

In every group I speak to, from business executives to scientists, including a group of very accomplished people in Silicon Valley last night, much less than 20% of the crowd has even tried a GPT-4 class model.

Less than 5% has spent the required 10 hours to know how they tick.

Ethan Mollick # 9th March 2024, 3:55 am

On the zombie edition of the Washington Independent I discovered, the piece I had published more than ten years before was attributed to someone else. Someone unlikely to have ever existed, and whose byline graced an article it had absolutely never written.

[...] Washingtonindependent.com, which I’m using to distinguish it from its namesake, offers recently published, article-like content that does not appear to me to have been produced by human beings. But, if you dig through its news archive, you can find work human beings definitely did produce. I know this because I was one of them.

Spencer Ackerman # 7th March 2024, 2:59 am

If a hard takeoff occurs, and a safe AI is harder to build than an unsafe one, then by opensourcing everything, we make it easy for someone unscrupulous with access to overwhelming amount of hardware to build an unsafe AI, which will experience a hard takeoff.

As we get closer to building AI, it will make sense to start being less open. The Open in OpenAI means that everyone should benefit from the fruits of AI after its built, but it’s totally OK to not share the science (even though sharing everything is definitely the right strategy in the short and possibly medium term for recruitment purposes).

Ilya Sutskever # 6th March 2024, 3:02 am

For the last few years, Meta has had a team of attorneys dedicated to policing unauthorized forms of scraping and data collection on Meta platforms. The decision not to further pursue these claims seems as close to waving the white flag as you can get against these kinds of companies. But why? [...]

In short, I think Meta cares more about access to large volumes of data and AI than it does about outsiders scraping their public data now. My hunch is that they know that any success in anti-scraping cases can be thrown back at them in their own attempts to build AI training databases and LLMs. And they care more about the latter than the former.

Kieran McCarthy # 28th February 2024, 3:15 pm

One consideration is that such a deep ML system could well be developed outside of Google-- at Microsoft, Baidu, Yandex, Amazon, Apple, or even a startup. My impression is that the Translate team experienced this. Deep ML reset the translation game; past advantages were sort of wiped out. Fortunately, Google’s huge investment in deep ML largely paid off, and we excelled in this new game. Nevertheless, our new ML-based translator was still beaten on benchmarks by a small startup. The risk that Google could similarly be beaten in relevance by another company is highlighted by a startling conclusion from BERT: huge amounts of user feedback can be largely replaced by unsupervised learning from raw text. That could have heavy implications for Google.

Eric Lehman, internal Google email in 2018 # 11th February 2024, 10:59 pm

Reality is that LLMs are not AGI -- they’re a big curve fit to a very large dataset. They work via memorization and interpolation. But that interpolative curve can be tremendously useful, if you want to automate a known task that’s a match for its training data distribution.

Memorization works, as long as you don’t need to adapt to novelty. You don’t *need* intelligence to achieve usefulness across a set of known, fixed scenarios.

François Chollet # 10th February 2024, 6:39 am