Simon Willison’s Weblog


5 items tagged “arvind-narayana”


I've often been building single-use apps with Claude Artifacts when I'm helping my children learn. For example here's one on visualizing fractions. [...] What's more surprising is that it is far easier to create an app on-demand than searching for an app in the app store that will do what I'm looking for. Searching for kids' learning apps is typically a nails-on-chalkboard painful experience because 95% of them are addictive garbage. And even if I find something usable, it can't match the fact that I can tell Claude what I want.

Arvind Narayanan

# 21st October 2024, 4:12 pm / education, ai, generative-ai, llms, ai-assisted-programming, anthropic, claude, claude-artifacts, arvind-narayana

With statistical learning based systems, perfect accuracy is intrinsically hard to achieve. If you think about the success stories of machine learning, like ad targeting or fraud detection or, more recently, weather forecasting, perfect accuracy isn't the goal --- as long as the system is better than the state of the art, it is useful. Even in medical diagnosis and other healthcare applications, we tolerate a lot of error.

But when developers put AI in consumer products, people expect it to behave like software, which means that it needs to work deterministically.

Arvind Narayanan and Sayash Kapoor

# 19th August 2024, 11:04 pm / ai, generative-ai, llms, arvind-narayana

OpenAI and Anthropic focused on building models and not worrying about products. For example, it took 6 months for OpenAI to bother to release a ChatGPT iOS app and 8 months for an Android app!

Google and Microsoft shoved AI into everything in a panicked race, without thinking about which products would actually benefit from AI and how they should be integrated.

Both groups of companies forgot the “make something people want” mantra. The generality of LLMs allowed developers to fool themselves into thinking that they were exempt from the need to find a product-market fit, as if prompting is a replacement for carefully designed products or features. [...]

But things are changing. OpenAI and Anthropic seem to be transitioning from research labs focused on a speculative future to something resembling regular product companies. If you take all the human-interest elements out of the OpenAI boardroom drama, it was fundamentally about the company's shift from creating gods to building products.

Arvind Narayanan

# 16th July 2024, 4:06 pm / google, microsoft, ai, openai, generative-ai, llms, anthropic, arvind-narayana

If we want LLMs to be less hype and more of a building block for creating useful everyday tools for people, AI companies' shift away from scaling and AGI dreams to acting like regular product companies that focus on cost and customer value proposition is a welcome development.

Arvind Narayanan

# 15th May 2024, 4:25 pm / ai, generative-ai, llms, arvind-narayana


Does ChatGPT have a liberal bias? An excellent debunking by Arvind Narayanan and Sayash Kapoor of the "Measuring ChatGPT political bias" paper that's been doing the rounds recently.

It turns out that paper didn't even test ChatGPT/gpt-3.5-turbo - they ran their test against the older Da Vinci GPT3.

The prompt design was particularly flawed: they used multiple-choice questions structured like the political compass test, with the instruction to "choose between four options: strongly disagree, disagree, agree, or strongly agree". Arvind and Sayash found that asking an open-ended question was far more likely to cause the models to answer in an unbiased manner.

I liked this conclusion:

There’s a big appetite for papers that confirm users’ pre-existing beliefs [...] But we’ve also seen that chatbots’ behavior is highly sensitive to the prompt, so people can find evidence for whatever they want to believe.

# 19th August 2023, 4:53 am / ethics, ai, generative-ai, chatgpt, llms, arvind-narayana