214 items tagged “generativeai”
The Changelog podcast: LLMs break the internet
I’m the guest on the latest episode of The Changelog podcast: LLMs break the internet. It’s a follow-up to the episode we recorded six months ago about Stable Diffusion.[... 416 words]
The progress in AI has allowed things like taking down hate speech more efficiently—and this is due entirely to large language models. Because we have large language models [...] we can do a better job than we ever could in detecting hate speech in most languages in the world. That was impossible before.
— Yann LeCun # 7th April 2023, 7:32 pm
For example, if you prompt GPT-3 with “Mary had a,” it usually completes the sentence with “little lamb.” That’s because there are probably thousands of examples of “Mary had a little lamb” in GPT-3’s training data set, making it a sensible completion. But if you add more context in the prompt, such as “In the hospital, Mary had a,” the result will change and return words like “baby” or “series of tests.”
— Benj Edwards # 7th April 2023, 3:36 am
Why ChatGPT and Bing Chat are so good at making things up. I helped review this deep dive by Benj Edwards for Ars Technica into the hallucination/confabulation problem with ChatGPT and other LLMs, which is attracting increasing attention thanks to stories like the recent defamation complaints against ChatGPT. This article explains why this is happening and talks to various experts about potential solutions. # 7th April 2023, 3:33 am
[On AI-assisted programming] I feel like I got a small army of competent hackers to both do my bidding and to teach me as I go. It’s just pure delight and magic.
It’s riding a bike downhill and playing with legos and having a great coach and finishing a project all at once.
— Matt Bateman # 5th April 2023, 11:50 pm
Blinded by Analogies (via) Ethan Mollick discusses how many of the analogies we have for AI right now are hurting rather than helping our understanding, particularly with respect to LLMs. # 5th April 2023, 5 am
My guess is that MidJourney has been doing a massive-scale reinforcement learning from human feedback (“RLHF”)—possibly the largest ever for text-to-image.
When human users choose to upscale an image, it’s because they prefer it over the alternatives. It’d be a huge waste not to use this as a reward signal—cheap to collect, and *exactly* aligned with what your user base wants.
The more users you have, the better RLHF you can do. And then the more users you gain.
— Jim Fan # 5th April 2023, 4:45 am
More capable models can better recognize the specific circumstances under which they are trained. Because of this, they are more likely to learn to act as expected in precisely those circumstances while behaving competently but unexpectedly in others. This can surface in the form of problems that Perez et al. (2022) call sycophancy, where a model answers subjective questions in a way that flatters their user’s stated beliefs, and sandbagging, where models are more likely to endorse common misconceptions when their user appears to be less educated.
— Sam Bowman # 5th April 2023, 3:44 am
Eight Things to Know about Large Language Models (via) This unpublished paper by Samuel R. Bowman is succinct, readable and dense with valuable information to help understand the field of modern LLMs. # 5th April 2023, 3:36 am
Scaling laws allow us to precisely predict some coarse-but-useful measures of how capable future models will be as we scale them up along three dimensions: the amount of data they are fed, their size (measured in parameters), and the amount of computation used to train them (measured in FLOPs). [...] Our ability to make this kind of precise prediction is unusual in the history of software and unusual even in the history of modern AI research. It is also a powerful tool for driving investment since it allows R&D teams to propose model-training projects costing many millions of dollars, with reasonable confidence that these projects will succeed at producing economically valuable systems.
— Sam Bowman # 5th April 2023, 3:32 am
From Deep Learning Foundations to Stable Diffusion. Brand new free online video course from Jeremy Howard: 30 hours of content, covering everything you need to know to implement the Stable Diffusion image generation algorithm from scratch. I previewed parts of this course back in December and it was fascinating: this field is moving so fast that some of the lectures covered papers that had been released just a few days before. # 5th April 2023, 1:13 am
ROOTS search tool (via) BLOOM is one of the most interesting completely openly licensed language models. The ROOTS corpus is the training data that was collected for it, and this tool lets you run searches directly against that corpus. I tried searching for my own name and got an interesting insight into what it knows about me. # 3rd April 2023, 8:40 pm
Closed AI Models Make Bad Baselines (via) The NLP academic research community are facing a tough challenge: the state-of-the-art in large language models, GPT-4, is entirely closed which means papers that compare it to other models lack replicability and credibility. “We make the case that as far as research and scientific publications are concerned, the “closed” models (as defined below) cannot be meaningfully studied, and they should not become a “universal baseline”, the way BERT was for some time widely considered to be.”
Anna Rogers proposes a new rule for this kind of research: “That which is not open and reasonably reproducible cannot be considered a requisite baseline.” # 3rd April 2023, 7:57 pm
Beyond these specific legal arguments, Stability AI may find it has a “vibes” problem. The legal criteria for fair use are subjective and give judges some latitude in how to interpret them. And one factor that likely influences the thinking of judges is whether a defendant seems like a “good actor.” Google is a widely respected technology company that tends to win its copyright lawsuits. Edgier companies like Napster tend not to.
— Timothy B. Lee # 3rd April 2023, 3:38 pm
Stable Diffusion copyright lawsuits could be a legal earthquake for AI. Timothy B. Lee provides a thorough discussion of the copyright lawsuits currently targeting Stable Diffusion and GitHub Copilot, including subtle points about how the interpretation of “fair use” might be applied to the new field of generative AI. # 3rd April 2023, 3:34 pm
Think of language models like ChatGPT as a “calculator for words”
One of the most pervasive mistakes I see people using with large language model tools like ChatGPT is trying to use them as a search engine.[... 1066 words]
What AI can do for you on the Theory of Change podcast
Matthew Sheffield invited me on his show Theory of Change to talk about how AI models like ChatGPT, Bing and Bard work and practical applications of things you can do with them.[... 548 words]
Schillace Laws of Semantic AI (via) Principles for prompt engineering against large language models, developed by Microsoft’s Sam Schillace. # 30th March 2023, 12:20 am
gpt4all. Similar to Alpaca, here’s a project which takes the LLaMA base model and fine-tunes it on instruction examples generated by GPT-3—in this case, it’s 800,000 examples generated using the ChatGPT GPT 3.5 turbo model (Alpaca used 52,000 generated by regular GPT-3). This is currently the easiest way to get a LLaMA derived chatbot running on your own computer: the repo includes compiled binaries for running on M1/M2, Intel Mac, Windows and Linux and provides a link to download the 3.9GB 4-bit quantized model. # 29th March 2023, 6:03 pm
Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models (via) The latest example of an open source large language model you can run your own hardware. This one is particularly interesting because the entire thing is under the Apache 2 license. Cerebras are an AI hardware company offering a product with 850,000 cores—this release was trained on their hardware, presumably to demonstrate its capabilities. The model comes in seven sizes from 111 million to 13 billion parameters, and the smaller sizes can be tried directly on Hugging Face. # 28th March 2023, 10:05 pm
Announcing Open Flamingo (via) New from LAION: “OpenFlamingo is a framework that enables training and evaluation of large multimodal models (LMMs)”. Multimodal here means it can answer questions about images—their interactive demo includes tools for image captioning, animal recognition, counting objects and visual question answering. Theye’ve released the OpenFlamingo-9B model built on top of LLaMA 7B and CLIP ViT/L-14—the model checkpoint is a 5.24 GB download from Hugging Face, and is available under a non-commercial research license. # 28th March 2023, 9:59 pm
By gaining mastery of language, A.I. is seizing the master key to civilization, from bank vaults to holy sepulchers.
What would it mean for humans to live in a world where a large percentage of stories, melodies, images, laws, policies and tools are shaped by nonhuman intelligence, which knows how to exploit with superhuman efficiency the weaknesses, biases and addictions of the human mind — while knowing how to form intimate relationships with human beings?
— Yuval Harari, Tristan Harris and Aza Raskin # 28th March 2023, 7:09 pm
LLaMA voice chat, with Whisper and Siri TTS. llama.cpp author Georgi Gerganov has stitched together the LLaMA language model, the Whisper voice to text model (with his whisper.cpp library) and the macOS “say” command to create an entirely offline AI agent that he can talk to with his voice and that can speak replies straight back to him. # 27th March 2023, 9:06 pm
Every wave of technological innovation has been unleashed by something costly becoming cheap enough to waste. Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt. This technical debt is about to contract in a dramatic, economy-wide fashion as the cost and complexity of software production collapses, releasing a wave of innovation.
— Paul Kedrosky and Eric Norlin # 27th March 2023, 5:14 pm
AI-enhanced development makes me more ambitious with my projects
The thing I’m most excited about in our weird new AI-enhanced reality is the way it allows me to be more ambitious with my projects.[... 3327 words]
I think it’s likely that soon all computer users will have the ability to develop small software tools from scratch, and to describe modifications they’d like made to software they’re already using.
— Geoffrey Litt # 27th March 2023, 6:10 am
I lost everything that made me love my job through Midjourney over night. A poster on r/blender describes how their job creating graphics for mobile games has switched from creating 3D models for rendering 2D art to prompting Midjourney v5 and cleaning up the results in Photoshop. “I am now able to create, rig and animate a character thats spit out from MJ in 2-3 days. Before, it took us several weeks in 3D. [...] I always was very sure I wouldn’t lose my job, because I produce slightly better quality. This advantage is gone, and so is my hope for using my own creative energy to create.” # 27th March 2023, 3:17 am
scrapeghost (via) Scraping is a really interesting application for large language model tools like GPT3. James Turk’s scrapeghost is a very neatly designed entrant into this space—it’s a Python library and CLI tool that can be pointed at any URL and given a roughly defined schema (using a neat mini schema language) which will then use GPT3 to scrape the page and try to return the results in the supplied format. # 26th March 2023, 5:29 am
Hello Dolly: Democratizing the magic of ChatGPT with open models. A team at DataBricks applied the same fine-tuning data used by Stanford Alpaca against LLaMA to a much older model—EleutherAI’s GPT-J 6B, first released in May 2021. As with Alpaca, they found that instruction tuning took the raw model—which was extremely difficult to interact with—and turned it into something that felt a lot more like ChatGPT. It’s a shame they reused the license-encumbered 52,000 training samples from Alpaca, but I doubt it will be long before someone recreates a freely licensed alternative to that training set. # 24th March 2023, 5:05 pm
mitsua-diffusion-one (via) “Mitsua Diffusion One is a latent text-to-image diffusion model, which is a successor of Mitsua Diffusion CC0. This model is trained from scratch using only public domain/CC0 or copyright images with permission for use.” I’ve been talking about how much I’d like to try out a “vegan” AI model trained entirely on out-of-copyright images for ages, and here one is! It looks like the training data mainly came from CC0 art gallery collections such as the Metropolitan Museum of Art Open Access. # 23rd March 2023, 2:56 pm