Quotations
The reason current models are so large is that we're still being very wasteful during training - we're asking them to memorize the internet and, remarkably, they do: they can e.g. recite SHA hashes of common numbers, or recall really esoteric facts. (Actually, LLMs are really good at memorization, qualitatively a lot better than humans, sometimes needing just a single update to remember a lot of detail for a long time.) But imagine if you were going to be tested, closed book, on reciting arbitrary passages of the internet given the first few words. This is the standard (pre)training objective for models today. The reason doing better is hard is that demonstrations of thinking are "entangled" with knowledge in the training data.
Therefore, the models have to first get larger before they can get smaller, because we need their (automated) help to refactor and mold the training data into ideal, synthetic formats.
It's a staircase of improvement - one model helping to generate the training data for the next - until we're left with a "perfect training set". When you train GPT-2 on it, it will be a really strong / smart model by today's standards. Maybe its MMLU score will be a bit lower because it won't remember all of its chemistry perfectly.
Update, July 12: This innovation sparked a lot of conversation and questions that have no answers yet. We look forward to continuing to work with our customers on the responsible use of AI, but will not further pursue digital workers in the product.
— Lattice, HR platform
OpenAI and Anthropic focused on building models and not worrying about products. For example, it took 6 months for OpenAI to bother to release a ChatGPT iOS app and 8 months for an Android app!
Google and Microsoft shoved AI into everything in a panicked race, without thinking about which products would actually benefit from AI and how they should be integrated.
Both groups of companies forgot the “make something people want” mantra. The generality of LLMs allowed developers to fool themselves into thinking that they were exempt from the need to find a product-market fit, as if prompting is a replacement for carefully designed products or features. [...]
But things are changing. OpenAI and Anthropic seem to be transitioning from research labs focused on a speculative future to something resembling regular product companies. If you take all the human-interest elements out of the OpenAI boardroom drama, it was fundamentally about the company's shift from creating gods to building products.
We've doubled the max output token limit for Claude 3.5 Sonnet from 4096 to 8192 in the Anthropic API.
Just add the header "anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15" to your API calls.
So much of knowledge/intelligence involves translating ideas between fields (domains). Those domains are walls that keep ideas siloed. But LLMs can help break those walls down and encourage humans to do more interdisciplinary thinking, which may lead to faster discoveries.
And note that I am implying that humans will make the breakthroughs, using LLMs as translation tools when appropriate, to help make connections. LLMs are strongest as translators of information that you provide. BYOD: Bring your own data!
My architecture is a monolith written in Go (this is intentional, I sacrificed scalability to improve my shipping speed), and this is where SQLite shines. With a DB located on the local NVMe disk, a $5 VPS can deliver a whopping 60K reads and 20K writes per second.
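As a rough sketch of the kind of setup described (not the author's Go code; the table and pragmas here are illustrative), local-disk SQLite with WAL mode enabled:

```python
import sqlite3

# Illustrative local-disk SQLite setup. WAL mode allows concurrent readers
# alongside a single writer, which is part of how a small VPS can sustain
# throughput figures like those quoted above.
conn = sqlite3.connect("app.db")
conn.execute("PRAGMA journal_mode=WAL")    # readers don't block the writer
conn.execute("PRAGMA synchronous=NORMAL")  # common WAL pairing: fewer fsyncs
conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("greeting", "hello"))
conn.commit()
print(conn.execute("SELECT v FROM kv WHERE k = ?", ("greeting",)).fetchone())
```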
We respect wildlife in the wilderness because we’re in their house. We don’t fully understand the complexity of most ecosystems, so we seek to minimize our impact on those ecosystems since we can’t always predict what outcomes our interactions with nature might have.
In software, many disastrous mistakes stem from not understanding why a system was built the way it was, but changing it anyway. It’s super common for a new leader to come in, see something they deem “useless”, and get rid of it – without understanding the implications. Good leaders make sure they understand before they mess around.
Add tests in a commit before the fix. They should pass, showing the behavior before your change. Then, the commit with your change will update the tests. The diff between these commits represents the change in behavior. This helps the author test their tests (I've written tests thinking they covered the relevant case when they didn't), helps the reviewer see the change in behavior more precisely and comment on it, and helps the wider community understand what the PR description is about.
— Ed Page
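To make that workflow concrete, here is a hypothetical sketch (round_price and the rounding bug are invented for illustration) of how the test reads across the two commits:

```python
from decimal import Decimal, ROUND_HALF_UP

def round_price(x: str) -> Decimal:
    # Post-fix implementation: decimal half-up rounding to cents.
    return Decimal(x).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Commit 1 (tests only): this assertion passed against the old float-based
# code, pinning down the buggy behavior before any change:
#     assert round_price("2.675") == Decimal("2.67")
# Commit 2 (the fix): the test is updated in the same commit as the code,
# so the diff between the two commits shows the behavior change exactly.
def test_round_price():
    assert round_price("2.675") == Decimal("2.68")
```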
Third, X fails to provide access to its public data to researchers in line with the conditions set out in the DSA. In particular, X prohibits eligible researchers from independently accessing its public data, such as by scraping, as stated in its terms of service. In addition, X's process to grant eligible researchers access to its application programming interface (API) appears to dissuade researchers from carrying out their research projects or leave them with no other choice than to pay disproportionately high fees.
Fighting bots is fighting humans [...] remind you that "only allow humans to access" is just not an achievable goal. Any attempt at limiting bot access will inevitably allow some bots through and prevent some humans from accessing the site, and it's about deciding where you want to set the cutoff. I fear that media outlets and other websites, in attempting to "protect" their material from AI scrapers, will go too far in the anti-human direction.
[On Paddington 3] If this movie is anywhere near as good as the second one, we are going to need to have an extremely serious conversation about this being one of the greatest film trilogies ever made.
My main concern is that the substantial cost to develop and run AI technology means that AI applications must solve extremely complex and important problems for enterprises to earn an appropriate return on investment.
We estimate that the AI infrastructure buildout will cost over $1tn in the next several years alone, which includes spending on data centers, utilities, and applications. So, the crucial question is: What $1tn problem will AI solve? Replacing low-wage jobs with tremendously costly technology is basically the polar opposite of the prior technology transitions I've witnessed in my thirty years of closely following the tech industry.
— Jim Covello, Goldman Sachs
Yeah, unfortunately vision prompting has been a tough nut to crack. We've found it's very challenging to improve Claude's actual "vision" through just text prompts, but we can of course improve its reasoning and thought process once it extracts info from an image.
In general, I think vision is still in its early days, although 3.5 Sonnet is noticeably better than older models.
— Alex Albert, Anthropic
Content slop has three important characteristics. The first being that, to the user, the viewer, the customer, it feels worthless. This might be because it was clearly generated in bulk by a machine or because of how much of that particular content is being created. The next important feature of slop is that it feels forced upon us, whether by a corporation or an algorithm. It’s in the name. We’re the little piggies and it’s the gruel in the trough. But the last feature is the most crucial. It not only feels worthless and ubiquitous, it also feels optimized to be so. The Charli XCX “Brat summer” meme does not feel like slop, nor does Kendrick Lamar’s extremely long “Not Like Us” rollout. But Taylor Swift’s cascade of alternate versions of her songs does. The jury’s still out on Sabrina Carpenter. Similarly, last summer’s Barbenheimer phenomenon did not, to me, feel like slop. Dune: Part Two didn’t either. But Deadpool & Wolverine, at least in the marketing, definitely does.
Chrome's biggest innovation was the short release cycle with a silent unceremonious autoupdate.
When updates were big, rare, and manual, buggy and outdated browsers lingered for so long that we were giving bugs names. We documented the bugs in magazines and books, as if they were a timeless foundation of WebDev.
Nowadays browser vendors can fix bugs in 6 weeks (even Safari can…). New-ish stuff is still buggy, but rarely for long enough for the bugs to make it to schools' curriculums.
Inside the labs we have these capable models, and they're not that far ahead of what the public has access to for free. And that's a completely different trajectory for bringing technology into the world than what we've seen historically. It's a great opportunity because it brings people along. It gives them an intuitive sense for the capabilities and risks, and allows people to prepare for the advent of bringing advanced AI into the world.
Someone elsewhere left a comment like "I CAN'T BELIEVE IT TOOK HER 15 YEARS TO LEARN BASIC READLINE COMMANDS". Those comments are very silly, and I'm going to keep writing "it took me 15 years to learn this basic thing" forever, because I think it's important for people to know that it's normal to take a long time to learn "basic" things.
Voters in the Clapham and Brixton Hill constituency can rest easy - despite appearances, their Reform candidate Mark Matlock really does exist. [...] Matlock - based in the South Cotswolds, some 100 miles from the constituency in which he is standing - confirmed: "I am a real person." Although his campaign image is AI-generated, he said this was for lack of a real photo of him wearing a tie in Reform's trademark turquoise.
Product teams that are smart are getting off the treadmill. Whatever framework you currently have, start investing in getting to know it deeply. Learn the tools until they are not an impediment to your progress. That’s the only option. Replacing it with a shiny new tool is a trap. [...]
Companies that want to reduce the cost of their frontend tech becoming obsolete so often should be looking to get back to fundamentals. Your teams should be working closer to the web platform, with a lot fewer complex abstractions. We need to relearn what the web is capable of and go back to that.
The expansion of the jagged frontier of AI capability is subtle and requires a lot of experience with various models to understand what they can, and can’t, do. That is why I suggest that people and organizations keep an “impossibility list” - things that their experiments have shown that AI can definitely not do today but which it can almost do. For example, no AI can create a satisfying puzzle or mystery for you to solve, but they are getting closer. When AI models are updated, test them on your impossibility list to see if they can now do these impossible tasks.
If you own the tracks between San Francisco and Los Angeles, you likely have some kind of monopolistic pricing power, because there can only be so many tracks laid between place A and place B. In the case of GPU data centers, there is much less pricing power. GPU computing is increasingly turning into a commodity, metered per hour. Unlike the CPU cloud, which became an oligopoly, new entrants building dedicated AI clouds continue to flood the market. Without a monopoly or oligopoly, high fixed cost + low marginal cost businesses almost always see prices competed down to marginal cost (e.g., airlines).
So VisiCalc came and went, but the software genre it pioneered – the spreadsheet – endured to become arguably the most influential type of code ever written, at least in the sense of touching the lives of millions of office workers. I’ve never worked in an organisation in which spreadsheet software was not at the heart of most accounting, budgeting and planning activities. I’ve even known professionals for whom it’s the only piece of PC software they’ve ever used: one elderly accountant of my acquaintance, for example, used Excel even for his correspondence; he simply widened column A to 80 characters, typed his text in descending cells and hit the “print” key.
I like the lies-to-children motif, because it underlies the way we run our society and resonates nicely with Discworld. Like the reason for Unseen being a storehouse of knowledge - you arrive knowing everything and leave realising that you know practically nothing, therefore all the knowledge you had must be stored in the university. But it's like that in "real Science", too. You arrive with your sparkling A-levels all agleam, and the first job of the tutors is to reveal that what you thought was true is only true for a given value of "truth".
Most of us need just "enough" knowledge of the sciences, and it's delivered to us in metaphors and analogies that bite us in the bum if we think they're the same as the truth.
When presented with a difficult task, I ask myself: “what if I didn’t do this at all?”. Most of the time, this is a stupid question, and I have to do the thing. But ~5% of the time, I realize that I can completely skip some work.
Absolutely any time I try to explore something even slightly against commonly accepted beliefs, LLMs always just rehash the commonly accepted beliefs.
As a researcher, I find this behaviour worse than unhelpful. It gives the mistaken impression that there's nothing to explore.
We argued that ChatGPT is not designed to produce true utterances; rather, it is designed to produce text which is indistinguishable from the text produced by humans. It is aimed at being convincing rather than accurate. The basic architecture of these models reveals this: they are designed to come up with a likely continuation of a string of text. It’s reasonable to assume that one way of being a likely continuation of a text is by being true; if humans are roughly more accurate than chance, true sentences will be more likely than false ones. This might make the chatbot more accurate than chance, but it does not give the chatbot any intention to convey truths. This is similar to standard cases of human bullshitters, who don’t care whether their utterances are true; good bullshit often contains some degree of truth, that’s part of what makes it convincing.
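As a toy illustration of that "likely continuation" point (a bigram counter invented here, nothing like a real transformer), note that likelihood is a matter of frequency in the training data, not of truth:

```python
# Toy "likely continuation": pick the most frequent next word given the
# preceding word. Truth never enters the objective; true statements win
# only insofar as they happen to be common in the data.
from collections import Counter

corpus = "the sky is blue . the sky is blue . the sky is green .".split()
following = Counter(
    corpus[i + 1] for i in range(len(corpus) - 1) if corpus[i] == "is"
)
print(following.most_common(1))  # [('blue', 2)]: likelier, not truer
```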
What Apple unveiled last week with Apple Intelligence wasn't so much new products, but new features—a slew of them—for existing products, powered by generative AI.
[...] These aren't new apps or new products. They're the most used, most important apps Apple makes, the core apps that define the Apple platforms ecosystem, and Apple is using generative AI to make them better and more useful—without, in any way, rendering them unfamiliar.
For some reason, many people still believe that browsers need to include non-standard hacks in HTML parsing to display the web correctly.
In reality, the HTML parsing spec is exhaustively detailed. If you implement it as described, you will have a web-compatible parser.
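To see this in practice, here is a small sketch using html5lib, a Python implementation of the WHATWG parsing algorithm (the library choice is mine for illustration; any spec-conformant parser builds the same tree):

```python
import xml.etree.ElementTree as ET

import html5lib  # implements the WHATWG HTML parsing algorithm

# Even "tag soup" has a single, fully specified parse: the spec dictates
# exactly how unclosed and misnested tags are recovered, so no
# vendor-specific hacks are needed for web-compatible behavior.
soup = "<table><tr><td>cell<p>misnested <b>bold"
doc = html5lib.parse(soup, treebuilder="etree", namespaceHTMLElements=False)
print(ET.tostring(doc, encoding="unicode"))
```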
The people who are most confident AI can replace writers are the ones who think writing is typing.
In our “who validates the validators” user studies, we found that people expected, and also desired, the LLM to learn from any human interaction, and to do so “as efficiently as possible” (i.e. after 1-2 demonstrations, the LLM should “get it”).