Simon Willison’s Weblog

11 items tagged “audio”

2024

NotebookLM’s automatically generated podcasts are surprisingly effective

Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where two AI hosts start up a “deep dive” discussion about the collected content. These last around ten minutes and are very podcast, with an astonishingly convincing audio back-and-forth conversation.

[... 1,489 words]

2023

Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning. Andy Baio reports back on his investigations into the world of AI voice cloning.

This is no longer a niche interest. There’s a Discord with 500,000 members sharing tips and tricks on cloning celebrity voices in order to make their own cover songs, often built with Google Colab using models distributed through Hugging Face.

Andy then makes his own, playing with the concept “What if every Weird Al song was the original, and every other artist was covering his songs instead?”

I particularly enjoyed Madonna’s cover of “Like A Surgeon”, Lady Gaga’s “Perform This Way” and Lorde’s “Foil”.

# 2nd October 2023, 6:50 pm / audio, andy-baio, generative-ai, ai, huggingface

textra (via) Tiny (432KB) macOS binary CLI tool by Dylan Freedman which produces high quality text extraction from PDFs, images and even audio files using the VisionKit APIs in macOS 13 and higher. It handles handwriting too!

# 23rd March 2023, 9:08 pm / macosx, ocr, pdf, audio

2010

Audio Sprites (and fixes for iOS). Remy Sharp on the limitations of HTML5 audio support in iOS.

# 23rd December 2010, 8:04 pm / audio, html5, ios, remysharp, recovered

ZOMBO.com in HTML5. Uses SVG (scripted by JavaScript) and the audio element. Finally, Zombo.com comes to the iPad.

# 20th May 2010, 3:26 pm / audio, html5, ipad, svg, zombo, zombocom, recovered

Video on the Web—Dive Into HTML5. Everything a web developer needs to know about video containers, video codecs, audio containers, audio codecs, h.264, theora, vorbis, licensing, encoding, batch encoding and the html5 video element.

# 24th March 2010, 12:50 am / theora, h264, video, audio, html5, mark-pilgrim

HTML 5 audio player demo. Scott Andrew’s experiments with the HTML5 audio element (and jQuery): straightforward and works a treat in Safari, but Firefox doesn’t support MP3. Presumably it’s not too hard to set up a fallback for Ogg.

# 1st February 2010, 9:58 am / mp3, ogg, firefox, safari, html5, audio, scott-andrew, javascript, jquery
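
A minimal sketch of the Ogg fallback idea, not taken from Scott Andrew’s demo: given a list of candidate files, pick the first one the browser says it can play. The helper name pickSource, the AudioSource shape and the file names are all hypothetical; the only real browser API assumed is HTMLMediaElement.canPlayType, which returns "", "maybe" or "probably".

```typescript
// Hypothetical shape for a candidate audio file.
interface AudioSource {
  url: string;
  mimeType: string;
}

// Mirrors the contract of HTMLMediaElement.canPlayType:
// an empty string means "cannot play", anything else means "might play".
type CanPlayType = (mimeType: string) => string;

// Hypothetical helper: return the first source the player claims to support,
// or undefined if none of them are playable.
function pickSource(
  sources: AudioSource[],
  canPlayType: CanPlayType
): AudioSource | undefined {
  for (const source of sources) {
    if (canPlayType(source.mimeType) !== "") {
      return source;
    }
  }
  return undefined;
}

// In a browser this could be wired up roughly as follows (assumed file names):
//   const audio = new Audio();
//   const chosen = pickSource(
//     [
//       { url: "song.mp3", mimeType: "audio/mpeg" },
//       { url: "song.ogg", mimeType: "audio/ogg" },
//     ],
//     (type) => audio.canPlayType(type)
//   );
//   if (chosen) { audio.src = chosen.url; }
```

Listing MP3 first and Ogg second means a browser without MP3 support falls through to the Ogg file, while browsers that play MP3 never request the second file.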

2009

Codecs for <audio> and <video>. HTML 5 will not be requiring support for specific audio and video codecs—Ian Hickson explains why, in great detail. Short version: Apple won’t implement Theora due to lack of hardware support and an “uncertain patent landscape”, while open source browsers (Chromium and Mozilla) can’t support H.264 due to the cost of the licenses.

# 2nd July 2009, 10:16 am / h264, video, audio, html5, ian-hickson, theora, ogg, chromium, mozilla, google, patents, codecs

Firefox 3.5 for developers. It’s out today, and the feature list is huge. Highlights include HTML 5 drag ’n’ drop, audio and video elements, offline resources, downloadable fonts, text-shadow, CSS transforms with -moz-transform, localStorage, geolocation, web workers, trackpad swipe events, native JSON, cross-site HTTP requests, text API for canvas, defer attribute for the script element and TraceMonkey for better JS performance!

# 30th June 2009, 6:08 pm / firefox, html5, dragndrop, audio, video, offlineresources, fonts, textshadow, csstransforms, localstorage, geolocation, webworkers, json, crossdomain, canvas, tracemonkey, javascript, performance, browsers, mozilla, firefox35

2007

HTML5 Media Support in WebKit. WebKit continues to lead the pack when it comes to trying out new HTML5 proposals. The new audio and video elements make embedding media easy, and provide a neat listener API for hooking in to “playback ended” events.

# 12th November 2007, 11:21 pm / media, audio, events, html5, osx, safari, video, webkit, javascript
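
As a rough illustration of hooking in to a “playback ended” event: in today’s browsers the event is named "ended" and is registered with addEventListener on the audio or video element. The sketch below is an assumption about how that might look now, not the code from the WebKit post; MediaLike and whenPlaybackEnds are hypothetical names, and the structural type is only there so the example does not depend on DOM typings.

```typescript
// Minimal structural type covering the one method the sketch needs.
// A real audio or video element should satisfy it.
interface MediaLike {
  addEventListener(type: string, listener: () => void): void;
}

// Hypothetical helper: run a callback each time the media element
// fires its "ended" event.
function whenPlaybackEnds(media: MediaLike, callback: () => void): void {
  media.addEventListener("ended", callback);
}

// In a browser this could be used roughly as follows (assumed file name):
//   const audio = new Audio("clip.ogg");
//   whenPlaybackEnds(audio, () => console.log("playback ended"));
//   audio.play();
```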

Audio Fingerprinting for Clean Metadata. Last.fm have started using audio fingerprints to help clean up misspelled artists and duplicate track information.

# 13th September 2007, 5:46 pm / lastfm, audio, mp3, metadata, audiofingerprinting