Simon Willison’s Weblog

23rd October 2024

TIL: Running prompts against images, PDFs, audio and video with Google Gemini. I'm still working towards adding multi-modal support to my [LLM](https://llm.datasette.io/) tool. In the meantime, here are notes on running prompts against images, PDFs, audio and video files from the command line using the [Google Gemini](https://ai.google.dev/gemini-api) family of models.

This is a beat by Simon Willison, posted on 23rd October 2024.
