Anthropic: Message Batches (beta) (via) Anthropic now have a batch mode, allowing you to send prompts to Claude in batches which will be processed within 24 hours (though probably much faster than that) and come at a 50% price discount.
This matches the batch models offered by OpenAI and by Google Gemini, both of which also provide a 50% discount.
Update 15th October 2024: Alex Albert confirms that Anthropic batching and prompt caching can be combined:
Don't know if folks have realized yet that you can get close to a 95% discount on Claude 3.5 Sonnet tokens when you combine prompt caching with the new Batches API
Recent articles
- LLM 0.22, the annotated release notes - 17th February 2025
- Run LLMs on macOS using llm-mlx and Apple's MLX framework - 15th February 2025
- URL-addressable Pyodide Python environments - 13th February 2025