Simon Willison’s Weblog

Subscribe

Items tagged embeddings, generativeai in Oct, 2023

Filters: Year: 2023 × Month: Oct × embeddings × generativeai × Sorted by date


Embeddings: What they are and why they matter

Embeddings are a really neat trick that often come wrapped in a pile of intimidating jargon.

[... 5835 words]

Bottleneck T5 Text Autoencoder (via) Colab notebook by Linus Lee demonstrating his Contra Bottleneck T5 embedding model, which can take up to 512 tokens of text, convert that into a 1024 floating point number embedding vector... and then then reconstruct the original text (or a close imitation) from the embedding again.

This allows for some fascinating tricks, where you can do things like generate embeddings for two completely different sentences and then reconstruct a new sentence that combines the weights from both. # 10th October 2023, 2:12 am

Types

Years

Months

Tags