What is an Embedding?
A way to represent text, images, or anything else as numerical vectors so machines can compare meaning.
An embedding is a way to convert something (text, image, audio) into an array of numbers (a vector) such that things with similar meaning end up with similar vectors.
Intuition
Imagine embedding words into a 3D space (real embeddings typically have hundreds to a few thousand dimensions):
"dog" → [0.8, 0.2, 0.1]
"cat" → [0.7, 0.3, 0.2] ← close to "dog" (both pets)
"car" → [0.1, 0.9, 0.5] ← far from "dog" (different topic)
The distance between two vectors approximates semantic distance.
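To make that concrete, here is a minimal Python sketch that scores the toy vectors above with cosine similarity, the metric most embedding systems use in practice:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: ~1.0 means very similar, ~0.0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dog = np.array([0.8, 0.2, 0.1])
cat = np.array([0.7, 0.3, 0.2])
car = np.array([0.1, 0.9, 0.5])

print(cosine_similarity(dog, cat))  # ~0.97 -> close, both pets
print(cosine_similarity(dog, car))  # ~0.35 -> far, different topic
```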
What embeddings are used for
1. RAG (Retrieval-Augmented Generation)
You have 1000 pages of docs. You can’t fit them all into a prompt.
- Embed each chunk → store in a vector database
- Embed the question → find chunks with closest vectors
- Pass those chunks to the LLM → get an accurate answer
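A minimal end-to-end sketch of those three steps, using an in-memory numpy search in place of a real vector database. The model names are illustrative choices, and the code assumes the `openai` package with an API key in the environment:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> np.ndarray:
    """Return one embedding vector per input string."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Embed each chunk and keep the vectors in memory (a stand-in for a vector DB)
chunks = [
    "Refunds are processed within 14 days of the return arriving.",
    "Standard shipping to the EU takes 3-5 business days.",
    "Premium accounts include priority email support.",
]
chunk_vectors = embed(chunks)

# 2. Embed the question and find the closest chunk
#    (these embeddings are unit-length, so a dot product acts as cosine similarity)
question = "How long do refunds take?"
q_vector = embed([question])[0]
best_chunk = chunks[int(np.argmax(chunk_vectors @ q_vector))]

# 3. Pass the retrieved chunk to the LLM so the answer is grounded in the docs
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": f"Answer using this context:\n{best_chunk}\n\nQuestion: {question}"}],
)
print(answer.choices[0].message.content)
```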
2. Semantic search
Traditional search matches keywords. Embedding search matches meaning:
- Query “ways to lose weight” also surfaces “fat-burning techniques”
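A small sketch of the difference, using the open-source sentence-transformers library (the model name is one common choice, not a requirement):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-source embedding model

query = "ways to lose weight"
docs = [
    "fat-burning techniques",       # zero keyword overlap, related meaning
    "how to change a car tire",     # unrelated topic
]

# Cosine similarity between the query vector and each document vector
scores = util.cos_sim(model.encode(query), model.encode(docs))
print(scores)  # the first document scores higher despite sharing no keywords
```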
3. Clustering & classification
Embed all customer feedback, group nearby vectors → discover common complaint themes.
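A sketch of theme discovery with scikit-learn's KMeans, reusing the `embed()` helper from the RAG sketch above (the feedback strings and cluster count are made up for illustration):

```python
from sklearn.cluster import KMeans

feedback = [
    "App crashes when I upload a photo",
    "Uploading images makes the app freeze",
    "Please add a dark mode",
    "A dark theme would be great",
    "Shipping took three weeks",
]

vectors = embed(feedback)                                  # shape: (n_texts, n_dims)
labels = KMeans(n_clusters=3, n_init=10).fit_predict(vectors)

for cluster in range(3):
    members = [text for text, label in zip(feedback, labels) if label == cluster]
    print(f"Theme {cluster}:", members)
```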
4. Recommendations
Products with vectors close to ones a user already bought → suggest those.
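One simple version of this, sketched below: average the vectors of a user's past purchases into a "taste" vector and rank the catalogue by similarity to it. Here `product_vectors` and `purchased_ids` are assumed to be precomputed:

```python
import numpy as np

def recommend(product_vectors: np.ndarray, purchased_ids: list[int], k: int = 5) -> list[int]:
    """Return indices of the k products most similar to the user's purchase history.
    Assumes product_vectors are L2-normalized, so dot product acts as cosine similarity."""
    profile = product_vectors[purchased_ids].mean(axis=0)   # the user's "taste" vector
    scores = product_vectors @ profile                       # similarity to that taste
    scores[purchased_ids] = -np.inf                          # never re-suggest owned items
    return np.argsort(scores)[::-1][:k].tolist()             # best matches first
```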
Popular embedding models (2026)
| Provider | Model | Dimensions | Price / 1M tokens |
|---|---|---|---|
| OpenAI | text-embedding-3-large | 3072 | $0.13 |
| OpenAI | text-embedding-3-small | 1536 | $0.02 |
| Voyage AI | voyage-3 | 1024 | $0.06 |
| Cohere | embed-v3 | 1024 | $0.10 |
| Open source | bge-m3 | 1024 | Free (self-host) |
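The dimension counts above are the models' defaults. OpenAI's text-embedding-3 models also accept a `dimensions` parameter that returns shorter vectors, which are cheaper to store and search; a quick sketch, assuming the `openai` package is set up as before:

```python
from openai import OpenAI

client = OpenAI()
resp = client.embeddings.create(
    model="text-embedding-3-large",
    input="hello world",
    dimensions=256,                     # request a shorter vector than the default 3072
)
print(len(resp.data[0].embedding))      # 256
```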
Multimodal embeddings
Models like CLIP embed both images and text into the same vector space — letting you search images by text:
- Query “a golden retriever running on a beach” → find matches across millions of photos.
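A sketch of scoring one image against a text query with the Hugging Face transformers implementation of CLIP (the checkpoint name and `photo.jpg` path are placeholders):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")                     # placeholder path
text = ["a golden retriever running on a beach"]

inputs = processor(text=text, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Both embeddings live in the same space and are L2-normalized,
# so their dot product is the cosine similarity of image and text.
score = (outputs.image_embeds @ outputs.text_embeds.T).item()
print(score)  # higher means a better match; rank a photo library by this score
```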