Embedding
An embedding converts words, sentences, or documents into a list of numbers that captures their meaning. Similar concepts end up with similar numbers. “dog” and “puppy” would have nearly identical embeddings, while “dog” and “spreadsheet” would be very different. This lets computers compare the meaning of text mathematically: search for similar documents, cluster related topics, or find the most relevant passage to answer a question.
An embedding is a dense vector representation of data (text, images, audio) in a continuous vector space where geometric proximity corresponds to semantic similarity. A text embedding model maps a string to a fixed-length array of floating-point numbers (typically 256-3072 dimensions).
Properties:
- Semantic similarity: texts with similar meaning have high cosine similarity between their embeddings
- Dense: every dimension carries information (vs. sparse representations like bag-of-words where most values are zero; contrasted in the sketch after this list)
- Fixed-size: regardless of input length, the output vector has the same dimensionality
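A quick illustration of the dense and fixed-size points, using made-up values (the 10,000-word vocabulary is invented, and the 384-dimensional random vector stands in for a real model output such as all-MiniLM-L6-v2's):

```python
import numpy as np

# Sparse bag-of-words over a hypothetical 10,000-word vocabulary:
# a short sentence touches only a handful of entries.
vocab_size = 10_000
bow = np.zeros(vocab_size)
bow[[17, 342, 5820]] = 1           # only the words that actually occur
print(np.count_nonzero(bow))       # 3 non-zero entries out of 10,000

# Dense embedding of the same sentence (illustrative random values):
# every dimension holds a non-zero value, and the length is fixed
# no matter how long the input text is.
embedding = np.random.default_rng(0).normal(size=384)
print(np.count_nonzero(embedding)) # 384 of 384 dimensions carry information
```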
Embedding models:
| Model | Dimensions | Provider |
|---|---|---|
| text-embedding-3-small | 1536 | OpenAI |
| text-embedding-3-large | 3072 | OpenAI |
| embed-v4 | 1024 | Cohere |
| BGE-large-en-v1.5 | 1024 | BAAI (open-source) |
| all-MiniLM-L6-v2 | 384 | Sentence Transformers (open-source) |
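The open-source models in the table run locally. A minimal sketch with the sentence-transformers library (assumes `pip install sentence-transformers`; the model is downloaded on first use):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

embeddings = model.encode(
    ["How do I configure a VLAN?", "Setting up virtual LANs on a switch"],
    normalize_embeddings=True,  # unit-length vectors: dot product == cosine
)
print(embeddings.shape)               # (2, 384)
print(embeddings[0] @ embeddings[1])  # cosine similarity of the two texts
```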
Similarity metrics (compared in the sketch after this list):
- Cosine similarity: the cosine of the angle between two vectors. Range: -1 to 1 (1 = same direction, i.e. near-identical meaning).
- Dot product: cosine similarity scaled by the two vectors' magnitudes; identical to cosine similarity when the vectors are unit-normalized.
- Euclidean distance: straight-line distance between the vector endpoints (smaller = more similar).
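A minimal NumPy sketch of the three metrics (vector values invented for illustration):

```python
import numpy as np

a = np.array([0.3, 0.8, 0.5])
b = np.array([0.25, 0.9, 0.4])

print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # cosine
print(np.dot(a, b))                                            # dot product
print(np.linalg.norm(a - b))                                   # Euclidean

# After unit-normalization, dot product equals cosine similarity, and
# Euclidean distance is a monotone function of it (||a-b||^2 = 2 - 2*cos),
# so all three metrics rank nearest neighbors identically.
a_n, b_n = a / np.linalg.norm(a), b / np.linalg.norm(b)
print(np.dot(a_n, b_n))
```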
Applications:
- Semantic search: find documents by meaning, not just keyword match
- RAG: retrieve relevant context for LLM generation
- Clustering: group similar documents, tickets, or customer feedback (see the sketch after this list)
- Recommendation: “users who liked X also liked Y”
- Anomaly detection: flag inputs that are far from any known cluster
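A sketch of the clustering use case: embed the texts, then run any standard clustering algorithm on the vectors. The ticket texts are invented, and scikit-learn's KMeans stands in for whatever clusterer you prefer:

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

tickets = [
    "VPN drops every few minutes",
    "Cannot connect to VPN from home",
    "Printer on floor 3 is jammed",
    "Printer queue stuck, jobs not printing",
]

# Tickets about the same underlying issue land in the same cluster.
vectors = SentenceTransformer("all-MiniLM-L6-v2").encode(tickets)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)  # e.g. [0 0 1 1]: VPN tickets vs. printer tickets
```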
Creating and comparing embeddings
```python
from openai import OpenAI
import numpy as np

client = OpenAI()

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare semantic similarity
e1 = embed("How do I configure a VLAN?")
e2 = embed("Setting up virtual LANs on a switch")
e3 = embed("Best chocolate cake recipe")

print(cosine_similarity(e1, e2))  # ~0.89 (very similar)
print(cosine_similarity(e1, e3))  # ~0.12 (unrelated)
```

Embeddings are the foundation of modern search and recommendation systems. Every RAG pipeline starts by embedding documents into a vector database. Google Search uses embeddings to understand query intent beyond keyword matching. GitHub Copilot uses code embeddings to find relevant code snippets. In IT operations, embeddings power intelligent log search (“find errors similar to this one”), documentation search, and ticket routing. The choice of embedding model significantly impacts quality: domain-specific models (trained on code, medical text, or legal documents) outperform general-purpose models for specialized tasks. Running embedding models locally (Sentence Transformers, Ollama) is practical on consumer hardware, making privacy-preserving semantic search accessible.
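A minimal retrieval sketch in that RAG spirit, reusing the `embed` and `cosine_similarity` helpers above. The documents are invented, and a real pipeline would store the vectors in a vector database rather than a Python list:

```python
docs = [
    "VLAN configuration guide for managed switches",
    "Resetting a forgotten admin password",
    "Troubleshooting DHCP lease exhaustion",
]
doc_vectors = [embed(d) for d in docs]  # in practice: precomputed and stored

def search(query: str, top_k: int = 2) -> list[tuple[float, str]]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    scored = sorted(
        ((cosine_similarity(q, v), d) for v, d in zip(doc_vectors, docs)),
        reverse=True,
    )
    return scored[:top_k]

for score, doc in search("How do I set up a VLAN?"):
    print(f"{score:.2f}  {doc}")
```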