
Embedding

Plain English

An embedding converts words, sentences, or documents into a list of numbers that captures their meaning. Similar concepts end up with similar numbers. “dog” and “puppy” would have nearly identical embeddings, while “dog” and “spreadsheet” would be very different. This lets computers compare the meaning of text mathematically: search for similar documents, cluster related topics, or find the most relevant passage to answer a question.

Technical Definition

An embedding is a dense vector representation of data (text, images, audio) in a continuous vector space where geometric proximity corresponds to semantic similarity. A text embedding model maps a string to a fixed-length array of floating-point numbers (typically 256-3072 dimensions).

Properties:

  • Semantic similarity: texts with similar meaning have high cosine similarity between their embeddings
  • Dense: every dimension carries information (vs. sparse representations like bag-of-words where most values are zero)
  • Fixed-size: regardless of input length, the output vector has the same dimensionality
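The dense-vs-sparse contrast and the fixed-size property can be illustrated without any model. A rough sketch (the vocabulary and the dense vector values here are made up purely for illustration):

```python
# Sparse bag-of-words: one dimension per vocabulary word, mostly zeros.
vocabulary = ["dog", "puppy", "spreadsheet", "cat", "table"]

def bag_of_words(text: str) -> list[int]:
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]

sparse = bag_of_words("the dog chased the dog")
# Only the "dog" dimension is non-zero: [2, 0, 0, 0, 0]

# Dense embedding (toy values): every dimension carries some signal,
# and the output length is fixed regardless of input length.
dense_short = [0.12, -0.48, 0.33, 0.91]   # stand-in for a short text's embedding
dense_long  = [0.10, -0.50, 0.35, 0.88]   # stand-in for a long text's embedding
assert len(dense_short) == len(dense_long)  # fixed-size property
```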

Embedding models:

  Model                    Dimensions  Provider
  text-embedding-3-small   1536        OpenAI
  text-embedding-3-large   3072        OpenAI
  embed-v4                 1024        Cohere
  BGE-large-en-v1.5        1024        BAAI (open-source)
  all-MiniLM-L6-v2         384         Sentence Transformers (open-source)

Similarity metrics:

  • Cosine similarity: measures the angle between two vectors. Range: -1 to 1 (1 = identical meaning).
  • Dot product: like cosine similarity but sensitive to vector magnitude; for vectors normalized to unit length, the two are equal.
  • Euclidean distance: straight-line distance between vectors.
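All three metrics can be computed directly with NumPy. A quick numerical sketch (toy 3-dimensional vectors, not real embeddings) showing that cosine similarity ignores magnitude while dot product and Euclidean distance do not:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 1.0, 2.0])

cos = cosine_similarity(a, b)        # angle only: 8/9 ~ 0.889
dot = float(np.dot(a, b))            # magnitude-sensitive: 8.0
dist = float(np.linalg.norm(a - b))  # straight-line distance: sqrt(2)

# Scaling a vector changes dot product and distance, but not cosine:
assert abs(cosine_similarity(10 * a, b) - cos) < 1e-9

# For unit-normalized vectors, cosine similarity equals the dot product:
a_hat, b_hat = a / np.linalg.norm(a), b / np.linalg.norm(b)
assert abs(float(np.dot(a_hat, b_hat)) - cos) < 1e-9
```

This is why many vector databases store unit-normalized embeddings: the cheaper dot product then gives the same ranking as cosine similarity.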

Applications:

  • Semantic search: find documents by meaning, not just keyword match
  • RAG: retrieve relevant context for LLM generation
  • Clustering: group similar documents, tickets, or customer feedback
  • Recommendation: “users who liked X also liked Y”
  • Anomaly detection: flag inputs that are far from any known cluster
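Semantic search and RAG share the same core retrieval step: embed the query, then rank stored document vectors by cosine similarity. A minimal sketch with made-up 4-dimensional vectors standing in for real embeddings (document names and values are hypothetical):

```python
import numpy as np

# Hypothetical pre-computed document embeddings (toy 4-d vectors).
documents = {
    "vlan-guide":   np.array([0.9, 0.1, 0.0, 0.2]),
    "cake-recipe":  np.array([0.0, 0.8, 0.6, 0.1]),
    "switch-setup": np.array([0.8, 0.2, 0.1, 0.3]),
}

def top_k(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the k document names most similar to the query vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(documents, key=lambda name: cosine(query_vec, documents[name]),
                    reverse=True)
    return ranked[:k]

query = np.array([0.85, 0.15, 0.05, 0.25])  # stand-in for an embedded query
print(top_k(query))  # ['vlan-guide', 'switch-setup'] — networking docs outrank the recipe
```

A production system replaces the dictionary with a vector database (approximate nearest-neighbor index), but the ranking logic is the same.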

Creating and comparing embeddings

from openai import OpenAI
import numpy as np

client = OpenAI()

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare semantic similarity
e1 = embed("How do I configure a VLAN?")
e2 = embed("Setting up virtual LANs on a switch")
e3 = embed("Best chocolate cake recipe")

print(cosine_similarity(e1, e2))  # ~0.89 (very similar)
print(cosine_similarity(e1, e3))  # ~0.12 (unrelated)

In the Wild

Embeddings are the foundation of modern search and recommendation systems. Every RAG pipeline starts by embedding documents into a vector database. Google Search uses embeddings to understand query intent beyond keyword matching. GitHub Copilot uses code embeddings to find relevant code snippets. In IT operations, embeddings power intelligent log search (“find errors similar to this one”), documentation search, and ticket routing. The choice of embedding model significantly impacts quality: domain-specific models (trained on code, medical text, or legal documents) outperform general-purpose models for specialized tasks. Running embedding models locally (Sentence Transformers, Ollama) is practical on consumer hardware, making privacy-preserving semantic search accessible.