Yesterday, stub vectors. Today, the API contract — the shape of an embedding call. We'll keep using stub vectors in the practice (real embedding models burn quota and the values shift run-to-run), but the interface is the same as any production embedding API:
```python
def embed(text: str) -> list[float]:
    """Return a fixed-length vector for the given text."""
    ...

vectors = [embed(t) for t in texts]
assert all(len(v) == len(vectors[0]) for v in vectors)
```

Two invariants every embedding API obeys: every vector has the same fixed length, and `vectors[i]` corresponds to `texts[i]`.
And in production, we'd call OpenAI / Cohere / a local model?
Yes. The platform exposes embeddings via the same OpenRouter pipeline as Agent(model) — but for this concept lesson we use a deterministic stub so dimensions and values are predictable. The check-the-shape habit is the takeaway.
Every production embedding API follows the same shape:
```python
response = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [d.embedding for d in response.data]
```

Guarantees:
| Property | Why it matters |
|---|---|
| Length is constant per model | Downstream code (cosine, top-k) assumes equal-length vectors |
| Always float-valued | No nulls — every dimension carries signal |
| Order matches input | vectors[i] corresponds to texts[i] |
Production code asserts the contract because a silently malformed embedding (different length, missing element) corrupts every downstream similarity computation:
```python
vectors = [embed(t) for t in texts]
assert len(vectors) == len(texts), "missing embeddings"
assert all(len(v) == EXPECTED_DIM for v in vectors), "wrong dimension"
assert all(isinstance(x, float) for v in vectors for x in v), "non-float value"
```

This catches an upstream regression (model changed, API broken) at the boundary instead of letting it propagate to a confused similarity score 100 lines later.
We use a deterministic 8-dimensional stub embedder (a hash-based bit vector) to keep the lesson focused on shape verification. The values are produced by a one-line function — same input gives same output, no quota cost, no flakiness — so you can run the assertion and see it pass.
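The stub itself isn't shown here; a minimal sketch of such a hash-based bit-vector embedder (the exact hashing scheme is an assumption, not the lesson's implementation) could look like:

```python
import hashlib

DIM = 8  # stub dimension, per the lesson; an assumption for this sketch

def embed(text: str) -> list[float]:
    """Deterministic stub: derive an 8-dim bit vector from a hash of the text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Take the parity of the first DIM bytes as float-valued dimensions.
    return [float(digest[i] % 2) for i in range(DIM)]
```

Because the hash is deterministic, calling `embed` twice on the same text returns the same vector, so the shape assertions pass on every run.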
In AI Mastery week-1 lessons, this same stub embedder underpins the chunking, storing, and semantic-search lessons. Once you trust the shape, the production swap is one function definition away.
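The swap is safe because downstream code depends only on the contract, not on where the vectors came from. For instance, a cosine-similarity helper (a sketch, not the platform's implementation) assumes nothing beyond equal-length, float-valued vectors:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; relies only on the contract: equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0
```

Any `embed` that honors the contract, stub or production, can feed this function unchanged.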