Catsu Docs

embed() Method

Generate embeddings synchronously

The embed() method generates embeddings synchronously using the specified model and provider.

Signature

def embed(
    self,
    model: str,
    input: Union[str, List[str]],
    *,
    input_type: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> EmbedResponse

Parameters

Required Parameters

model (str)

The model identifier in provider:model format:

client.embed("openai:text-embedding-3-small", ["Text"])
client.embed("voyageai:voyage-3", ["Text"])
client.embed("cohere:embed-v4.0", ["Text"])

input (str | List[str])

The text(s) to embed:

# Single text
client.embed("openai:text-embedding-3-small", "Single text")

# Multiple texts (batch)
client.embed("openai:text-embedding-3-small", ["Text 1", "Text 2", "Text 3"])

Optional Parameters

input_type (str, optional)

A hint that lets the provider optimize embeddings for retrieval: pass "query" for search queries and "document" for the texts being searched. Supported values vary by provider:

# For search queries
client.embed(
    "voyageai:voyage-3",
    ["What is machine learning?"],
    input_type="query"
)

# For documents being searched
client.embed(
    "voyageai:voyage-3",
    ["Machine learning is..."],
    input_type="document"
)
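Once query and document embeddings have been returned, retrieval is a cosine-similarity ranking. A minimal sketch in plain NumPy, independent of the client (assumes you already hold `response.embeddings` values from the two calls above):

```python
import numpy as np

def rank_by_similarity(query_vec, doc_vecs):
    """Rank documents by cosine similarity to a query embedding.

    query_vec: 1-D sequence (one "query" embedding)
    doc_vecs:  2-D sequence, one row per "document" embedding
    Returns document indices, most similar first.
    """
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_vecs, dtype=float)
    sims = (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    return np.argsort(-sims)

# Hypothetical usage with the responses from the calls above:
# order = rank_by_similarity(query_resp.embeddings[0], doc_resp.embeddings)
```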

dimensions (int, optional)

Output embedding dimensions (for models that support Matryoshka embeddings):

client.embed(
    "openai:text-embedding-3-small",
    ["Text"],
    dimensions=256  # Reduce from default 1536
)
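For Matryoshka-style models, requesting fewer dimensions is roughly equivalent to truncating the full vector and re-normalizing it. A sketch of that operation (a property of the model family, stated here as an assumption — the provider does this server-side when you pass dimensions; you do not need to do it yourself):

```python
import numpy as np

def truncate_embedding(vec, dims):
    """Keep the first `dims` values of a Matryoshka-style embedding
    and re-normalize to unit length. Roughly what the `dimensions`
    parameter asks the provider to do (model-dependent; a sketch,
    not a guarantee)."""
    v = np.asarray(vec, dtype=float)[:dims]
    return v / np.linalg.norm(v)
```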

Return Value

Returns an EmbedResponse object:

class EmbedResponse:
    embeddings: List[List[float]]  # List of embedding vectors
    dimensions: int                # Embedding dimensions
    usage: Usage                   # Usage information

class Usage:
    tokens: int  # Total tokens processed

Examples

Basic Usage

response = client.embed(
    "openai:text-embedding-3-small",
    "Hello, world!"
)

print(response.embeddings[0][:5])  # [0.123, 0.456, ...]
print(response.dimensions)          # 1536
print(response.usage.tokens)        # 2

Batch Processing

texts = [
    "First document",
    "Second document",
    "Third document"
]

response = client.embed(
    "openai:text-embedding-3-small",
    texts
)

# Process each embedding
for i, embedding in enumerate(response.embeddings):
    print(f"Document {i+1}: {len(embedding)} dimensions")
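Providers typically cap how many texts a single request may carry, so large corpora are usually embedded in chunks. A minimal chunking helper (the cap of 96 below is an illustrative value, not a documented Catsu or provider limit):

```python
def chunked(items, size):
    """Yield successive slices of at most `size` items — useful because
    providers cap the number of texts per embed request (limits vary)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical usage — accumulate embeddings across several requests:
# all_embeddings = []
# for batch in chunked(texts, 96):
#     resp = client.embed("openai:text-embedding-3-small", batch)
#     all_embeddings.extend(resp.embeddings)
```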

With Options

# Voyage AI with input_type
response = client.embed(
    "voyageai:voyage-3",
    ["What is machine learning?"],
    input_type="query"
)

# OpenAI with reduced dimensions
response = client.embed(
    "openai:text-embedding-3-small",
    ["Sample text"],
    dimensions=256
)

NumPy Integration

response = client.embed(
    "openai:text-embedding-3-small",
    ["Text 1", "Text 2"]
)

# Convert to numpy array
arr = response.to_numpy()
print(arr.shape)  # (2, 1536)
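With the embeddings in an (n, dims) array, pairwise comparisons become one matrix product. A sketch (assumes to_numpy() returns one row per input text, as the shape above suggests):

```python
import numpy as np

def cosine_similarity_matrix(arr):
    """Pairwise cosine similarities for an (n, dims) embedding matrix,
    e.g. the array returned by response.to_numpy()."""
    a = np.asarray(arr, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)  # unit-normalize rows
    return a @ a.T  # entry [i, j] = similarity of text i and text j
```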
