Model Selection

Choose the right embedding model for your use case

Choosing the right embedding model depends on your specific use case, requirements, and constraints.

Decision Factors

Consider these factors when selecting a model:

  1. Use case - General retrieval, code search, multilingual, domain-specific
  2. Cost - Price per million tokens
  3. Performance - Quality vs speed trade-offs
  4. Features - input_type, dimensions, quantization support
  5. Context length - Maximum input tokens
  6. Latency - Response time requirements
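
One lightweight way to combine these factors is a weighted score over
normalized ratings. The sketch below is purely illustrative: the weights,
candidate names, and per-axis scores are made-up placeholders, not
benchmark results.

# Hypothetical scoring sketch -- replace weights and scores with
# numbers from your own evaluation (see Evaluation Strategy below)
weights = {"quality": 0.5, "cost": 0.3, "latency": 0.2}

candidates = {
    "model-a": {"quality": 0.9, "cost": 0.6, "latency": 0.7},
    "model-b": {"quality": 0.8, "cost": 0.9, "latency": 0.8},
}

def weighted_score(scores):
    # Higher is better on every axis, so normalize cost and
    # latency (e.g. invert them) before scoring
    return sum(weights[k] * scores[k] for k in weights)

best = max(candidates, key=lambda m: weighted_score(candidates[m]))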

By Use Case

General Text Retrieval

Look for models optimized for semantic search:

# Examples of good choices:
# - voyage-3 (Voyage AI)
# - text-embedding-3-large (OpenAI)
# - embed-v4.0 (Cohere)

Visit catsu.dev to compare current models.
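
For example, a plain retrieval embedding uses the same call shape as the
other examples on this page:

response = client.embed(
    model="voyage-3",
    input="What is the capital of France?"
)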

Code Search

Use code-specific models:

# Code-optimized models:
# - voyage-code-3 (Voyage AI)
# - codestral-embed-2505 (Mistral AI)
# - jina-code-embeddings-1.5b (Jina AI)

response = client.embed(
    model="voyage-code-3",
    input="def calculate_similarity(a, b): pass"
)

Multilingual Content

Choose models with multilingual support:

# Multilingual options:
# - embed-multilingual-v3.0 (Cohere)
# - voyage-multilingual-2 (Voyage AI)
# - BAAI/bge-m3 (Together AI, DeepInfra)
# - mxbai models (Mixed Bread)

response = client.embed(
    model="embed-multilingual-v3.0",
    input="Bonjour le monde"  # French
)
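
As a quick sanity check, a good multilingual model places a sentence and
its translation close together. A minimal sketch, assuming the vectors
are exposed as response.embeddings (verify the attribute name against
the client reference):

import numpy as np

# English / French translations of the same sentence
response = client.embed(
    model="embed-multilingual-v3.0",
    input=["Hello world", "Bonjour le monde"]
)

en, fr = (np.asarray(v) for v in response.embeddings)
cosine = en @ fr / (np.linalg.norm(en) * np.linalg.norm(fr))
print(f"cross-lingual similarity: {cosine:.3f}")  # expect a high value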

Domain-Specific Tasks

Finance

# Finance-optimized
response = client.embed(
    model="voyage-finance-2",
    input="Q4 earnings exceeded analyst expectations..."
)

Law

# Law-optimized
response = client.embed(
    model="voyage-law-2",
    input="Pursuant to Section 12(a) of the statute..."
)

By Cost Sensitivity

Cost-Optimized

For cost-sensitive applications, consider:

  • Smaller/lite models (lower cost per token)
  • Providers with competitive pricing
  • Matryoshka dimensions to reduce storage

# Use smaller dimensions to save on downstream costs
response = client.embed(
    model="voyage-3",
    input="Text",
    dimensions=256  # vs 1024
)
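
The savings compound across an index, since float32 vectors cost 4 bytes
per dimension. A back-of-the-envelope calculation (the corpus size is
hypothetical):

num_vectors = 10_000_000
bytes_per_dim = 4  # float32

full = num_vectors * 1024 * bytes_per_dim   # ~41 GB
small = num_vectors * 256 * bytes_per_dim   # ~10 GB
print(f"1024-dim: {full / 1e9:.0f} GB, 256-dim: {small / 1e9:.0f} GB")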

Quality-Optimized

For maximum quality:

  • Larger models from established providers
  • Full dimensions (no Matryoshka reduction)
  • Models with high benchmark scores
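
In practice this just means picking a larger model and omitting the
dimensions parameter, so the provider returns full-width vectors:

# Full-fidelity embedding: larger model, no dimension reduction
response = client.embed(
    model="text-embedding-3-large",
    input="Text"
)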

By Feature Requirements

Need Matryoshka (dimensions)?

# Providers with dimensions support:
# - Voyage AI, Gemini, OpenAI (text-3), Nomic (v1.5)
# - DeepInfra (Qwen3), Mixed Bread

response = client.embed(
    model="voyage-3",
    input="Text",
    dimensions=512
)
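
Because Matryoshka-trained models concentrate information in the leading
dimensions, the server-side dimensions parameter has a client-side
equivalent: truncate the full vector and renormalize. A minimal sketch,
again assuming the vectors are exposed as response.embeddings:

import numpy as np

full = np.asarray(response.embeddings[0])  # e.g. 1024 dims
short = full[:256]                         # keep the leading dimensions
short = short / np.linalg.norm(short)      # renormalize after truncation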

Need input_type?

# Providers with input_type support:
# - Voyage AI, Cohere, Gemini, Jina AI, Mistral, Nomic, Mixed Bread

response = client.embed(
    model="voyage-3",
    input="Query",
    input_type="query"
)
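
Queries and documents form a pair: embed the corpus side with
input_type="document" so both halves get the provider's asymmetric
retrieval treatment (providers typically prepend task-specific
instructions internally based on this flag):

response = client.embed(
    model="voyage-3",
    input="Paris is the capital of France.",
    input_type="document"
)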

Need Long Context?

For very long inputs:

# Long context models:
# - Jina AI (up to 32,768 tokens)
# - Together AI (up to 32K for some models)
# (by contrast, Gemini embeddings top out at 2,048 tokens)

response = client.embed(
    model="jina-embeddings-v3",
    input="Very long document..." * 1000
)
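
If a document exceeds a model's context window, the usual fallback is to
chunk it and embed the pieces. A minimal sketch with naive word-based
splitting; the chunk size is a placeholder, and production systems
usually split on sentence or token boundaries instead:

def chunk_words(text, max_words=500):
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

chunks = chunk_words("Very long document... " * 1000)
response = client.embed(model="voyage-3", input=chunks)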

Evaluation Strategy

Test multiple models for your specific use case:

def evaluate_models(queries, documents, models_to_test):
    results = {}

    for model in models_to_test:
        # Embed queries
        query_embeddings = client.embed(
            model=model,
            input=queries,
            input_type="query"
        )

        # Embed documents
        doc_embeddings = client.embed(
            model=model,
            input=documents,
            input_type="document"
        )

        # Evaluate retrieval quality with your own metric
        # (e.g. recall@k or MRR; see the sketch after this example)
        quality_score = ...  # placeholder: plug in your metric

        results[model] = {
            "quality": quality_score,
            "cost": query_embeddings.usage.cost + doc_embeddings.usage.cost,
            "latency": query_embeddings.usage.latency
        }

    return results

# Test and compare
results = evaluate_models(test_queries, test_docs, [
    "voyage-3",
    "text-embedding-3-small",
    "embed-v4.0"
])
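
One concrete way to fill in the quality placeholder above is recall@k
over a small labeled set. A minimal numpy sketch; the relevance labels
and the response.embeddings attribute are assumptions for illustration:

import numpy as np

def recall_at_k(query_vecs, doc_vecs, relevant, k=5):
    # relevant[i] is the index of the document that answers query i
    Q = np.asarray(query_vecs)
    D = np.asarray(doc_vecs)
    # Normalize rows so dot products are cosine similarities
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    sims = Q @ D.T                            # (num_queries, num_docs)
    top_k = np.argsort(-sims, axis=1)[:, :k]  # best k docs per query
    hits = [relevant[i] in top_k[i] for i in range(len(relevant))]
    return sum(hits) / len(hits)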

Best Practices

  • Test multiple models on your specific data
  • Consider the full cost (API + storage + compute)
  • Balance quality, cost, and latency
  • Use domain-specific models when available
  • Check feature support before committing
  • Monitor performance in production
