Overview

Practical recommendations for getting the most out of Catsu, covering throughput, cost, embedding quality, and reliability.

Quick Tips

Batch your requests - Process multiple texts together for better efficiency
Use async for parallel requests - Leverage aembed() and asyncio.gather()
Choose the right model - Consider use case, cost, and features
Track your costs - Monitor token usage and expenses
Handle rate limits gracefully - Use Catsu's built-in retry logic

Topics

Batch Processing - Optimize throughput with batching
Async Usage - Maximize performance with async methods
Model Selection - Choose the right model for your use case
Cost Tracking - Monitor and optimize costs
Rate Limiting - Handle rate limits effectively

Performance Optimization

Use Batching

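import catsu

# Shared setup assumed by the snippets on this page
client = catsu.Client()
texts = ["first text", "second text", "third text"]

# ❌ Inefficient: One request per text
for text in texts:
    response = client.embed("voyage-3", text)

# ✅ Efficient: Batch request
response = client.embed("voyage-3", texts)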
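
Embedding providers typically cap how many texts one request may contain, so a very large list still has to be split before it is sent. The following is a minimal sketch of that split. The helper name `chunked` and the batch size of 4 are illustrative assumptions, not part of Catsu; check the provider's documented limit for a real value.

```python
from typing import Iterator, List, Sequence


def chunked(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


texts = [f"document {i}" for i in range(10)]

# batch_size=4 is an arbitrary illustrative value (assumption)
batches = list(chunked(texts, batch_size=4))
print([len(batch) for batch in batches])  # [4, 4, 2]
```

Each element of `batches` can then be passed as the list argument of a single `client.embed("voyage-3", batch)` call.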

Use Async for Parallel Requests

import asyncio

# ✅ Process multiple batches in parallel
async def process_batches():
    responses = await asyncio.gather(
        client.aembed("voyage-3", batch1),
        client.aembed("voyage-3", batch2),
        client.aembed("voyage-3", batch3),
    )
    return responses
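
With many batches, an unbounded `asyncio.gather()` starts every request at once, which can trigger rate limits. One common way to cap the number of requests in flight is an `asyncio.Semaphore`. The sketch below assumes a hypothetical helper `gather_bounded` and uses a stand-in coroutine `fake_embed` in place of a real client call so that it runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def gather_bounded(
    embed: Callable[[List[str]], Awaitable[object]],
    batches: Sequence[List[str]],
    max_concurrency: int = 3,
) -> List[object]:
    """Run `embed` over all batches with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(batch: List[str]) -> object:
        async with semaphore:
            return await embed(batch)

    # gather() returns results in the same order as the input batches
    return await asyncio.gather(*(run_one(batch) for batch in batches))


async def fake_embed(batch: List[str]) -> List[int]:
    """Stand-in for a real embedding call (assumption): returns text lengths."""
    await asyncio.sleep(0.01)
    return [len(text) for text in batch]


batches = [["a", "bb"], ["ccc"], ["dddd", "eeeee"], ["f"]]
results = asyncio.run(gather_bounded(fake_embed, batches, max_concurrency=2))
print(results)  # [[1, 2], [3], [4, 5], [1]]
```

With Catsu, the `embed` argument could be something like `lambda batch: client.aembed("voyage-3", batch)`, following the call shape shown above.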

Cost Optimization

Monitor Usage

response = client.embed("voyage-3", texts)

print(f"Tokens: {response.usage.tokens}")
print(f"Cost: ${response.usage.cost:.6f}")
print(f"Cost per text: ${response.usage.cost / len(texts):.8f}")
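
Per-response figures are most useful when summed over a whole job. A minimal sketch of a running total follows, assuming only the `usage.tokens` and `usage.cost` fields shown above. The `UsageTotals` class is hypothetical, and `SimpleNamespace` objects with made-up numbers stand in for real responses.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Accumulates token and cost figures across several embedding calls."""
    requests: int = 0
    tokens: int = 0
    cost: float = 0.0

    def add(self, usage) -> None:
        # `usage` is expected to expose `.tokens` and `.cost`
        self.requests += 1
        self.tokens += usage.tokens
        self.cost += usage.cost


totals = UsageTotals()

# Made-up usage values standing in for `response.usage` (assumption)
for usage in (
    SimpleNamespace(tokens=120, cost=0.0000072),
    SimpleNamespace(tokens=80, cost=0.0000048),
):
    totals.add(usage)

print(f"{totals.requests} requests, {totals.tokens} tokens, ${totals.cost:.6f}")
```

In real use, `totals.add(response.usage)` would be called after each `client.embed(...)` call.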

Choose Cost-Effective Models

Different providers and models have varying costs:

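Compare per-token pricing across providers and models on catsu.dev
Consider lite/small variants for non-critical use cases
Use reduced Matryoshka dimensions, where the model supports them, to cut storage and similarity-search costs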
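
Matryoshka-style models are trained so that the leading dimensions of an embedding carry most of the information, which is why a shorter vector can stand in for the full one. The sketch below shows the underlying idea on the client side: keep the first `dimensions` values and re-normalize to unit length. The function name `truncate_embedding` is hypothetical. This is only meaningful for models trained this way, and whether Catsu exposes a parameter to request shorter vectors directly should be checked in its documentation.

```python
import math
from typing import List


def truncate_embedding(vector: List[float], dimensions: int) -> List[float]:
    """Keep the first `dimensions` values and rescale the result to unit length."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


# Toy 4-dimensional vector (made-up values), truncated to 2 dimensions
short = truncate_embedding([3.0, 4.0, 12.0, 0.5], 2)
print(short)  # [0.6, 0.8]
```

Halving the dimension count halves the storage per vector and the work per similarity comparison, at some cost in retrieval quality that is worth measuring on your own data.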

Quality Optimization

Use input_type Correctly

# For queries
query_emb = client.embed(
    "voyage-3",
    input="search query",
    input_type="query"
)

# For documents
doc_emb = client.embed(
    "voyage-3",
    input="document content",
    input_type="document"
)
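
Query and document embeddings are usually compared at search time with cosine similarity. The following is a minimal sketch of that comparison using toy two-dimensional vectors in place of real embeddings. The helper names `cosine_similarity` and `rank_documents` are hypothetical, and how vectors are read from a Catsu response is not shown here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption)
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

ranking = rank_documents(query_vec, doc_vecs)
print([index for index, _ in ranking])  # [2, 1, 0]
```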

Choose Domain-Specific Models

# For code
code_emb = client.embed("voyage-code-3", "def hello(): pass")

# For finance
finance_emb = client.embed("voyage-finance-2", "Q4 earnings...")

# For legal
law_emb = client.embed("voyage-law-2", "Section 12(a)...")

Reliability

Use Automatic Retries

import catsu

# Configure retry behavior
client = catsu.Client(
    max_retries=5,      # Retry up to 5 times
    timeout=60          # 60-second timeout
)
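
To make the effect of a setting like `max_retries` concrete, here is a generic sketch of retrying with exponential backoff and jitter. It is not Catsu's actual retry implementation, which this page does not describe. The function `call_with_backoff`, its delay parameters, and the `flaky` stand-in are all illustrative assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with capped exponential backoff."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # The delay cap doubles each attempt; the random draw adds jitter
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
            attempt += 1


calls = {"count": 0}


def flaky() -> str:
    """Stand-in for a rate-limited request: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


# A no-op sleep keeps the demonstration instant
result = call_with_backoff(flaky, sleep=lambda seconds: None)
print(result, calls["count"])  # ok 3
```

The jitter spreads retries from concurrent clients over time so they do not all hit the provider again at the same moment.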

Handle Errors Gracefully

from catsu.exceptions import CatsuError

try:
    response = client.embed("voyage-3", "Text")
except CatsuError as e:
    # Log error and handle gracefully
    print(f"Error: {e}")

Next Steps

Explore detailed guides for each topic above to master Catsu usage.

