Overview
Optimize your Catsu usage for performance and cost
Learn how to get the most out of Catsu with these expert tips and recommendations.
Quick Tips
- Batch your requests - Process multiple texts together for better efficiency
- Use async for parallel requests - Leverage aembed() and asyncio.gather()
- Choose the right model - Consider use case, cost, and features
- Track your costs - Monitor token usage and expenses
- Handle rate limits gracefully - Use Catsu's built-in retry logic
Topics
Batch Processing
Optimize throughput with batching
Async Usage
Maximize performance with async methods
Model Selection
Choose the right model for your use case
Cost Tracking
Monitor and optimize costs
Rate Limiting
Handle rate limits effectively
Performance Optimization
Use Batching
# ❌ Inefficient: One request per text
for text in texts:
    response = client.embed(model="voyage-3", input=text)
# ✅ Efficient: Batch request
response = client.embed(model="voyage-3", input=texts)
Use Async for Parallel Requests
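The example in this section assumes the input is already split into batch1, batch2, and batch3. As a minimal sketch, a list of texts can be split into fixed-size batches and sent with bounded concurrency. The names chunked, embed_all, batch_size, and max_concurrency are hypothetical and not part of Catsu; embed_batch stands in for any coroutine function that embeds one batch.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], Awaitable[Any]],  # hypothetical stand-in
    batch_size: int = 128,
    max_concurrency: int = 4,
) -> List[Any]:
    """Embed every batch, running at most `max_concurrency` calls at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(batch: Sequence[str]) -> Any:
        async with semaphore:
            return await embed_batch(batch)

    # gather() returns results in the same order the awaitables were passed in.
    return await asyncio.gather(*(run_one(b) for b in chunked(texts, batch_size)))
```

In real use, embed_batch could be a small wrapper such as `lambda batch: client.aembed(model="voyage-3", input=batch)`. The semaphore caps the number of in-flight requests, which may help stay under provider rate limits.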
import asyncio
# ✅ Process multiple batches in parallel
async def process_batches():
    responses = await asyncio.gather(
        client.aembed(model="voyage-3", input=batch1),
        client.aembed(model="voyage-3", input=batch2),
        client.aembed(model="voyage-3", input=batch3),
    )
    return responses
Cost Optimization
Monitor Usage
response = client.embed(model="voyage-3", input=texts)
print(f"Tokens: {response.usage.tokens}")
print(f"Cost: ${response.usage.cost:.6f}")
print(f"Cost per text: ${response.usage.cost / len(texts):.8f}")
Choose Cost-Effective Models
Different providers and models have varying costs:
- Compare pricing on catsu.dev
- Consider lite/small variants for non-critical use cases
- Use Matryoshka dimensions to reduce costs
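To illustrate the last point: with a model trained for Matryoshka representations, the leading components of an embedding remain usable on their own, so a shorter vector cuts storage and similarity-search cost. The sketch below truncates client-side and renormalizes. The helper name truncate_embedding is hypothetical; whether Catsu or a given provider accepts a dimensions setting on the request itself is not shown on this page, and truncation is only meaningful for models that support it.

```python
import math
from typing import List, Sequence


def truncate_embedding(vector: Sequence[float], dimensions: int) -> List[float]:
    """Keep the first `dimensions` values and rescale the result to unit length."""
    if not 1 <= dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = list(vector[:dimensions])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


short = truncate_embedding([3.0, 4.0, 12.0], 2)  # keeps [3.0, 4.0], then divides by 5.0
```

Renormalizing matters when vectors are later compared with a dot product, since the truncated vector no longer has unit length.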
Quality Optimization
Use input_type Correctly
# For queries
query_emb = client.embed(
    model="voyage-3",
    input="search query",
    input_type="query"
)
# For documents
doc_emb = client.embed(
    model="voyage-3",
    input="document content",
    input_type="document"
)
Choose Domain-Specific Models
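One way to keep this choice in a single place is a lookup table from domain to model name. This is a minimal sketch: the table and the pick_model helper are hypothetical, the model names are the ones used in the examples on this page, and the fallback to voyage-3 is an assumption rather than a Catsu default.

```python
from typing import Optional

# Hypothetical mapping; model names are taken from the examples on this page.
DOMAIN_MODELS = {
    "code": "voyage-code-3",
    "finance": "voyage-finance-2",
    "legal": "voyage-law-2",
}
DEFAULT_MODEL = "voyage-3"  # assumed general-purpose fallback


def pick_model(domain: Optional[str] = None) -> str:
    """Return the domain-specific model name, or the general default."""
    return DOMAIN_MODELS.get(domain, DEFAULT_MODEL)
```

The result can be passed straight to the model argument, for example `client.embed(model=pick_model("code"), input=...)`.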
# For code
code_emb = client.embed(model="voyage-code-3", input="def hello(): pass")
# For finance
finance_emb = client.embed(model="voyage-finance-2", input="Q4 earnings...")
# For legal
law_emb = client.embed(model="voyage-law-2", input="Section 12(a)...")
Reliability
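Retry logic for rate-limited APIs commonly waits longer after each failed attempt. The sketch below computes a capped exponential backoff schedule with optional jitter. It illustrates the general technique only: backoff_delays and its parameters are hypothetical, and the schedule Catsu uses internally is not documented on this page.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_retries: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait in seconds before each retry, doubling up to `cap`."""
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * 2 ** attempt)
        if rng is not None:
            # "Full jitter": pick a random wait between 0 and the computed delay.
            delay = rng.uniform(0.0, delay)
        delays.append(delay)
    return delays


print(backoff_delays(5, base=1.0, cap=8.0))  # [1.0, 2.0, 4.0, 8.0, 8.0]
```

Jitter spreads retries from many clients over time so they do not all hit the provider at the same moment.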
Use Automatic Retries
import catsu
# Configure retry behavior
client = catsu.Client(
    max_retries=5,  # Retry up to 5 times
    timeout=60,  # 60 second timeout
)
Handle Errors Gracefully
from catsu.exceptions import CatsuError
try:
    response = client.embed(model="voyage-3", input="Text")
except CatsuError as e:
    # Log error and handle gracefully
    print(f"Error: {e}")
Next Steps
Explore detailed guides for each topic above to master Catsu usage.