aembed() Method

Generate embeddings asynchronously for better performance

The aembed() method is the asynchronous version of embed(), allowing independent requests to run in parallel for higher throughput.

Signature

async def aembed(
    self,
    model: str,
    input: Union[str, List[str]],
    *,
    input_type: Optional[str] = None,
    dimensions: Optional[int] = None,
) -> EmbedResponse

Parameters and return values are identical to embed().
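
For example, the optional keyword-only parameters are passed just as with embed(). A minimal sketch (the input_type and dimensions values are illustrative; check the embed() reference for what each provider accepts):

import asyncio
from catsu import Client

async def main():
    client = Client()

    response = await client.aembed(
        "openai:text-embedding-3-small",
        ["first document", "second document"],
        input_type="document",  # illustrative value
        dimensions=256,         # illustrative value; the model must support it
    )
    print(response.dimensions)

asyncio.run(main())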

When to Use Async

Use aembed() when:

  • Processing multiple independent requests
  • Building async applications (FastAPI, async web scrapers, etc.)
  • Aiming for better performance and resource utilization
  • Working with large batches across multiple API calls (see the chunked-batch sketch below)
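
For the last case, a large input list can be split into chunks that are embedded concurrently. A minimal sketch (the chunk size and model name are illustrative):

import asyncio
from catsu import Client

async def embed_corpus(texts, chunk_size=64):
    client = Client()

    # Split the corpus into fixed-size chunks
    chunks = [texts[i:i + chunk_size] for i in range(0, len(texts), chunk_size)]

    # Embed all chunks concurrently
    responses = await asyncio.gather(
        *(client.aembed("openai:text-embedding-3-small", chunk) for chunk in chunks)
    )

    # Flatten the per-chunk results back into one list of embeddings
    return [emb for response in responses for emb in response.embeddings]

embeddings = asyncio.run(embed_corpus([f"Document {i}" for i in range(200)]))
print(len(embeddings))  # 200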

Basic Async Usage

import asyncio
from catsu import Client

async def main():
    client = Client()

    response = await client.aembed(
        "openai:text-embedding-3-small",
        "Hello, async world!"
    )

    print(response.embeddings[0][:5])

asyncio.run(main())
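
asyncio.run() is the entry point for standalone scripts; in an environment that already runs an event loop, such as a Jupyter notebook, call await main() directly instead.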

Parallel Requests with asyncio.gather()

Process multiple requests concurrently:

import asyncio
from catsu import Client

async def main():
    client = Client()

    # Process 3 requests in parallel
    responses = await asyncio.gather(
        client.aembed("openai:text-embedding-3-small", "Query 1"),
        client.aembed("openai:text-embedding-3-small", "Query 2"),
        client.aembed("openai:text-embedding-3-small", "Query 3"),
    )

    # Access results
    for i, response in enumerate(responses):
        print(f"Query {i+1}: {response.dimensions} dimensions")

asyncio.run(main())
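
If any single request may fail, passing return_exceptions=True to asyncio.gather() lets the remaining requests finish; failed calls come back as exception objects instead of raising. A sketch:

import asyncio
from catsu import Client

async def main():
    client = Client()

    results = await asyncio.gather(
        client.aembed("openai:text-embedding-3-small", "Query 1"),
        client.aembed("openai:text-embedding-3-small", "Query 2"),
        return_exceptions=True,  # failed requests are returned, not raised
    )

    for i, result in enumerate(results):
        if isinstance(result, Exception):
            print(f"Query {i+1} failed: {result}")
        else:
            print(f"Query {i+1}: {result.dimensions} dimensions")

asyncio.run(main())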

Async Context Manager

import asyncio
from catsu import Client

async def main():
    async with Client() as client:
        response = await client.aembed(
            "openai:text-embedding-3-small",
            "Text with automatic cleanup"
        )
        print(response.embeddings)

asyncio.run(main())

Performance Comparison

import asyncio
import time
from catsu import Client

# Synchronous (sequential)
def sync_process():
    client = Client()
    start = time.time()

    for i in range(10):
        response = client.embed("openai:text-embedding-3-small", f"Text {i}")

    elapsed = time.time() - start
    print(f"Sync: {elapsed:.2f}s")

# Asynchronous (parallel)
async def async_process():
    client = Client()
    start = time.time()

    tasks = [
        client.aembed("openai:text-embedding-3-small", f"Text {i}")
        for i in range(10)
    ]
    responses = await asyncio.gather(*tasks)

    elapsed = time.time() - start
    print(f"Async: {elapsed:.2f}s")  # Much faster!

# Run comparison
sync_process()
asyncio.run(async_process())

FastAPI Integration

from fastapi import FastAPI
from catsu import Client

app = FastAPI()
client = Client()

@app.post("/embed")
async def create_embedding(text: str):
    response = await client.aembed(
        "openai:text-embedding-3-small",
        text
    )

    return {
        "embedding": response.embeddings[0],
        "dimensions": response.dimensions,
        "tokens": response.usage.tokens
    }
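
The module-level client above is the simplest setup. To tie the client's lifetime to the application and get the automatic cleanup shown earlier, FastAPI's lifespan hook can wrap the async context manager; a sketch:

from contextlib import asynccontextmanager
from fastapi import FastAPI
from catsu import Client

@asynccontextmanager
async def lifespan(app: FastAPI):
    # The client is created at startup and cleaned up at shutdown
    async with Client() as client:
        app.state.client = client
        yield

app = FastAPI(lifespan=lifespan)

@app.post("/embed")
async def create_embedding(text: str):
    response = await app.state.client.aembed(
        "openai:text-embedding-3-small",
        text
    )
    return {"embedding": response.embeddings[0]}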

Best Practices

  • Use asyncio.gather() for parallel requests
  • Set an appropriate timeout for batch operations
  • Use async context managers for automatic cleanup
  • Consider rate limits when processing large batches (see the sketch below)
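
A minimal sketch covering the timeout and rate-limit points: asyncio.Semaphore caps how many requests are in flight at once, and asyncio.wait_for() bounds how long the whole batch may take (the limit and timeout values are illustrative):

import asyncio
from catsu import Client

async def embed_with_limits(texts, max_concurrent=5, timeout=30.0):
    client = Client()
    semaphore = asyncio.Semaphore(max_concurrent)

    async def embed_one(text):
        # At most max_concurrent requests are in flight at any time
        async with semaphore:
            return await client.aembed("openai:text-embedding-3-small", text)

    # Raises asyncio.TimeoutError if the whole batch exceeds the timeout
    return await asyncio.wait_for(
        asyncio.gather(*(embed_one(t) for t in texts)),
        timeout=timeout,
    )

responses = asyncio.run(embed_with_limits([f"Text {i}" for i in range(50)]))
print(len(responses))  # 50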
