Catsu Docs

Nomic

Nomic embedding provider documentation

Nomic provides embedding models with Matryoshka support and advanced long text handling.

Overview

  • Models: 2 models (nomic-embed-text-v1.5, nomic-embed-text-v1)
  • Key Features: Matryoshka embeddings, long text modes, task-specific optimization
  • API Docs: Nomic Documentation

Environment Variable

export NOMIC_API_KEY="your-nomic-api-key"
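The key can also be passed per call via the api_key parameter. A minimal sketch of a typical resolution order, assuming an explicit argument takes precedence over the environment variable (the helper name and exact precedence are illustrative, not part of the client API):

```python
import os

def resolve_api_key(explicit_key=None):
    """Return the explicit key if given, else fall back to NOMIC_API_KEY.

    Illustrative helper; the client resolves the key internally.
    """
    return explicit_key or os.environ.get("NOMIC_API_KEY")

print(resolve_api_key("override"))  # "override"
```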

Supported Parameters

| Parameter           | Type              | Required | Description                                                |
|---------------------|-------------------|----------|------------------------------------------------------------|
| model               | str               | Yes      | Model identifier                                           |
| input               | str \| List[str]  | Yes      | Text(s) to embed                                           |
| input_type          | str               | No       | "query" or "document" (mapped to search_*)                 |
| task_type           | str               | No       | search_document, search_query, clustering, classification  |
| dimensions          | int               | No       | 64-768 (v1.5 only, Matryoshka)                             |
| long_text_mode      | str               | No       | "truncate" or "mean" (default: "mean")                     |
| max_tokens_per_text | int               | No       | Default: 8192                                              |
| api_key             | str               | No       | Override API key                                           |

Examples

Basic Usage

response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Hello, Nomic!"
)

With input_type

Like Cohere, Nomic maps input_type to search_query/search_document:

# Query
query_response = client.embed(
    model="nomic-embed-text-v1.5",
    input="What is AI?",
    input_type="query"  # → search_query
)

# Document
doc_response = client.embed(
    model="nomic-embed-text-v1.5",
    input="AI is...",
    input_type="document"  # → search_document
)

With Matryoshka Dimensions (v1.5 only)

# Smaller dimensions
response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Sample text",
    dimensions=256  # 64-768 supported
)

print(f"Dimensions: {response.dimensions}")  # 256
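Matryoshka embeddings are trained so that a prefix of the full vector is itself a usable embedding. If you store full 768-dimensional vectors, you can shorten them later client-side; the standard recipe is truncate, then L2-normalize. A sketch using only NumPy (the helper below is illustrative, not part of the client API):

```python
import numpy as np

def shorten(embedding, dim):
    """Truncate a Matryoshka embedding to its first `dim` components
    and re-normalize to unit length (illustrative helper)."""
    v = np.asarray(embedding, dtype=np.float64)[:dim]
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

full = np.random.default_rng(0).normal(size=768)
short = shorten(full, 256)
print(short.shape)  # (256,)
print(round(float(np.linalg.norm(short)), 6))  # 1.0
```

Requesting dimensions=256 from the API avoids doing this yourself, but the client-side version is useful when one stored vector must serve several dimension budgets.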

With Long Text Handling

# Mean pooling for long texts (default)
response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Very long text..." * 1000,
    long_text_mode="mean"
)

# Truncate long texts
response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Very long text..." * 1000,
    long_text_mode="truncate",
    max_tokens_per_text=4096
)
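Conceptually, "mean" mode splits an over-long input into chunks that fit the token limit, embeds each chunk, and averages the chunk vectors into a single embedding. A minimal NumPy sketch of that pooling step (the chunking and embedding happen inside the provider; this only illustrates the averaging):

```python
import numpy as np

def mean_pool(chunk_embeddings):
    """Average per-chunk embeddings into one vector, element-wise.

    Illustrative only: mirrors what long_text_mode="mean" does conceptually.
    """
    return np.mean(np.asarray(chunk_embeddings, dtype=np.float64), axis=0)

chunks = [[1.0, 0.0], [0.0, 1.0]]
print(mean_pool(chunks))  # [0.5 0.5]
```

"truncate" mode instead drops everything past max_tokens_per_text, which is faster but loses information from the tail of the text.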

With Task Type

# Clustering
cluster_response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Group similar documents",
    task_type="clustering"
)

# Classification
classify_response = client.embed(
    model="nomic-embed-text-v1.5",
    input="Classify this text",
    task_type="classification"
)

Model Variants

  • nomic-embed-text-v1.5 - Latest, Matryoshka support (64-768d)
  • nomic-embed-text-v1 - Previous version, fixed 768d

For pricing, visit catsu.dev.

Special Notes

  • input_type mapped to search_query/search_document (like Cohere)
  • Matryoshka dimensions (64-768) for v1.5 only
  • Long text handling: "mean" (average) or "truncate"
  • Configurable max_tokens_per_text
  • v1 model has fixed 768 dimensions
