Jina AI

Jina AI provides multimodal and code-specific embedding models with context windows of up to 32,768 tokens.

Overview

Models: 6 models (v4 multimodal, v3, code models, v2 variants)
Key Features: Multimodal (text+image), code-optimized, up to 32,768 tokens context
API Docs: Jina AI Embeddings

export JINA_API_KEY="your-jina-api-key"
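
In Python code, the exported key can be read from the environment. The sketch below assumes that an explicitly passed key takes precedence over `JINA_API_KEY`, which is how the `api_key` override is presumed to behave. The helper name `resolve_api_key` is hypothetical and not part of any library.

```python
import os


def resolve_api_key(api_key=None):
    """Return the explicit key if given, otherwise fall back to JINA_API_KEY."""
    # Hypothetical helper: an explicit argument wins over the environment variable.
    key = api_key or os.environ.get("JINA_API_KEY")
    if not key:
        raise RuntimeError("Set JINA_API_KEY or pass api_key explicitly")
    return key
```

Failing early with a clear message avoids sending an unauthenticated request.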

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | str | Yes | Model identifier |
| `input` | str \| List[str] | Yes | Text(s) to embed |
| `input_type` | str | No | `"query"` or `"document"` |
| `task` | str | No | `retrieval.query`, `retrieval.passage`, `text-matching`, `classification`, `separation` |
| `dimensions` | int | No | Output dimensions (model-dependent) |
| `normalized` | bool | No | L2 normalize embeddings (default: True) |
| `api_key` | str | No | Override API key |
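
To make the `normalized` option concrete, here is a minimal sketch of L2 normalization in plain Python. It only illustrates the math (dividing a vector by its Euclidean norm). The function name `l2_normalize` is hypothetical, and the service's own implementation may differ.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```

With unit-length vectors, the dot product of two embeddings equals their cosine similarity.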

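# Basic text embedding with the v3 model
# (`client` is assumed to be an already-configured embedding client)
response = client.embed(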
    "jina-embeddings-v3",
    input="Hello, Jina!"
)

# v4 is the multimodal model; this call embeds text only
response = client.embed(
    "jina-embeddings-v4",
    input="Text that could be paired with images"
)

# Code-specific model for embedding source code
response = client.embed(
    "jina-code-embeddings-1.5b",
    input="def calculate_similarity(a, b): return cosine(a, b)"
)

# Task-specific embedding: optimize the vector for retrieval queries
response = client.embed(
    "jina-embeddings-v3",
    input="Search query",
    task="retrieval.query"
)
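
Once a query has been embedded with `retrieval.query` and documents with `retrieval.passage`, retrieval usually comes down to ranking passages by cosine similarity. The sketch below uses small hand-written vectors as stand-ins for real embeddings, since the response shape is not shown here. The helper names `cosine_similarity` and `rank_passages` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real query and passage embeddings.
query_vec = [1.0, 0.0, 0.0]
passage_vecs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_passages(query_vec, passage_vecs)
best_index = ranking[0][0]
```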

# Jina models accept inputs of up to 32,768 tokens
very_long_text = "..." * 10000  # placeholder: 30,000 characters

response = client.embed(
    "jina-embeddings-v3",
    input=very_long_text
)
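
Inputs that exceed the context limit need to be split before embedding. The sketch below chunks text by whitespace-separated words, which is only a rough proxy for tokens: the model's tokenizer will count differently, so the budget should leave generous headroom. The helper name `chunk_by_words` is hypothetical.

```python
def chunk_by_words(text, max_words):
    """Split text into chunks of at most max_words whitespace-separated words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


chunks = chunk_by_words("one two three four five", max_words=2)
print(chunks)  # ['one two', 'three four', 'five']
```

Each chunk can then be embedded separately, for example by passing the list of chunks as `input`.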