Engineering · 2026-05-19

Embedding pipeline gains provider-agnostic task types, title propagation, and configurable query-transformer history

The embedding pipeline gains three improvements: a provider-agnostic EmbeddingTaskType enum that maps platform-level intent to each provider's vocabulary, title propagation from document metadata into the embedding context, and a configurable query-transformer history window via a new platform setting.

Each change targets a common cause of degraded RAG retrieval quality: task-type hints that do not match the provider's vocabulary, document context that is missing at embedding time, and a query-transformer history window that could only be adjusted with a code change.

Provider-agnostic EmbeddingTaskType

Unified enum, provider-specific translation. Providers expose task-type hints differently: Google uses paired document and query task types, Cohere uses search-oriented input types, and OpenAI has no equivalent parameter. The EmbeddingTaskType enum abstracts over these differences: indexing scripts specify DOCUMENT, query scripts specify QUERY, and the adapter translates to the provider's vocabulary at call time. Switching providers therefore does not require editing task-type strings across the codebase.
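
The translation step can be pictured with a small sketch. Only the EmbeddingTaskType name and its DOCUMENT and QUERY members come from this post; the lookup table, the provider keys, and the to_provider_task_type helper are hypothetical names chosen for illustration. The provider strings follow the publicly documented Google and Cohere embedding parameters as we understand them, and the platform's real adapter may differ.

```python
from enum import Enum
from typing import Optional


class EmbeddingTaskType(Enum):
    """Platform-level intent, independent of any provider."""
    DOCUMENT = "document"  # text being indexed
    QUERY = "query"        # text used to search the index


# Hypothetical lookup table: platform intent -> provider vocabulary.
# An empty mapping means the provider accepts no task-type hint.
_PROVIDER_TASK_TYPES = {
    "google": {
        EmbeddingTaskType.DOCUMENT: "RETRIEVAL_DOCUMENT",
        EmbeddingTaskType.QUERY: "RETRIEVAL_QUERY",
    },
    "cohere": {
        EmbeddingTaskType.DOCUMENT: "search_document",
        EmbeddingTaskType.QUERY: "search_query",
    },
    "openai": {},
}


def to_provider_task_type(provider: str, task_type: EmbeddingTaskType) -> Optional[str]:
    """Return the provider's task-type string, or None if it has no equivalent."""
    try:
        vocabulary = _PROVIDER_TASK_TYPES[provider]
    except KeyError:
        raise ValueError(f"unknown embedding provider: {provider!r}") from None
    return vocabulary.get(task_type)
```

With a table like this, calling code only ever names EmbeddingTaskType.DOCUMENT or EmbeddingTaskType.QUERY, and a None result tells the adapter to omit the hint from the request altogether.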

Title propagation

Title prepended to document text. When a document's metadata includes a title, the title is prepended to the body text before the embedding call. This gives the model additional context and improves retrieval precision for short-body documents, where the title carries most of the semantic signal.
Configurable via preprocessText hook. Title prepending runs inside the configurable pre-processing stage, so corpora where it degrades quality (code snippets, for example) can override the hook to suppress it.
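
A minimal sketch of the prepend-and-override behaviour follows. The Document class, the "title" metadata key, the blank-line separator, and the function names are assumptions for illustration; preprocess_text stands in for the platform's preprocessText hook, whose actual signature is not shown in this post.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Document:
    """Hypothetical document shape: body text plus free-form metadata."""
    body: str
    metadata: Dict[str, str] = field(default_factory=dict)


def preprocess_text(doc: Document) -> str:
    """Assumed default hook: prepend the metadata title when one exists."""
    title = doc.metadata.get("title", "").strip()
    if not title:
        return doc.body
    # The blank-line separator is an assumption, not the platform's format.
    return f"{title}\n\n{doc.body}"


def body_only(doc: Document) -> str:
    """Override for corpora such as code snippets: suppress the title."""
    return doc.body


def build_embedding_input(
    doc: Document,
    preprocess: Callable[[Document], str] = preprocess_text,
) -> str:
    """Run the configured pre-processing hook to produce the text to embed."""
    return preprocess(doc)
```

Passing preprocess=body_only for a code-snippet corpus leaves the embedded text identical to the body, while every other corpus keeps the title-first default.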

Configurable query-transformer history

chat.assistant.rag.query_transformer.num_queries setting. The query transformer generates multiple rephrased versions of the user's question to improve recall against a vector index. This setting controls how many historical conversation turns the transformer considers when generating variants; raising it improves coherence in long sessions at the cost of a slightly larger transformer prompt.
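
The sketch below shows how a history window of this kind might be applied, treating the setting's value as a count of past turns, as described above. The settings mapping, the default of 3, the (role, text) turn shape, and both helper functions are hypothetical; only the setting key is taken from this post.

```python
from typing import List, Mapping, Tuple

SETTING_KEY = "chat.assistant.rag.query_transformer.num_queries"
ASSUMED_DEFAULT = 3  # hypothetical; the platform's real default is not stated

Turn = Tuple[str, str]  # (role, text)


def history_window(settings: Mapping[str, int], history: List[Turn]) -> List[Turn]:
    """Return the most recent turns the transformer should see."""
    limit = max(0, int(settings.get(SETTING_KEY, ASSUMED_DEFAULT)))
    # history[-0:] would return the whole list, so handle zero explicitly.
    return history[-limit:] if limit else []


def build_transformer_prompt(
    settings: Mapping[str, int],
    history: List[Turn],
    question: str,
) -> str:
    """Assemble the prompt context: windowed history, then the new question."""
    lines = [f"{role}: {text}" for role, text in history_window(settings, history)]
    lines.append(f"user: {question}")
    return "\n".join(lines)
```

The trade-off mentioned above is visible here: each extra turn in the window adds one more line to the prompt the transformer has to read.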

All three changes are backward-compatible: existing corpora embed and retrieve using the previous defaults, and the new settings take effect only when explicitly configured.

