Three targeted improvements address the most common causes of degraded RAG retrieval quality: mismatched embedding task types across providers, missing document context at embedding time, and a fixed query-transformer window that required a code change to adjust retrieval breadth.
Provider-agnostic EmbeddingTaskType
- Unified enum, provider-specific translation. Different providers expose different task-type hints: Google uses document/query pairs, Cohere uses search variants, and OpenAI has no equivalent. The EmbeddingTaskType enum abstracts this: indexing scripts specify DOCUMENT, query scripts specify QUERY, and the adapter translates to the provider's vocabulary at call time. Changing providers does not require editing task-type strings across the codebase.
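The translation step can be pictured with a minimal Python sketch. Only the EmbeddingTaskType name and its DOCUMENT and QUERY members come from these notes; the `translate_task_type` function, the provider keys, and the mapping table are hypothetical, and the provider-side strings reflect the public Google and Cohere embedding APIs as commonly documented, so they should be checked against current provider documentation before use.

```python
from enum import Enum
from typing import Optional


class EmbeddingTaskType(Enum):
    """Provider-neutral task type chosen by the calling script."""
    DOCUMENT = "document"
    QUERY = "query"


# Hypothetical lookup table. The provider-side strings are assumptions
# based on public API documentation, not taken from this project.
_PROVIDER_TASK_TYPES = {
    "google": {
        EmbeddingTaskType.DOCUMENT: "RETRIEVAL_DOCUMENT",
        EmbeddingTaskType.QUERY: "RETRIEVAL_QUERY",
    },
    "cohere": {
        EmbeddingTaskType.DOCUMENT: "search_document",
        EmbeddingTaskType.QUERY: "search_query",
    },
    # No task-type hint is sent for this provider.
    "openai": {},
}


def translate_task_type(provider: str, task_type: EmbeddingTaskType) -> Optional[str]:
    """Return the provider's own task-type string, or None if it has none."""
    try:
        vocabulary = _PROVIDER_TASK_TYPES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return vocabulary.get(task_type)
```

With a table like this, an indexing script passes `EmbeddingTaskType.DOCUMENT` regardless of provider, and switching providers changes only the `provider` argument.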
Title propagation
- Title prepended to document text. When a document carries a metadata title, the title is prepended to the body text before the embedding call, giving the model additional context that improves retrieval precision for short-body documents where the title carries most of the semantic signal.
- Configurable via preprocessText hook. Title prepending runs inside the configurable pre-processing stage, so corpora where it degrades quality (for example, code snippets) can override the hook to suppress it.
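As a rough illustration of the two bullets above, the sketch below prepends a metadata title by default and lets a caller swap in a hook that skips it. The `Document` class, the `text_to_embed` and `body_only` helpers, the `"title"` metadata key, and the blank-line separator are all assumptions made for the example; `preprocess_text` only mirrors the preprocessText hook named in these notes and is not its actual signature.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Document:
    """Hypothetical document shape: body text plus string metadata."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def preprocess_text(doc: Document) -> str:
    """Default hook (assumed behaviour): prepend the title when one exists."""
    title = doc.metadata.get("title", "").strip()
    if not title:
        return doc.text
    # The blank-line separator is an assumption, not a documented format.
    return f"{title}\n\n{doc.text}"


def body_only(doc: Document) -> str:
    """Override for corpora such as code snippets: embed the body unchanged."""
    return doc.text


def text_to_embed(doc: Document, hook: Callable[[Document], str] = preprocess_text) -> str:
    """Run the configured pre-processing hook before the embedding call."""
    return hook(doc)
```

A code-snippet corpus would then call `text_to_embed(doc, hook=body_only)`, while other corpora keep the default and get the title in the embedded text.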
Configurable query-transformer history
- chat.assistant.rag.query_transformer.num_queries setting. The query transformer generates multiple rephrased versions of the user's question to improve recall against a vector index. This setting controls how many historical conversation turns the transformer considers when generating variants; raising it improves coherence in long sessions at the cost of a slightly larger transformer prompt.
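One way to picture the effect of this setting is a helper that trims the conversation to the configured number of turns before the transformer prompt is built. Only the setting key comes from these notes; the flat settings mapping, the `history_window` and `build_transformer_prompt` functions, the prompt wording, and the fallback value of 3 are placeholders for illustration, and the real default is not stated here.

```python
from typing import List, Mapping, Tuple

SETTING_KEY = "chat.assistant.rag.query_transformer.num_queries"
ASSUMED_DEFAULT = 3  # placeholder only; the actual default is not documented here

Turn = Tuple[str, str]  # (role, text)


def history_window(settings: Mapping[str, int], history: List[Turn]) -> List[Turn]:
    """Return the most recent turns the transformer is allowed to see."""
    n = int(settings.get(SETTING_KEY, ASSUMED_DEFAULT))
    if n <= 0:
        # Guard: history[-0:] would return the whole list, not an empty one.
        return []
    return list(history[-n:])


def build_transformer_prompt(
    settings: Mapping[str, int], history: List[Turn], question: str
) -> str:
    """Assemble a hypothetical prompt from the trimmed history and the question."""
    lines = [f"{role}: {text}" for role, text in history_window(settings, history)]
    lines.append(f"user: {question}")
    lines.append("Rewrite the last user question as standalone search queries.")
    return "\n".join(lines)
```

Raising the value lengthens the prompt by one line per extra turn in this sketch, which matches the trade-off described above: more context for the rewrite at the cost of a larger transformer prompt.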
All three changes are backward-compatible: existing corpora embed and retrieve using the previous defaults, and the new settings take effect only when explicitly configured.