Product Release · 2024-06-27

Anthropic Claude joins as the second LLM provider, with RAG query rewriting

Anthropic's Claude API joins OpenAI as the platform's second LLM provider, wired through the same provider-agnostic AI configuration layer. A RAG query transformer ships alongside it. Before each retrieval pass, the transformer generates several rephrased variants of the user's question, broadening recall against the vector index without changing the question the user sees.

Adding a second LLM provider is the moment a platform's claim of provider-agnosticism is shown to be real or fictional. If the code has quietly coupled to OpenAI's message shapes, error types or function-calling syntax, the second integration surfaces that debt. This release adds Anthropic's Claude API without touching application call sites: the AI configuration layer resolves the provider, and the adapter on the Claude side translates to and from the platform's neutral message model. The RAG query transformer ships in the same window. It targets vocabulary mismatch between the user's phrasing and the indexed text, the most common cause of missed retrievals in production knowledge-base deployments.
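To make the call-site claim concrete, here is a minimal sketch of configuration-driven provider resolution. The names Message, register and complete are hypothetical and chosen for illustration, and the adapters are stubs standing in for real API clients. The platform's actual configuration layer is not shown in this post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical neutral message type; the platform's real model is not shown here.
@dataclass
class Message:
    role: str
    text: str


# An adapter accepts neutral messages and returns neutral reply text.
Adapter = Callable[[List[Message]], str]

_ADAPTERS: Dict[str, Adapter] = {}


def register(provider: str, adapter: Adapter) -> None:
    """Make an adapter available under a provider name."""
    _ADAPTERS[provider] = adapter


def complete(messages: List[Message], config: Dict[str, str]) -> str:
    """Call-site entry point: the provider is chosen by configuration only."""
    provider = config["provider"]
    if provider not in _ADAPTERS:
        raise ValueError(f"no adapter registered for provider {provider!r}")
    return _ADAPTERS[provider](messages)


# Stub adapters standing in for real API clients (assumption for illustration).
register("openai", lambda msgs: f"openai:{msgs[-1].text}")
register("anthropic", lambda msgs: f"anthropic:{msgs[-1].text}")
```

Under these assumptions, switching a deployment from one provider to the other means changing the value of config["provider"]. The code that calls complete stays the same.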

Anthropic Claude integration

Credential-store managed API keys. The Anthropic API key is held in the platform's credential store alongside the OpenAI key; no script sees the raw key value. Rotation is a credential-store operation, not a redeploy.
Content array model translated at the adapter boundary. Anthropic returns responses as a typed content array — text blocks, tool-use blocks, thinking blocks — where OpenAI returns a flat string. The adapter translates this to the platform's neutral message envelope so application code that processes an OpenAI response processes a Claude response identically.
Tool-use protocol aligned. Claude's tool-use content blocks and OpenAI's function calls surface through the same agent tool-dispatch contract. The agent loop does not distinguish which provider issued the tool call.
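The translation at the adapter boundary can be sketched as follows. Envelope and ToolCall are hypothetical stand-ins for the platform's neutral message envelope, which this post does not specify. The input dictionaries follow the publicly documented JSON shapes of Anthropic's Messages API and OpenAI's Chat Completions API. Block types other than text and tool use are ignored in this sketch.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: Dict[str, Any]


# Hypothetical neutral envelope; the platform's real type is not shown in this post.
@dataclass
class Envelope:
    text: str = ""
    tool_calls: List[ToolCall] = field(default_factory=list)


def from_anthropic(response: Dict[str, Any]) -> Envelope:
    """Flatten a typed content array into text plus tool calls."""
    env = Envelope()
    parts: List[str] = []
    for block in response.get("content", []):
        kind = block.get("type")
        if kind == "text":
            parts.append(block["text"])
        elif kind == "tool_use":
            # Anthropic delivers tool arguments as an already-parsed object.
            env.tool_calls.append(ToolCall(block["id"], block["name"], block["input"]))
        # Other block types are skipped in this sketch.
    env.text = "".join(parts)
    return env


def from_openai(response: Dict[str, Any]) -> Envelope:
    """Map a chat completion message onto the same envelope."""
    message = response["choices"][0]["message"]
    env = Envelope(text=message.get("content") or "")
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # OpenAI delivers tool arguments as a JSON-encoded string.
        env.tool_calls.append(ToolCall(call["id"], fn["name"], json.loads(fn["arguments"])))
    return env
```

Once both providers produce the same Envelope, an agent loop can dispatch on tool_calls without knowing where the call originated.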

RAG query transformer

Multi-variant retrieval. A single user question may match the vector index well with its original phrasing but poorly with a paraphrase, or vice versa. The query transformer sends the original question to the LLM and requests N rephrased variants. Each variant queries the vector index independently; the union of retrieved chunks, deduplicated by document identifier, enters the LLM context window.
Recall broadening without precision loss. Chunks retrieved by any variant are candidates; a distance-threshold filter removes the most distant candidates before context injection. Recall improves because the rephrased variants bridge vocabulary gaps between the user's phrasing and the indexed document, while the threshold keeps weak matches out of the context.
History-aware rephrasing. The query transformer considers the last K conversation turns when generating variants, so rephrasing resolves against the conversational context rather than the isolated question. K is configurable per deployment.
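The retrieval flow described above can be sketched as follows. The function names, the prompt wording and the search callback are hypothetical. Keeping the closest hit per document and cutting at a fixed max_distance are assumptions about how deduplication and the distance threshold might behave, not a description of the shipped implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (document_id, distance, chunk_text); a smaller distance means a closer match.
Hit = Tuple[str, float, str]


def build_rewrite_prompt(question: str, history: Sequence[str],
                         n_variants: int, history_turns: int) -> str:
    """Build an LLM prompt that asks for rephrasings in conversational context."""
    # Guard against history[-0:], which would return the whole history.
    recent = list(history[-history_turns:]) if history_turns > 0 else []
    lines = ["Conversation so far:", *recent,
             f"Rewrite the question below in {n_variants} different ways, one per line.",
             f"Question: {question}"]
    return "\n".join(lines)


def retrieve_multi(queries: Sequence[str],
                   search: Callable[[str], List[Hit]],
                   max_distance: float) -> List[Hit]:
    """Query once per variant, union the hits, dedupe by document, then filter."""
    best: Dict[str, Hit] = {}
    for query in queries:
        for hit in search(query):
            doc_id, distance, _ = hit
            # Keep only the closest hit seen for each document identifier.
            if doc_id not in best or distance < best[doc_id][1]:
                best[doc_id] = hit
    kept = [hit for hit in best.values() if hit[1] <= max_distance]
    return sorted(kept, key=lambda hit: hit[1])
```

In this sketch, search would wrap whatever vector index the deployment uses, and queries would hold the variants parsed from the LLM's reply to the prompt.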

With two providers behind the AI configuration layer, the multi-provider design is validated in production rather than in theory. Together, the Claude adapter and the query transformer address two recurring problems in production RAG deployments, provider lock-in and vocabulary mismatch at retrieval time, in a single release window.

