
HNSW vector indexing on Informix — semantic search at OLTP scale

A Hierarchical Navigable Small World (HNSW) access method on Informix brings approximate-nearest-neighbour vector search to one of the seven supported OLTP engines, with pgvector-compatible function names and dual file / BLOB storage.

The data tier gains a Hierarchical Navigable Small World (HNSW) access method on Informix, bringing approximate-nearest-neighbour vector search to one of the seven supported OLTP engines. The same database that holds the transactional record of truth now indexes the embedding vectors that drive semantic search and retrieval-augmented generation — one engine, two workloads, one operational surface.

Why HNSW

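HNSW is a graph-based index optimised for high-dimensional nearest-neighbour queries. It builds a multi-layer proximity graph in which each node carries links to its closest neighbours; a query enters at the sparse top layer and descends to the dense bottom layer, visiting a number of nodes that grows roughly logarithmically with the size of the index. The trade-off is approximate rather than exact recall, controlled through the standard m, ef_construction and ef_search parameters: m caps the number of links per node, ef_construction sets the candidate-list size while the graph is built, and ef_search sets it at query time. Raising them generally improves recall at the cost of memory (m), build time (ef_construction) or query latency (ef_search).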
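
The layered descent can be illustrated with a minimal sketch in plain Python. It is not the Informix access method: the graph is hand-built rather than constructed with m and ef_construction, and the names search_layer, hnsw_search, layers and toy_index are hypothetical. It only shows how a query narrows down on the sparse upper layers with a candidate list of one, then widens to ef_search candidates on the bottom layer.

```python
import heapq
import math


def search_layer(vectors, links, query, entry, ef):
    """Best-first search on one layer; returns up to ef (distance, node) pairs, nearest first."""
    d0 = math.dist(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node on top
    best = [(-d0, entry)]        # max-heap via negation: worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break                # nothing left can improve the result set
        for nb in links[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vectors[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)   # drop the current worst result
    return sorted((-neg_d, n) for neg_d, n in best)


def hnsw_search(vectors, layers, query, entry, k, ef_search):
    """layers[0] is the dense bottom layer, layers[-1] the sparse top layer."""
    for links in reversed(layers[1:]):
        # Upper layers: greedy descent with a candidate list of size 1.
        entry = search_layer(vectors, links, query, entry, ef=1)[0][1]
    found = search_layer(vectors, layers[0], query, entry, ef=max(ef_search, k))
    return [node for _, node in found[:k]]


def toy_index(n=16, stride=4):
    """Hand-built two-layer graph over points on a line (illustration only)."""
    vectors = [(float(i), 0.0) for i in range(n)]
    bottom = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
    top = {i: [j for j in (i - stride, i + stride) if 0 <= j < n]
           for i in range(0, n, stride)}
    return vectors, [bottom, top]


vectors, layers = toy_index()
nearest = hnsw_search(vectors, layers, query=(9.2, 0.0), entry=0, k=3, ef_search=4)
print(nearest)
```

On this toy graph the top layer covers most of the distance in a few long hops, and the bottom layer refines the answer among immediate neighbours. A larger ef_search keeps more candidates alive on the bottom layer, which is where the recall-versus-latency trade-off shows up.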

What ships on the Informix side

  • Native access method. An Informix index type — not an external service — so vector queries plan and execute inside the same query optimiser as the rest of the workload.
  • Dual-backend storage. Vectors persist either inline in the table (file-based) or as a BLOB column (database-based); the choice depends on row width, update frequency and the tooling that needs to read the column.
  • Cosine, Euclidean and inner-product distance. Three operators match the pgvector convention: <=> for cosine distance, <-> for Euclidean distance and <#> for inner product. Application code written against PostgreSQL pgvector ports without a SQL rewrite.
  • Dimension support up to 3,072. Sized for OpenAI's text-embedding-3-large at the maximum dimension, with smaller embedding models (768, 1,024, 1,536) covered by construction.
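
For reference, the three metrics behind those operators can be written out in a few lines of Python. This is a sketch of the pgvector definitions: in pgvector, <=> returns cosine distance (1 minus cosine similarity) and <#> returns the negative inner product, so that smaller always means closer. That the Informix operators follow the same sign conventions is an assumption drawn from the stated pgvector compatibility; the function names here are illustrative.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the pgvector <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negative dot product, the pgvector <#> operator (smaller means more similar)."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity, the pgvector <=> operator. Undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(l2_distance([0.0, 0.0], [3.0, 4.0]))
print(negative_inner_product([1.0, 2.0], [3.0, 4.0]))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

For embeddings that are already normalised to unit length, the three metrics produce the same ranking, so the choice mostly matters for models whose vectors are not normalised.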

Where it lands in the stack

Embeddings produced by the AI tier (OpenAI, Anthropic, Vertex, Cohere, IBM watsonx) write to a VECTOR column under the HNSW index; the RAG pipeline queries that column with a cosine-similarity predicate to retrieve the top-k rows for context injection. Because the index is co-located with the transactional rows, the retrieval step runs in the same query as the row-level security expressions and the audit trail the rest of the application carries, a property an external vector service cannot replicate.
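
The retrieval step can be sketched as a brute-force stand-in in Python, to show what the index-backed query computes: apply the visibility rule first, then rank the remaining rows by cosine similarity and keep the top k. The row layout, the tenant field and the retrieve_context function are hypothetical; in the product the filter would be a row-level security expression and the ranking would be served by the HNSW index rather than a full scan.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve_context(rows, query_vec, allowed_tenant, k):
    """Return the k visible rows most similar to query_vec, best match first."""
    # Hypothetical visibility rule standing in for a row-level security expression.
    visible = (row for row in rows if row["tenant"] == allowed_tenant)
    return heapq.nlargest(
        k, visible, key=lambda row: cosine_similarity(row["embedding"], query_vec)
    )


rows = [
    {"id": 1, "tenant": "a", "embedding": [1.0, 0.0]},
    {"id": 2, "tenant": "a", "embedding": [0.7, 0.7]},
    {"id": 3, "tenant": "b", "embedding": [1.0, 0.05]},  # closest overall, but not visible
    {"id": 4, "tenant": "a", "embedding": [0.0, 1.0]},
]

context = retrieve_context(rows, query_vec=[1.0, 0.1], allowed_tenant="a", k=2)
print([row["id"] for row in context])
```

Row 3 is the nearest vector overall but never reaches the prompt, because the visibility rule is applied inside the same retrieval step rather than as a separate check after an external service has already returned its results.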

The same operational tooling — backups, replication (HDR / SDS / RSS), cluster failover — covers the vector index without a separate runbook. Vector workloads inherit the platform's existing Informix governance rather than introducing a new system to operate.
