IBM watsonx joins OpenAI, Anthropic Claude, Google Vertex and Cohere as a first-class LLM provider in the platform's unified AI configuration layer. Provider choice remains configuration — the same application code that talks to OpenAI talks to watsonx, with the active model declared on a per-environment basis rather than baked into the call site.
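As a rough illustration of keeping provider choice in configuration, the sketch below resolves the active provider and model from the environment name, so the call site never names a provider. Everything in it is hypothetical and not taken from the platform: the `LLMConfig` type, the `ENVIRONMENTS` table, the `APP_ENV` variable, the `describe_request` helper, and the model identifiers are all illustrative assumptions.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical per-environment LLM settings."""
    provider: str
    model: str


# Assumed layout: one entry per environment. Model names are illustrative only.
ENVIRONMENTS = {
    "development": LLMConfig(provider="openai", model="gpt-4o-mini"),
    "production": LLMConfig(provider="watsonx", model="ibm/granite-13b-chat-v2"),
}


def active_config(env: Optional[str] = None) -> LLMConfig:
    """Look up the config for the given environment (default: APP_ENV or development)."""
    env = env or os.environ.get("APP_ENV", "development")
    return ENVIRONMENTS[env]


def describe_request(prompt: str, env: Optional[str] = None) -> dict:
    """Build a provider-agnostic request; the provider comes from config, not the caller."""
    cfg = active_config(env)
    return {"provider": cfg.provider, "model": cfg.model, "prompt": prompt}
```

The same `describe_request("Summarize this ticket")` call would target OpenAI in development and watsonx in production under these assumed settings; only the configuration table changes.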
IAM token refresh
watsonx authenticates with IAM-issued bearer tokens that expire after one hour; production workloads that outlive a single token need a refresh path that does not introduce latency on the request path. The Watson client handles this transparently:
- API key stored at construction. The long-lived API key is captured when the client is instantiated and held in the credential store, never on the call stack.
- Refresh within five minutes of expiry. The first request inside the refresh window triggers a fresh-token fetch; subsequent requests in the same window see the new token immediately.
- Double-checked locking under concurrent load. Multiple threads hitting the refresh window simultaneously coordinate so only one IAM call is issued; the other threads block briefly, then proceed with the new token.
Deployments that supply a pre-obtained bearer token rather than an API key are unaffected — the client treats the token as static and skips the refresh path.
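The refresh behavior described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the Watson client's actual code: the `TokenProvider` class, the injected `fetch_token` callable (standing in for the IAM call), and the `clock` parameter are hypothetical names introduced for the example.

```python
import threading
import time
from typing import Callable, Optional, Tuple

# Refresh once the token is within five minutes of expiry.
REFRESH_WINDOW_SECONDS = 5 * 60

# Hypothetical fetcher: takes an API key, returns (bearer_token, expiry_epoch_seconds).
TokenFetcher = Callable[[str], Tuple[str, float]]


class TokenProvider:
    """Illustrative token holder with double-checked locking around refresh."""

    def __init__(
        self,
        fetch_token: TokenFetcher,
        api_key: Optional[str] = None,
        static_token: Optional[str] = None,
        clock: Callable[[], float] = time.time,
    ) -> None:
        if (api_key is None) == (static_token is None):
            raise ValueError("supply exactly one of api_key or static_token")
        self._fetch_token = fetch_token
        self._api_key = api_key            # long-lived key captured at construction
        self._token = static_token
        self._expires_at = 0.0             # forces a fetch on first use in API-key mode
        self._static = static_token is not None
        self._clock = clock
        self._lock = threading.Lock()

    def _needs_refresh(self) -> bool:
        return self._clock() >= self._expires_at - REFRESH_WINDOW_SECONDS

    def get_token(self) -> str:
        if self._static:
            # Pre-obtained bearer token: treated as static, refresh path skipped.
            return self._token
        if self._needs_refresh():          # first check, without the lock
            with self._lock:
                if self._needs_refresh():  # second check, under the lock
                    # Only one thread reaches this call per refresh window.
                    self._token, self._expires_at = self._fetch_token(self._api_key)
        return self._token
```

Threads that arrive while a refresh is in flight wait on the lock, fail the second check once the new expiry is in place, and return the fresh token without issuing their own fetch. Requests outside the refresh window never touch the lock at all, which keeps the common path cheap.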