createClient()
Creates a new WebLLM client instance with optional local and cloud configuration.
Signature
```typescript
function createClient(options?: CreateClientOptions): WebLLMClient;
```
Parameters
options (optional)
Configuration object for the client.
```typescript
interface CreateClientOptions {
  local?: LocalConfig;
  cloud?: CloudConfig;
  onProgress?: ProgressCallback;
  onRoute?: RouteCallback;
  onError?: (error: Error) => void;
}
```
local
Configuration for local inference using WebGPU and MLC models. See Config Types for all supported formats.
- Type: `LocalConfig`
- Default: `undefined` (disabled)
- Can be: `'auto'`, `false`, `null`, a model string, a config object, a function, or a provider instance
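The accepted forms can be summarized as a union type. This is an illustrative sketch inferred from the examples on this page, not the SDK's actual `LocalConfig` definition (see Config Types for the authoritative shape); provider instances are omitted, and the object fields shown are only those used in the examples below.

```typescript
// Illustrative sketch only — the real LocalConfig is defined in Config Types.
// tiers / useCache / useWebWorker and the stats shape come from the examples
// on this page; other fields may exist.
type LocalConfigSketch =
  | 'auto'                                  // automatic device-based model selection
  | false                                   // disable local inference
  | null                                    // disable local inference
  | string                                  // an explicit MLC model id
  | {
      tiers?: Record<'high' | 'medium' | 'low', string>;
      useCache?: boolean;
      useWebWorker?: boolean;
    }
  | ((stats: { vram: number }) => string);  // dynamic model selection

const examples: LocalConfigSketch[] = [
  'auto',
  false,
  'Qwen3-8B-q4f16_1-MLC',
  (stats: { vram: number }) =>
    stats.vram > 8000 ? 'Qwen3-8B-q4f16_1-MLC' : 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC',
];
```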
cloud
Configuration for cloud inference using OpenAI-compatible APIs. See Config Types for all supported formats.
- Type: `CloudConfig`
- Default: `undefined`
- Can be: an API key string, a config object, a custom function, or a provider instance
onProgress
Callback invoked during local model loading with progress updates.
- Type: `ProgressCallback`
- Default: `undefined`
```typescript
type ProgressCallback = (progress: LoadProgress) => void;

interface LoadProgress {
  stage: 'download' | 'compile' | 'warmup';
  progress: number;
  model: string;
  bytesLoaded?: number;
  bytesTotal?: number;
}
```
onRoute
Callback invoked on each routing decision, reporting whether a request was routed to local or cloud and why.
- Type: `RouteCallback`
- Default: `undefined`
```typescript
type RouteCallback = (info: { decision: 'local' | 'cloud'; reason: string }) => void;
```
Example:
```typescript
const client = createClient({
  local: 'auto',
  cloud: process.env.OPENAI_API_KEY,
  onRoute: ({ decision, reason }) => {
    console.log(`Routed to ${decision}: ${reason}`);
  }
});
```
onError
Callback invoked when the local backend fails to initialize (e.g., WebGPU not available, model load failure). Useful for logging or fallback UI.
- Type: `(error: Error) => void`
- Default: `undefined`
Example:
```typescript
const client = createClient({
  local: 'auto',
  cloud: process.env.OPENAI_API_KEY,
  onError: (error) => {
    console.error('Local backend failed:', error.message);
    // Show fallback UI or switch to cloud-only mode
  }
});
```
Return Value
Returns a WebLLMClient instance with chat completions API and lifecycle methods.
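As a rough sketch of how the returned instance might be consumed, assuming an OpenAI-compatible `chat.completions.create` surface as suggested by the Chat Completions page linked under See Also (the interface and helper below are illustrative assumptions, not the SDK's actual types — see WebLLMClient for the real surface):

```typescript
// Hedged sketch of the returned client's surface; the authoritative
// definition is on the WebLLMClient page.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface WebLLMClientSketch {
  chat: {
    completions: {
      create(params: { messages: ChatMessage[]; stream?: boolean }): Promise<unknown>;
    };
  };
}

// Hypothetical usage against that sketch:
async function ask(client: WebLLMClientSketch): Promise<unknown> {
  return client.chat.completions.create({
    messages: [{ role: 'user', content: 'Hello!' }],
  });
}
```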
Requirements
At least one of `local` or `cloud` must be configured. If both are disabled, the client will throw an error during initialization.
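The rule can be illustrated with a standalone predicate. This is a sketch only: `createClient` performs this check itself during initialization, and `hasEngine` is a hypothetical helper name, not part of the SDK.

```typescript
// Illustration only: mirrors the requirement that at least one engine is
// configured. 'auto', model ids, objects, functions, and providers all count
// as configured; false and null explicitly disable local inference.
function hasEngine(options: { local?: unknown; cloud?: unknown } = {}): boolean {
  const localConfigured =
    options.local !== undefined && options.local !== false && options.local !== null;
  const cloudConfigured = options.cloud !== undefined;
  return localConfigured || cloudConfigured;
}

hasEngine({});                                 // false — createClient would throw
hasEngine({ local: 'auto' });                  // true — local-only
hasEngine({ local: false, cloud: 'sk-...' });  // true — cloud-only
```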
Examples
Local only (auto model selection)
```typescript
import { createClient } from '@webllm-io/sdk';

const client = createClient({ local: 'auto' });
// Uses local inference with auto device-based model selection
```
Cloud only
```typescript
const client = createClient({
  local: false,
  cloud: process.env.OPENAI_API_KEY
});
```
Dual engine with progress tracking
```typescript
const client = createClient({
  local: {
    tiers: {
      high: 'Qwen3-8B-q4f16_1-MLC',
      medium: 'Qwen2.5-3B-Instruct-q4f16_1-MLC',
      low: 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC'
    },
    useCache: true,
    useWebWorker: true
  },
  cloud: {
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  },
  onProgress: (progress) => {
    console.log(`${progress.stage}: ${Math.round(progress.progress * 100)}%`);
  }
});
```
Custom provider functions
```typescript
import { mlc, fetchSSE } from '@webllm-io/sdk/providers';

const client = createClient({
  local: mlc({ model: 'Qwen3-8B-q4f16_1-MLC', useWebWorker: true }),
  cloud: fetchSSE({
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  })
});
```
Dynamic model selection
```typescript
const client = createClient({
  local: (stats) => {
    if (stats.vram > 8000) return 'Qwen3-8B-q4f16_1-MLC';
    if (stats.vram > 4000) return 'Qwen2.5-3B-Instruct-q4f16_1-MLC';
    return 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC';
  },
  cloud: {
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  }
});
```
See Also
- WebLLMClient - Client instance methods
- Chat Completions - Inference API
- Config Types - All configuration options
- Providers (MLC) - Local inference provider
- Providers (Fetch) - Cloud inference provider