
createClient()

Creates a new WebLLM client instance with optional local and cloud configuration.

Signature

function createClient(options?: CreateClientOptions): WebLLMClient;

Parameters

options (optional)

Configuration object for the client.

interface CreateClientOptions {
  local?: LocalConfig;
  cloud?: CloudConfig;
  onProgress?: ProgressCallback;
  onRoute?: RouteCallback;
  onError?: (error: Error) => void;
}

local

Configuration for local inference using WebGPU and MLC models. See Config Types for all supported formats.

  • Type: LocalConfig
  • Default: undefined (disabled)
  • Can be: 'auto', false, null, model string, config object, function, or provider instance

cloud

Configuration for cloud inference using OpenAI-compatible APIs. See Config Types for all supported formats.

  • Type: CloudConfig
  • Default: undefined
  • Can be: API key string, config object, custom function, or provider instance
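Taken together, the accepted shapes for local and cloud can be pictured as union types along the following lines. This is an illustrative sketch only, not the SDK's actual declarations; the authoritative definitions are in Config Types, and ProviderInstance is a placeholder for the objects returned by providers such as mlc(...) and fetchSSE(...).

```typescript
// Placeholder for provider instances (e.g. the result of mlc(...) or fetchSSE(...)).
type ProviderInstance = { [key: string]: unknown };

type LocalConfigSketch =
  | 'auto'                                  // auto device-based model selection
  | false | null                            // local inference disabled
  | string                                  // explicit MLC model id
  | { tiers?: Record<string, string>; useCache?: boolean; useWebWorker?: boolean }
  | ((stats: { vram: number }) => string)   // dynamic model selection
  | ProviderInstance;

type CloudConfigSketch =
  | string                                  // API key
  | { baseURL?: string; apiKey?: string; model?: string }
  | ((...args: unknown[]) => unknown)       // custom function
  | ProviderInstance;
```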

onProgress

Callback invoked during local model loading with progress updates.

  • Type: ProgressCallback
  • Default: undefined
type ProgressCallback = (progress: LoadProgress) => void;

interface LoadProgress {
  stage: 'download' | 'compile' | 'warmup';
  progress: number;       // 0–1
  model: string;
  bytesLoaded?: number;
  bytesTotal?: number;
}
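A handler will typically render these updates as a status line. A minimal sketch, with LoadProgress restated so the snippet stands alone; formatProgress is our own illustrative helper, not part of the SDK:

```typescript
interface LoadProgress {
  stage: 'download' | 'compile' | 'warmup';
  progress: number;       // 0–1
  model: string;
  bytesLoaded?: number;
  bytesTotal?: number;
}

// Turn a LoadProgress update into a human-readable status string.
function formatProgress(p: LoadProgress): string {
  const pct = Math.round(p.progress * 100);
  const bytes =
    p.bytesLoaded !== undefined && p.bytesTotal !== undefined
      ? ` (${(p.bytesLoaded / 1e6).toFixed(1)} / ${(p.bytesTotal / 1e6).toFixed(1)} MB)`
      : '';
  return `${p.model} ${p.stage}: ${pct}%${bytes}`;
}

console.log(
  formatProgress({
    stage: 'download',
    progress: 0.42,
    model: 'Qwen3-8B-q4f16_1-MLC',
    bytesLoaded: 2_100_000,
    bytesTotal: 5_000_000,
  })
);
// → Qwen3-8B-q4f16_1-MLC download: 42% (2.1 / 5.0 MB)
```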

onRoute

Callback invoked on each routing decision, reporting whether a request was routed to local or cloud and why.

  • Type: RouteCallback
  • Default: undefined
type RouteCallback = (info: { decision: 'local' | 'cloud'; reason: string }) => void;

Example:

const client = createClient({
  local: 'auto',
  cloud: process.env.OPENAI_API_KEY,
  onRoute: ({ decision, reason }) => {
    console.log(`Routed to ${decision}: ${reason}`);
  }
});

onError

Callback invoked when the local backend fails to initialize (e.g., WebGPU not available, model load failure). Useful for logging or fallback UI.

  • Type: (error: Error) => void
  • Default: undefined

Example:

const client = createClient({
  local: 'auto',
  cloud: process.env.OPENAI_API_KEY,
  onError: (error) => {
    console.error('Local backend failed:', error.message);
    // Show fallback UI or switch to cloud-only mode
  }
});

Return Value

Returns a WebLLMClient instance with chat completions API and lifecycle methods.
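A hedged sketch of the intended call shape, assuming the OpenAI-compatible convention the SDK follows elsewhere. The interface and stub below are our own stand-ins so the snippet is self-contained; the real surface is documented in the WebLLMClient reference:

```typescript
interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string }

// Assumed shape of the returned client's chat completions surface.
interface WebLLMClientLike {
  chat: {
    completions: {
      create(req: { messages: ChatMessage[] }): Promise<{ choices: { message: ChatMessage }[] }>;
    };
  };
}

// Stub standing in for createClient(...), just to demonstrate the call shape.
const client: WebLLMClientLike = {
  chat: {
    completions: {
      async create({ messages }) {
        return { choices: [{ message: { role: 'assistant', content: `echo: ${messages[0].content}` } }] };
      },
    },
  },
};

async function main() {
  const res = await client.chat.completions.create({
    messages: [{ role: 'user', content: 'Hello!' }],
  });
  console.log(res.choices[0].message.content);
}
main();
```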

Requirements

At least one of local or cloud must be configured. If both are disabled, the client will throw an error during initialization.
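The requirement can be expressed as a small validation check. This is an illustration of the rule, not the SDK's actual code; the real error message may differ:

```typescript
// Mirrors the createClient requirement: at least one engine must be configured.
function assertEngineConfigured(options: { local?: unknown; cloud?: unknown }): void {
  const localDisabled = options.local === false || options.local == null;
  const cloudDisabled = options.cloud == null;
  if (localDisabled && cloudDisabled) {
    throw new Error('createClient: configure at least one of `local` or `cloud`');
  }
}

assertEngineConfigured({ local: 'auto' });          // ok
assertEngineConfigured({ cloud: 'sk-...' });        // ok
// assertEngineConfigured({ local: false });        // would throw
```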

Examples

Local only (auto model selection)

import { createClient } from '@webllm-io/sdk';

const client = createClient({ local: 'auto' });
// Uses local inference with auto device-based model selection

Cloud only

const client = createClient({
  local: false,
  cloud: process.env.OPENAI_API_KEY
});

Dual engine with progress tracking

const client = createClient({
  local: {
    tiers: {
      high: 'Qwen3-8B-q4f16_1-MLC',
      medium: 'Qwen2.5-3B-Instruct-q4f16_1-MLC',
      low: 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC'
    },
    useCache: true,
    useWebWorker: true
  },
  cloud: {
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  },
  onProgress: (progress) => {
    console.log(`${progress.stage}: ${Math.round(progress.progress * 100)}%`);
  }
});

Custom provider functions

import { mlc, fetchSSE } from '@webllm-io/sdk/providers';
const client = createClient({
local: mlc({
model: 'Qwen3-8B-q4f16_1-MLC',
useWebWorker: true
}),
cloud: fetchSSE({
baseURL: 'https://api.openai.com/v1',
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-4o-mini'
})
});

Dynamic model selection

const client = createClient({
  local: (stats) => {
    if (stats.vram > 8000) return 'Qwen3-8B-q4f16_1-MLC';
    if (stats.vram > 4000) return 'Qwen2.5-3B-Instruct-q4f16_1-MLC';
    return 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC';
  },
  cloud: {
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini'
  }
});

See Also