Structured Output

The withJsonOutput() helper configures a chat completion request for JSON structured output. It works with both local (MLC) and cloud providers.

Usage

```ts
import { createClient, withJsonOutput } from '@webllm-io/sdk';

const client = createClient({
  local: 'auto',
  cloud: { baseURL: 'https://api.example.com/v1', apiKey: 'sk-...' },
});

const res = await client.chat.completions.create(
  withJsonOutput({
    messages: [
      {
        role: 'user',
        content: 'Return a JSON object with fields: name (string), age (number)',
      },
    ],
  }),
);

const data = JSON.parse(res.choices[0].message.content);
console.log(data.name, data.age);
```

How It Works

withJsonOutput() sets response_format: { type: 'json_object' } on the request object. This is supported by both the MLC engine (local) and OpenAI-compatible cloud APIs.

```ts
function withJsonOutput<T extends ChatCompletionRequest>(req: T): T {
  return {
    ...req,
    response_format: { type: 'json_object' },
  };
}
```

With Streaming

Structured output works with streaming too. Accumulate chunks and parse the final result:

```ts
const stream = await client.chat.completions.create(
  withJsonOutput({
    messages: [{ role: 'user', content: 'Return JSON: { "items": ["a","b","c"] }' }],
    stream: true,
  }),
);

let content = '';
for await (const chunk of stream) {
  content += chunk.choices[0]?.delta?.content ?? '';
}

const result = JSON.parse(content);
```

Tips

  • Always instruct the model to return JSON in the message content — response_format alone may not be sufficient for all models.
  • Wrap JSON.parse() in a try/catch — models can occasionally produce invalid JSON.
  • Smaller local models (grade C devices) may produce less reliable JSON. Consider using cloud fallback for critical structured output tasks.
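The try/catch tip can be packaged as a small helper that returns null on parse failure, leaving the caller to decide whether to retry or fall back. A minimal sketch — `parseJsonOrNull` is a hypothetical name, not part of the SDK:

```typescript
// Hypothetical helper, not part of the SDK: returns null instead of throwing
// when the model emits invalid JSON.
function parseJsonOrNull<T>(text: string): T | null {
  try {
    return JSON.parse(text) as T;
  } catch {
    return null;
  }
}

const data = parseJsonOrNull<{ name: string; age: number }>('{"name":"Ada","age":36}');
if (data === null) {
  // Fall back: retry the request, or route it to a cloud model.
} else {
  console.log(data.name, data.age); // → Ada 36
}
```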