Cache Management
WebLLM.io provides two utilities for managing the local model cache stored in OPFS (Origin Private File System). These functions delegate to @mlc-ai/web-llm’s cache management system and are only available when the MLC peer dependency is installed.
Functions
hasModelInCache()
Check if a specific model is cached in OPFS.
Signature
```typescript
async function hasModelInCache(modelId: string): Promise<boolean>;
```
Parameters
- `modelId` - MLC model identifier (e.g., `'Llama-3.1-8B-Instruct-q4f16_1-MLC'`)
Return Value
- Returns `true` if the model is fully cached in OPFS
- Returns `false` if:
  - The model is not cached
  - The model is partially cached (incomplete download)
  - `@mlc-ai/web-llm` is not installed
Examples
```typescript
import { hasModelInCache } from '@webllm-io/sdk';

// Check if model is cached before loading
const modelId = 'Llama-3.1-8B-Instruct-q4f16_1-MLC';
const isCached = await hasModelInCache(modelId);

if (isCached) {
  console.log('Model is cached, loading will be fast');
} else {
  console.log('Model will be downloaded (~4-8GB)');
}

// Show cache status in UI
async function updateCacheStatus() {
  const models = [
    'Llama-3.1-8B-Instruct-q4f16_1-MLC',
    'Qwen2.5-3B-Instruct-q4f16_1-MLC',
    'Qwen2.5-1.5B-Instruct-q4f16_1-MLC'
  ];

  for (const model of models) {
    const cached = await hasModelInCache(model);
    console.log(`${model}: ${cached ? '✓ Cached' : '✗ Not cached'}`);
  }
}
```
deleteModelFromCache()
Remove a specific model from the OPFS cache to free up storage.
Signature
```typescript
async function deleteModelFromCache(modelId: string): Promise<void>;
```
Parameters
- `modelId` - MLC model identifier to delete
Return Value
Returns a Promise that resolves when deletion completes.
Errors
- Throws if `@mlc-ai/web-llm` is not installed
- Throws if OPFS access fails (e.g., the browser doesn't support OPFS)
- Silently succeeds if the model is not cached
Examples
```typescript
import { deleteModelFromCache } from '@webllm-io/sdk';

// Delete a specific model
await deleteModelFromCache('Llama-3.1-8B-Instruct-q4f16_1-MLC');
console.log('Model deleted from cache');

// Clear all cached models
async function clearAllCachedModels() {
  const models = [
    'Llama-3.1-8B-Instruct-q4f16_1-MLC',
    'Qwen2.5-3B-Instruct-q4f16_1-MLC',
    'Qwen2.5-1.5B-Instruct-q4f16_1-MLC'
  ];

  for (const model of models) {
    try {
      await deleteModelFromCache(model);
      console.log(`Deleted: ${model}`);
    } catch (error) {
      console.error(`Failed to delete ${model}:`, error);
    }
  }
}
```
Complete Examples
Preload check with user confirmation
```typescript
import { hasModelInCache, createClient } from '@webllm-io/sdk';

async function initializeWithCacheCheck() {
  const modelId = 'Llama-3.1-8B-Instruct-q4f16_1-MLC';
  const isCached = await hasModelInCache(modelId);

  if (!isCached) {
    const confirmed = confirm(
      `Model not cached. Download ~6GB?\n\nThis may take several minutes.`
    );

    if (!confirmed) {
      console.log('User cancelled download');
      return null;
    }
  }

  const client = createClient({
    local: { model: modelId },
    onProgress: (progress) => {
      console.log(`${progress.stage}: ${Math.round(progress.progress * 100)}%`);
    }
  });

  await client.init();
  return client;
}
```
Cache size estimation
```typescript
import { hasModelInCache } from '@webllm-io/sdk';

async function estimateCacheSize() {
  const modelSizes = {
    'Llama-3.1-8B-Instruct-q4f16_1-MLC': 6000, // ~6GB
    'Qwen2.5-3B-Instruct-q4f16_1-MLC': 3000,   // ~3GB
    'Qwen2.5-1.5B-Instruct-q4f16_1-MLC': 1500  // ~1.5GB
  };

  let totalSize = 0;

  for (const [model, size] of Object.entries(modelSizes)) {
    const cached = await hasModelInCache(model);
    if (cached) {
      totalSize += size;
      console.log(`${model}: ${size}MB`);
    }
  }

  console.log(`Total cache size: ~${totalSize}MB`);
  return totalSize;
}
```
Selective cache cleanup
```typescript
import { hasModelInCache, deleteModelFromCache } from '@webllm-io/sdk';

async function cleanupOldModels(keepModel: string) {
  const allModels = [
    'Llama-3.1-8B-Instruct-q4f16_1-MLC',
    'Qwen2.5-3B-Instruct-q4f16_1-MLC',
    'Qwen2.5-1.5B-Instruct-q4f16_1-MLC'
  ];

  for (const model of allModels) {
    if (model === keepModel) continue;

    const cached = await hasModelInCache(model);
    if (cached) {
      await deleteModelFromCache(model);
      console.log(`Deleted: ${model}`);
    }
  }
}

// Keep only the 1.5B model, delete others
await cleanupOldModels('Qwen2.5-1.5B-Instruct-q4f16_1-MLC');
```
Cache status dashboard
```tsx
import { hasModelInCache, deleteModelFromCache } from '@webllm-io/sdk';
import { useState, useEffect } from 'react';

interface CacheEntry {
  model: string;
  cached: boolean;
  size: number;
}

function CacheDashboard() {
  const [entries, setEntries] = useState<CacheEntry[]>([]);

  useEffect(() => {
    loadCacheStatus();
  }, []);

  async function loadCacheStatus() {
    const models = [
      { model: 'Llama-3.1-8B-Instruct-q4f16_1-MLC', size: 6000 },
      { model: 'Qwen2.5-3B-Instruct-q4f16_1-MLC', size: 3000 },
      { model: 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC', size: 1500 }
    ];

    const results = await Promise.all(
      models.map(async ({ model, size }) => ({
        model,
        size,
        cached: await hasModelInCache(model)
      }))
    );

    setEntries(results);
  }

  async function handleDelete(model: string) {
    await deleteModelFromCache(model);
    await loadCacheStatus(); // Refresh
  }

  return (
    <div>
      <h2>Model Cache</h2>
      <ul>
        {entries.map((entry) => (
          <li key={entry.model}>
            <span>{entry.model}</span>
            <span>{entry.cached ? '✓ Cached' : '✗ Not cached'}</span>
            <span>{entry.size}MB</span>
            {entry.cached && (
              <button onClick={() => handleDelete(entry.model)}>
                Delete
              </button>
            )}
          </li>
        ))}
      </ul>
      <p>
        Total: {entries.filter((e) => e.cached).reduce((sum, e) => sum + e.size, 0)}MB
      </p>
    </div>
  );
}
```
Switch models with cleanup
```typescript
import { hasModelInCache, deleteModelFromCache, createClient } from '@webllm-io/sdk';
import type { WebLLMClient } from '@webllm-io/sdk';

async function switchModel(
  currentClient: WebLLMClient | null,
  newModelId: string
) {
  // Dispose current client
  if (currentClient) {
    await currentClient.dispose();
  }

  // Check if new model is cached
  const isCached = await hasModelInCache(newModelId);

  if (!isCached) {
    console.log(`Downloading ${newModelId}...`);
  }

  // Create new client with new model
  const newClient = createClient({
    local: { model: newModelId },
    onProgress: (progress) => {
      console.log(`Loading: ${Math.round(progress.progress * 100)}%`);
    }
  });

  await newClient.init();
  return newClient;
}

// Usage
let client = await switchModel(null, 'Llama-3.1-8B-Instruct-q4f16_1-MLC');

// Later: switch to smaller model and delete old one
const oldModel = 'Llama-3.1-8B-Instruct-q4f16_1-MLC';
client = await switchModel(client, 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC');
await deleteModelFromCache(oldModel);
```
Conditional cache preloading
```typescript
import { hasModelInCache, checkCapability, createClient } from '@webllm-io/sdk';

async function smartInitialize() {
  const cap = await checkCapability();

  // Determine best model for device
  let modelId: string;
  switch (cap.grade) {
    case 'S':
      modelId = 'Llama-3.1-8B-Instruct-q4f16_1-MLC';
      break;
    case 'A':
      modelId = 'Qwen2.5-3B-Instruct-q4f16_1-MLC';
      break;
    default:
      modelId = 'Qwen2.5-1.5B-Instruct-q4f16_1-MLC';
  }

  // Check cache status
  const isCached = await hasModelInCache(modelId);

  // Decide whether to use local or cloud
  const useLocal = isCached || cap.connection.effectiveType === '4g';

  const client = createClient({
    local: useLocal ? { model: modelId } : false,
    cloud: !useLocal ? process.env.OPENAI_API_KEY : undefined
  });

  return client;
}
```
Browser Compatibility
Both functions require:
- OPFS support - Chrome 102+, Edge 102+, Safari 15.2+
- `@mlc-ai/web-llm` installed as a peer dependency
If `@mlc-ai/web-llm` is not installed:
- `hasModelInCache()` returns `false`
- `deleteModelFromCache()` throws an error
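For example, a small guard can feature-detect OPFS (exposed through `navigator.storage.getDirectory()`) and catch the error path before attempting a deletion. This is a sketch only; `safeDeleteModel()` is a hypothetical helper, not part of the SDK:

```typescript
import { deleteModelFromCache } from '@webllm-io/sdk';

// Hypothetical guard: skip deletion when OPFS is unavailable, and surface
// a friendly message if the peer dependency or OPFS access fails.
async function safeDeleteModel(modelId: string): Promise<boolean> {
  const opfsSupported =
    typeof navigator !== 'undefined' &&
    'storage' in navigator &&
    'getDirectory' in navigator.storage;

  if (!opfsSupported) {
    console.warn('OPFS not supported in this browser; nothing to delete');
    return false;
  }

  try {
    await deleteModelFromCache(modelId);
    return true;
  } catch (error) {
    // Thrown if @mlc-ai/web-llm is not installed or OPFS access fails
    console.error(`Could not delete ${modelId}:`, error);
    return false;
  }
}
```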
Storage Considerations
OPFS Storage Limits
- Chrome/Edge: ~60% of available disk space
- Firefox: ~50% of available disk space
- Safari: ~1GB default, can request more
Check available storage:
```typescript
if ('storage' in navigator && 'estimate' in navigator.storage) {
  const estimate = await navigator.storage.estimate();
  // usage and quota are reported in bytes, so convert to MB before logging
  console.log(`Used: ${Math.round((estimate.usage ?? 0) / 1024 / 1024)}MB`);
  console.log(`Available: ${Math.round((estimate.quota ?? 0) / 1024 / 1024)}MB`);
}
```
Model Sizes
Typical MLC model sizes:
- Llama-3.1-8B (q4f16_1): ~6GB
- Qwen2.5-3B (q4f16_1): ~3GB
- Qwen2.5-1.5B (q4f16_1): ~1.5GB
- Phi-3.5-mini (q4f16_1): ~2.5GB
Plan storage usage accordingly.
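One way to plan ahead is to compare the remaining quota reported by `navigator.storage.estimate()` (see the snippet above) against the approximate model size before starting a download. A minimal sketch, using the illustrative sizes from the list above and a hypothetical `hasRoomForModel()` helper:

```typescript
// Approximate sizes in MB, taken from the list above (illustrative values)
const APPROX_MODEL_SIZE_MB: Record<string, number> = {
  'Llama-3.1-8B-Instruct-q4f16_1-MLC': 6000,
  'Qwen2.5-3B-Instruct-q4f16_1-MLC': 3000,
  'Qwen2.5-1.5B-Instruct-q4f16_1-MLC': 1500
};

// Hypothetical helper: returns true if the remaining storage quota is
// comfortably larger than the model we are about to download.
async function hasRoomForModel(modelId: string): Promise<boolean> {
  if (!('storage' in navigator) || !('estimate' in navigator.storage)) {
    return true; // Cannot estimate; let the download attempt decide
  }

  const { usage = 0, quota = 0 } = await navigator.storage.estimate();
  const freeMB = (quota - usage) / 1024 / 1024;
  const neededMB = APPROX_MODEL_SIZE_MB[modelId] ?? 0;

  // Leave ~20% headroom for temporary files during download
  return freeMB > neededMB * 1.2;
}
```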
Implementation Notes
Delegation to @mlc-ai/web-llm
Both functions are thin wrappers around @mlc-ai/web-llm APIs:
```typescript
import * as webllm from '@mlc-ai/web-llm';

async function hasModelInCache(modelId: string): Promise<boolean> {
  try {
    return await webllm.hasModelInCache(modelId);
  } catch {
    return false; // If web-llm not installed
  }
}

async function deleteModelFromCache(modelId: string): Promise<void> {
  return await webllm.deleteModelAllInfoInCache(modelId);
}
```
When to Use
Use hasModelInCache() when:
- Showing cache status in UI
- Deciding whether to download a model
- Estimating load time
- Preloading models in the background
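For the background-preloading case, one pattern is to warm the OPFS cache during idle time using the `createClient()`/`init()` flow from the Complete Examples, then dispose of the client; the downloaded weights stay in the persistent cache. This is a sketch under those assumptions; `preloadModelWhenIdle()` is a hypothetical helper:

```typescript
import { hasModelInCache, createClient } from '@webllm-io/sdk';

// Hypothetical helper: warm the cache for a model during browser idle time
// so the first real request loads quickly.
function preloadModelWhenIdle(modelId: string) {
  const preload = async () => {
    if (await hasModelInCache(modelId)) return; // Already cached, nothing to do

    const client = createClient({ local: { model: modelId } });
    await client.init();    // Downloads and caches the model in OPFS
    await client.dispose(); // Release the engine; the OPFS cache persists
  };

  // requestIdleCallback is not available in all browsers,
  // so fall back to a simple timeout.
  if ('requestIdleCallback' in window) {
    requestIdleCallback(() => { void preload(); });
  } else {
    setTimeout(() => { void preload(); }, 2000);
  }
}
```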
Use deleteModelFromCache() when:
- User explicitly requests cache cleanup
- Freeing storage for new models
- Implementing cache eviction policies
- Resetting application state
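Eviction policies need to know when each model was last used; the sketch below assumes the app records that itself (here in `localStorage` under a hypothetical key) and evicts the least-recently-used cached models first:

```typescript
import { hasModelInCache, deleteModelFromCache } from '@webllm-io/sdk';

const LAST_USED_KEY = 'webllm-last-used'; // assumed app-level bookkeeping key

// Record that a model was just used (call this after each successful load)
function markModelUsed(modelId: string) {
  const map = JSON.parse(localStorage.getItem(LAST_USED_KEY) ?? '{}');
  map[modelId] = Date.now();
  localStorage.setItem(LAST_USED_KEY, JSON.stringify(map));
}

// Delete the oldest cached models until at most `maxModels` remain
async function evictLeastRecentlyUsed(candidates: string[], maxModels: number) {
  const lastUsed: Record<string, number> =
    JSON.parse(localStorage.getItem(LAST_USED_KEY) ?? '{}');

  // Collect currently cached models, oldest first
  const cached: string[] = [];
  for (const model of candidates) {
    if (await hasModelInCache(model)) cached.push(model);
  }
  cached.sort((a, b) => (lastUsed[a] ?? 0) - (lastUsed[b] ?? 0));

  while (cached.length > maxModels) {
    const victim = cached.shift()!;
    await deleteModelFromCache(victim);
    console.log(`Evicted: ${victim}`);
  }
}
```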
Do NOT use for:
- Automatic cache invalidation (models don’t expire)
- Performance optimization (cache management is already optimized)
- Detecting model updates (MLC models are immutable)
See Also
- createClient() - Client initialization with cache options
- Config Types - `useCache` configuration
- WebLLMClient - Client interface
- Local Inference Guide - OPFS cache usage