abi / secret-llama
- Tuesday, May 7, 2024 at 00:00:02
Fully private LLM chatbot that runs entirely in the browser, with no server needed. Supports Mistral and Llama 3.
Entirely-in-browser, fully private LLM chatbot supporting Llama 3, Mistral and other open source models.
Big thanks to the inference engine provided by webllm.
Join us on Discord
To run this, you need a modern browser with support for WebGPU. According to caniuse, WebGPU is supported on:
It's also available in Firefox, but it needs to be enabled manually through the dom.webgpu.enabled flag. Safari on macOS also has experimental WebGPU support, which can be enabled through the WebGPU experimental feature.
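If you want to check for WebGPU before loading a model, feature detection might look roughly like the sketch below. It is not taken from the secret-llama source. `exposesWebGPU`, `canUseWebGPU`, `GpuLike` and `NavigatorLike` are hypothetical names used only for illustration. The one real API it relies on is the standard `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Hypothetical helper types: a minimal structural view of the browser's
// navigator, so the sketch compiles without the @webgpu/types package.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// True when the browser exposes the WebGPU entry point at all.
// A disabled flag (e.g. in Firefox) typically leaves `gpu` undefined.
function exposesWebGPU(nav: NavigatorLike): boolean {
  return nav.gpu !== undefined;
}

// True only when an adapter can actually be obtained. The entry point
// can exist while no usable adapter is available on the machine.
async function canUseWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!exposesWebGPU(nav)) return false;
  try {
    const adapter = await nav.gpu!.requestAdapter();
    return adapter != null;
  } catch {
    // Treat any failure while requesting an adapter as "not usable".
    return false;
  }
}
```

In a browser you would pass the global `navigator` object. Taking it as a parameter keeps the check easy to exercise outside a browser.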
In addition to WebGPU support, various models might have specific RAM requirements.
You can try it here.
To compile the React code yourself, download the repo and then run:
yarn
yarn build-and-preview
If you're looking to make changes, run the development environment with live reload:
yarn
yarn dev
| Model | Model Size | 
|---|---|
| TinyLlama-1.1B-Chat-v0.4-q4f32_1-1k | 600MB | 
| Llama-3-8B-Instruct-q4f16_1 ⭐ | 4.3GB | 
| Phi1.5-q4f16_1-1k | 1.2GB | 
| Mistral-7B-Instruct-v0.2-q4f16_1 ⭐ | 4GB | 
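As a rough illustration of how the sizes above could guide model choice, the sketch below filters the table by a size budget. It is an assumption-laden example, not code from the project. `ModelEntry`, `MODELS` and `modelsWithinBudget` are hypothetical names, sizes are converted using 1 GB = 1000 MB, and the listed model size is treated only as a loose proxy for memory needs, since actual RAM use at runtime may be higher.

```typescript
// Hypothetical data structure mirroring the table above.
interface ModelEntry {
  name: string;
  sizeMB: number; // listed model size, assuming 1 GB = 1000 MB
}

const MODELS: ModelEntry[] = [
  { name: "TinyLlama-1.1B-Chat-v0.4-q4f32_1-1k", sizeMB: 600 },
  { name: "Llama-3-8B-Instruct-q4f16_1", sizeMB: 4300 },
  { name: "Phi1.5-q4f16_1-1k", sizeMB: 1200 },
  { name: "Mistral-7B-Instruct-v0.2-q4f16_1", sizeMB: 4000 },
];

// Returns the models whose listed size fits the budget, largest first.
// `filter` creates a new array, so sorting does not mutate MODELS.
function modelsWithinBudget(
  budgetMB: number,
  models: ModelEntry[] = MODELS,
): ModelEntry[] {
  return models
    .filter((m) => m.sizeMB <= budgetMB)
    .sort((a, b) => b.sizeMB - a.sizeMB);
}
```

For example, a budget of 2000 MB leaves only the Phi1.5 and TinyLlama entries, in that order.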
We would love contributions to improve the interface, support more models, speed up initial model loading time and fix bugs.
Check out screenshot to code and Pico - AI-powered app builder