DeepSeek-R1 WebGPU: single HTML/JavaScript page with IndexedDB Document Storage and RAG
Open the console (Shift-Ctrl-I) for more info. This single-page HTML/JavaScript browser LLM is too big for a cell phone. If you don't want to download all of
huggingface.co/onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX, you should probably close this page.
After the first download, the model loads from cache. Uses the DeepSeek-R1 model through Transformers.js on WebGPU, or other models:
Data warning: ~1.4 GB is saved to the cache for the LLM.
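Since the download is large and the page depends on WebGPU, it can help to check for WebGPU support before fetching any weights. The snippet below is a minimal sketch, not the page's actual code: `pickDevice` is a hypothetical helper, and the `'wasm'` fallback label is an assumption about what a page like this might use when WebGPU is missing.

```javascript
// Sketch: decide on a backend before starting the ~1.4 GB download.
// `pickDevice` and the 'wasm' fallback are illustrative assumptions.
function pickDevice(nav) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  return nav && nav.gpu ? 'webgpu' : 'wasm';
}

// In a browser, pass the global navigator; elsewhere it may be undefined.
const device = pickDevice(typeof navigator !== 'undefined' ? navigator : undefined);
console.log(`Selected device: ${device}`);
```

Taking the navigator object as a parameter keeps the check easy to exercise outside a browser.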
Click for hyperparameters (which may or may not work):
Closer to 0: more focused; closer to 1: more variety.
Closer to 0: more predictable; closer to 1: more diverse.
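To see why a value near 0 gives more focused output, here is a minimal sketch that assumes one of these settings behaves like a sampling temperature (the page does not name the parameters here, so that is an assumption). `softmaxWithTemperature` is a hypothetical helper and the logits are made-up numbers.

```javascript
// Sketch: temperature-scaled softmax over made-up token scores (logits).
// A lower temperature concentrates probability on the top-scoring token.
function softmaxWithTemperature(logits, temperature) {
  const t = Math.max(temperature, 1e-6); // guard against division by zero
  const scaled = logits.map((x) => x / t);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const logits = [2.0, 1.0, 0.5];
const focused = softmaxWithTemperature(logits, 0.1); // close to 0
const varied = softmaxWithTemperature(logits, 1.0);  // close to 1
console.log({ focused, varied });
```

With the low setting almost all of the probability goes to the first token; with the higher setting the other tokens keep a meaningful share, which is where the extra variety comes from.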
LLM Loading progress: 0%
Rendered Output:
...
Embedding Model and RAG Settings
To enable RAG, you need to load an Embedding Model first. This model converts text into numerical vectors (embeddings)
which are used to find relevant documents.
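As a rough illustration of how embeddings are used to find relevant documents, the sketch below ranks stored documents by cosine similarity to a query vector. It assumes each document is an object with `text` and `embedding` fields; those field names, `cosineSimilarity`, and `topKDocuments` are hypothetical, and the two-dimensional vectors are toy data rather than real model output.

```javascript
// Sketch: rank documents by cosine similarity to a query embedding.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

function topKDocuments(queryEmbedding, docs, k = 3) {
  return docs
    .filter((d) => Array.isArray(d.embedding)) // skip documents not yet embedded
    .map((d) => ({ doc: d, score: cosineSimilarity(queryEmbedding, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: real embeddings have hundreds of dimensions.
const docs = [
  { text: 'A', embedding: [1, 0] },
  { text: 'B', embedding: [0, 1] },
  { text: 'C', embedding: [0.7, 0.7] },
  { text: 'D' }, // no embedding yet
];
const results = topKDocuments([1, 0], docs, 2);
console.log(results.map((r) => `${r.doc.text}: ${r.score.toFixed(3)}`));
```

A linear scan like this is usually fine for the small document sets a browser store holds; a larger collection would call for an index.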
Data warning: ~50 MB is saved to the cache for the Embedding Model.
Embedding Model Loading Progress: 0%
(These documents will be prepended to your prompt as context.)
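One way to prepend retrieved documents to a prompt is sketched below. The `Context:` / `Question:` layout and the `buildPrompt` helper are assumptions for illustration; the page's actual prompt template is not shown here.

```javascript
// Sketch: prepend retrieved documents to the user's question as context.
// The "Context:" / "Question:" layout is an assumed template.
function buildPrompt(question, contextDocs) {
  if (contextDocs.length === 0) return question; // nothing retrieved: plain prompt
  const context = contextDocs
    .map((d, i) => `[Document ${i + 1}] ${d.text}`)
    .join('\n\n');
  return `Context:\n${context}\n\nQuestion: ${question}`;
}

const prompt = buildPrompt('What is stored locally?', [
  { text: 'Documents are kept in IndexedDB.' },
  { text: 'Embeddings are generated when the embedding model is loaded.' },
]);
console.log(prompt);
```

Keeping the context short matters with a 1.5B model, since every prepended document uses up part of the context window.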
Local Document Storage (IndexedDB)
Store text documents locally in your browser. These documents can be copied into the prompt to provide context to the LLM.
With the Embedding Model loaded, new documents will automatically have embeddings generated for RAG.
If you have existing documents without embeddings, load the embedding model and then clear and re-add them; otherwise, the app will attempt to re-embed them on load.
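The re-embedding step could look roughly like the sketch below. It assumes documents are plain objects with `text` and an optional `embedding` array, and that the embedding model is exposed as an async `embed(text)` function; all of these names are hypothetical, and writing the updated documents back to IndexedDB is left out.

```javascript
// Sketch: find stored documents that lack embeddings and fill them in.
function findDocsMissingEmbeddings(docs) {
  return docs.filter((d) => !Array.isArray(d.embedding) || d.embedding.length === 0);
}

// `embed` is an assumed async function: text in, numeric vector out.
async function reembedMissing(docs, embed) {
  const missing = findDocsMissingEmbeddings(docs);
  for (const doc of missing) {
    doc.embedding = await embed(doc.text); // one at a time to limit memory use
  }
  return missing.length; // number of documents that were re-embedded
}

const stored = [
  { id: 1, text: 'has a vector', embedding: [0.1, 0.2] },
  { id: 2, text: 'added before the model was loaded' },
  { id: 3, text: 'empty vector', embedding: [] },
];
console.log(findDocsMissingEmbeddings(stored).map((d) => d.id));
```

A call such as `await reembedMissing(stored, embed)` would then run once the embedding model has finished loading.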