oobabooga/text-generation-webui: A gradio web UI for running Large Language Models
https://github.com/oobabooga/text-generation-webui
A Gradio-based web UI for running large language models such as LLaMA, GPT-J, Pythia, OPT, GALACTICA, and RWKV, including LLaMA through llama.cpp and 4-bit GPTQ. The interface offers a dropdown menu for switching between models and three modes: a notebook mode resembling OpenAI's playground, a chat mode for conversation and role-playing, and an Instruct mode compatible with the Alpaca, Vicuna, and Open Assistant formats. Output features include HTML rendering for GPT-4chan, Markdown output with LaTeX rendering for GALACTICA, and efficient text streaming. Advanced chat features include custom characters, sending images, and audio responses through TTS. To fit models onto the available hardware, it supports 8-bit mode, splitting layers across GPU(s), CPU, and disk, CPU mode, FlexGen, and DeepSpeed ZeRO-3. It also provides parameter presets, an API with and without streaming, LoRA loading and training, soft prompts, and extensions.
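To make the Instruct mode's Alpaca compatibility concrete, here is a minimal Python sketch of how an Alpaca-style prompt is typically assembled. The template wording follows the no-input prompt published with Stanford Alpaca. The names `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical and are not part of text-generation-webui, whose own template files may differ.

```python
# Illustrative only: ALPACA_TEMPLATE and build_alpaca_prompt are hypothetical
# names, not functions or constants from text-generation-webui.

# Assumption: this is the "no input" prompt wording from Stanford Alpaca.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style instruct template."""
    # Strip surrounding whitespace so the section markers stay aligned.
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize what a LoRA adapter is."))
```

The model's reply is expected to continue after the final `### Response:` marker, which is why the prompt ends there. Vicuna and Open Assistant use different markers, so each format would need its own template.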