Curated knowledge about art and AI

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
https://dxli94.github.io/BLIP-Diffusion-website/

  • stable_diffusion
  • training
  • lora
  • capturing_concepts

BLIP-Diffusion is a new model for generating and editing images based on text prompts and subject images. Unlike previous models, it uses a pre-trained multimodal encoder to represent the subject, allowing for efficient fine-tuning and better preservation of subject details. The model enables the generation of new images based on text prompts and subject images, even without prior training on specific subjects. It also supports image manipulation, style transfer, and editing guided by subject images. The model is trained in two stages to learn subject representation and can be combined with other techniques for more control over the generation and editing process. Overall, BLIP-Diffusion provides a flexible and efficient approach to generate and edit images with specific subjects.

4 days ago Permalink
  • Detailed guide on training embeddings on a person's likeness : A guide on how to train embeddings with textual inversion to learn a person's likeness. The guide assumes the use of the Automatic1111 Web UI and basi...
  • How to extract small LoRA file from custom dreambooth models . Reduce your model sizes! : StableDiffusion : To extract the LoRA model from the base model using Kohya_ss, users can install the repository from GitHub and navigate to the Utilities tab, then sel...
  • Training Stable Diffusion with Dreambooth : Dreambooth is a technique used to teach new concepts to Stable Diffusion using a specialized form of fine-tuning. It is used to place people in fantas...
  • Guide to training with optimized dreambooth stable diffusion : Dreambooth on Stable Diffusion is an implementation of Google's Dreambooth with Stable Diffusion. It enables people to fine-tune a text-to-image model...
  • SD Compendium : Extensive wiki about Stable diffusion.

DreamGPT: Leverage hallucinations from Large Language Models (LLMs) for novelty-driven explorations.
https://github.com/DivergentAI/dreamGPT

  • gpt
  • autonomous
  • api

DreamGPT is described by its authors as the first GPT-based solution that uses hallucinations from LLMs for divergent thinking to generate new, innovative ideas. Hallucinations are often seen as a negative, but what if they could be used to our advantage? DreamGPT sets out to show how. Its goal is to explore as many possibilities as possible, whereas most other GPT-based solutions focus on solving specific problems.


5 days ago Permalink
  • Smartgpt: A program that provides LLMs with the ability to complete complex tasks using plugins. : SmartGPT is an experimental program meant to provide LLMs (particularly GPT-3.5 and GPT-4) with the ability to complete complex tasks without user inp...
  • CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society : The paper explores the potential of developing autonomous cooperation for conversational language models without relying heavily on human input. The p...
  • Tree-of-thought-llm : Tree of Thoughts: Deliberate Problem Solving with Large Language Models : Tree-of-Thought (ToT) aims to enhance the problem-solving capabilities of large language models (LLMs) like GPT-4. The framework utilizes a deliberate...
  • Taxy AI : Taxy AI is an organization focused on transforming the role of AI in human-computer interaction. They currently offer a self-hosted browser extension ...
  • LocalAI: Self-hosted, community-driven simple local OpenAI-compatible API : LocalAI is an API that can be used as a replacement for OpenAI, which supports various models and can run on consumer-grade hardware. It supports ggml...

Tree-of-thought-llm : Tree of Thoughts: Deliberate Problem Solving with Large Language Models
https://github.com/ysymyth/tree-of-thought-llm

  • llm
  • autonomous
  • chatgpt
  • gpt

Tree-of-Thought (ToT) aims to enhance the problem-solving capabilities of large language models (LLMs) like GPT-4. The framework uses a deliberate 'System 2' tree search to tackle complex, general problems that LLMs struggle with. The authors demonstrate significant improvements on three tasks: the Game of 24, creative writing, and crosswords, which GPT-4 with chain-of-thought (CoT) prompting finds challenging because they require planning and search. The difficulty is attributed to token-by-token decoding, which lacks lookahead, backtracking, and global exploration. ToT is reported to achieve a tenfold performance boost by leveraging the LLM's ability to generate diverse intermediate thoughts, self-evaluate them through deliberate reasoning, and apply search algorithms such as breadth-first search (BFS) or depth-first search (DFS) to explore the problem space systematically.
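
The generate, evaluate, and search loop described above can be illustrated with a small sketch. This is not the repository's code: the names `propose`, `evaluate`, and `tree_of_thought_bfs` are hypothetical, and both the thought generator and the evaluator are plain arithmetic stand-ins for what would be LLM calls in ToT. It only shows the shape of a breadth-first search that keeps the most promising partial solutions at each depth, on a Game-of-24-style puzzle.

```python
from itertools import combinations
from typing import List, Tuple

State = Tuple[float, ...]  # numbers still available in the puzzle


def propose(state: State) -> List[State]:
    """Stand-in for the LLM thought generator: merge any two numbers with one operation."""
    children: List[State] = []
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = tuple(x for k, x in enumerate(state) if k not in (i, j))
        results = {a + b, a * b, a - b, b - a}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        children.extend(rest + (r,) for r in results)
    return children


def evaluate(state: State, target: float) -> float:
    """Stand-in for LLM self-evaluation: higher means closer to the target."""
    return -min(abs(x - target) for x in state)


def tree_of_thought_bfs(start: State, target: float = 24.0, beam_width: int = 5) -> bool:
    """Breadth-first search that keeps only the best `beam_width` states per level."""
    frontier = [start]
    for _ in range(len(start) - 1):  # each level merges two numbers into one
        candidates = [child for s in frontier for child in propose(s)]
        candidates.sort(key=lambda s: evaluate(s, target), reverse=True)
        frontier = candidates[:beam_width]  # prune unpromising thoughts
    return any(abs(s[0] - target) < 1e-9 for s in frontier)


print(tree_of_thought_bfs((1, 2, 3, 4)))
```

Because the heuristic here is crude, a narrow beam can prune the only path to a solution; in ToT the quality of the evaluator is what makes the pruning safe.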

Alternative repository

5 days ago Permalink
  • Chatgpt-demo: A local client based on OpenAI API. : A minimalist node.js local client to use GPT-3.5 or GPT-4.
  • Taxy AI : Taxy AI is an organization focused on transforming the role of AI in human-computer interaction. They currently offer a self-hosted browser extension ...
  • Auto-GPT: An attempt to make GPT-4 fully autonomous. : Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains tog...
  • MiniChain: A tiny library for coding with large language models. : MiniChain is a library used to link prompts together in a sequence, with the ability to manipulate and visualize them using Gradio. Users can ensure t...
  • CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society : The paper explores the potential of developing autonomous cooperation for conversational language models without relying heavily on human input. The p...

CoDi: Any-to-Any Generation via Composable Diffusion
https://github.com/microsoft/i-Code/tree/main/i-Code-V3

  • multimodal
  • text2image
  • audio

Composable Diffusion (CoDi) is a new generative model that can create different types of outputs (like language, images, videos, or audio) from various inputs. It can generate multiple outputs at the same time and is not limited to specific types of inputs. Even without specific training data, CoDi aligns inputs and outputs to generate any combination of modalities. It uses a unique strategy to create a shared multimodal space, allowing synchronized generation of intertwined modalities.

CoDi

Demo video

1 week ago Permalink
  • Inpainting & Outpainting in Stable Diffusion : Inpainting is a technique used to regenerate parts of an image. To access the inpainting section of the WebUI, click on the img2img tab. Inpainting ha...
  • ComfyUI: A stable diffusion GUI with a graph/nodes interface : ComfyUI is a powerful and modular stable diffusion GUI and backend that enables users to design and execute advanced stable diffusion pipelines using ...
  • Multi-Concept Customization of Text-to-Image Diffusion : The paper proposes Custom Diffusion, an efficient method for augmenting existing text-to-image models to quickly acquire new concepts given a few exam...
  • AUTOMATIC1111 Stable Diffusion web UI : The Stable Diffusion WebUI offers a range of features for generating and processing images, including original txt2img and img2img modes, outpainting,...
  • IF by DeepFloyd Lab : DeepFloyd IF is a text-to-image model that utilizes the large language model T5-XXL-1.1 as a text encoder to generate intelligible and coherent image...

ControlNet v1.1: A complete guide - Stable Diffusion Art
https://stable-diffusion-art.com/controlnet/

  • stable_diffusion
  • controlnet
  • help

A comprehensive guide to ControlNet v1.1, a Stable Diffusion model that allows users to control image compositions and human poses based on reference images. The guide covers various aspects of ControlNet, including its installation on different platforms (Windows, Mac, Google Colab), settings, and common use cases.

1 week ago Permalink
  • SD Toolset : A mini wiki or cheat sheet for Stable Diffusion. The purpose of the mini wiki is to provide a brief explanation and relevant links to tools that could...
  • SD Compendium : Extensive wiki about Stable diffusion.
  • SD-CN-Animation: Automate video stylization task using StableDiffusion and ControlNet : The SD-CN-Animation project offers automated video stylization and text-to-video generation using StableDiffusion and ControlNet. It provides the abil...
  • UnpromptedControl: Remove unwanted objects and restore images without prompts, powered by ControlNet. : UnpromptedControl is a tool used for guiding StableDiffusion models in image restoration and object removal tasks. By leveraging a simple hack, it all...
  • The Illustrated Stable Diffusion : An introduction to Stable Diffusion and its components. Stable Diffusion is a system consisting of several components and models that can be used for ...

Creating a Consistent Character as a Textual Inversion Embedding with Stable Diffusion
https://github.com/BelieveDiffusion/tutorials/blob/main/consistent_character_embedding/README.md

  • stable_diffusion
  • help
  • embeddings

The tutorial provides a comprehensive guide on creating consistent characters using Stable Diffusion (SD) and a Textual Inversion embedding. It outlines a five-step process, including generating input images, filtering them based on desired attributes, tagging them for training, training the embedding, and selecting a validated iteration. The tutorial emphasizes the importance of generating high-quality input images, filtering out unwanted variations, and fine-tuning the selection to achieve consistency. By following this tutorial, users can learn how to generate consistent characters with SD and create an embedding that reliably recreates the desired character across different poses, hairstyles, body types, and prompts.

Consistent Character

1 week ago Permalink
  • Detailed guide on training embeddings on a person's likeness : A guide on how to train embeddings with textual inversion to learn a person's likeness. The guide assumes the use of the Automatic1111 Web UI and basi...
  • Stable Diffusion v1.4 - Steps, CFG scale, and Seed resize exploration : This grid presents an exploration of Stable Diffusion v1.4 using the Euler ancestral sampler for the same prompt and seed, but different steps and CFG...
  • SD Hypertextbook : A hypertextbook (guide) on setting up and using Stable Diffusion, a deep learning model. The hypertextbook is organized into chapters, starting with ...
  • Stable Diffusion Prompt book : Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. This prompt book by OpenArt ai...
  • AI Art Panic | Opinionated Guides : Artificial intelligence (AI) has advanced to the point where it is capable of generating art that is often better than what many human artists can cre...

Smartgpt: A program that provides LLMs with the ability to complete complex tasks using plugins.
https://github.com/Cormanz/smartgpt

  • autonomous
  • gpt
  • api

SmartGPT is an experimental program meant to provide LLMs (particularly GPT-3.5 and GPT-4) with the ability to complete complex tasks without user input by breaking them down into smaller problems, and collecting information using the internet and other external sources.

Smartgpt

1 week ago Permalink
  • DreamGPT: Leverage hallucinations from Large Language Models (LLMs) for novelty-driven explorations. : DreamGPT, the first GPT-based solution that uses hallucinations from LLMs for divergent thinking to generate new innovative ideas. Hallucinations are...
  • Tree-of-thought-llm : Tree of Thoughts: Deliberate Problem Solving with Large Language Models : Tree-of-Thought (ToT) aims to enhance the problem-solving capabilities of large language models (LLMs) like GPT-4. The framework utilizes a deliberate...
  • Taxy AI : Taxy AI is an organization focused on transforming the role of AI in human-computer interaction. They currently offer a self-hosted browser extension ...
  • CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society : The paper explores the potential of developing autonomous cooperation for conversational language models without relying heavily on human input. The p...
  • GPT-assistant: An experiment to give an autonomous GPT agent access to a browser and have it accomplish tasks : An experiment where an autonomous GPT (Generative Pre-trained Transformer) agent is given access to a browser to perform tasks. For example adding tex...

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
https://vcai.mpi-inf.mpg.de/projects/DragGAN/

  • gan
  • interactive

DragGAN is a novel approach for controlling generative adversarial networks (GANs) in order to synthesize visual content that meets users' needs. It offers precise and flexible controllability over the pose, shape, expression, and layout of generated objects. Unlike existing methods that rely on manual annotations or 3D models, DragGAN enables interactive control by allowing users to "drag" any points of an image to reach desired positions. The approach consists of two main components: feature-based motion supervision and a point tracking approach utilizing discriminative GAN features. By utilizing DragGAN, users can manipulate diverse categories of images, such as animals, cars, humans, and landscapes, with realistic outputs even in challenging scenarios like occluded content and deforming shapes.

Drag Your GAN

1 week ago Permalink
  • No related link

Vault-ai: Give ChatGPT long-term memory using OpenAI + Pinecone Vector Database
https://github.com/pashpashpash/vault-ai

  • llm
  • chatgpt
  • training
  • text

OP Vault is a versatile platform that allows users to upload various document types through a simple React frontend, enabling the creation of a customized knowledge base. It answers questions based on the content of the uploaded documents, and users can see where each answer came from by viewing the source filenames and specific context snippets. The interface makes it easy to explore the capabilities of the OP Stack, a combination of OpenAI and the Pinecone Vector Database. OP Vault also supports large-scale uploads, making it possible to load entire libraries' worth of books and so expand the knowledge accessible through the platform. Running it requires a few manually installed dependencies: Node (v19), Go (v1.18.9 darwin/arm64), and Poppler.
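
The retrieval step behind this kind of document question answering can be sketched roughly as follows. This is only an illustration and not OP Vault's code: it makes no OpenAI or Pinecone calls, the helper names (`embed`, `cosine`, `build_index`, `retrieve`) are hypothetical, and a bag-of-words counter stands in for a real embedding model. It shows how chunks can be ranked against a question and returned with their filename and context snippet.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs: Dict[str, str], chunk_words: int = 40) -> List[Tuple[str, str, Counter]]:
    """Split each document into fixed-size word chunks and embed every chunk."""
    index = []
    for filename, text in docs.items():
        words = text.split()
        for i in range(0, len(words), chunk_words):
            snippet = " ".join(words[i:i + chunk_words])
            index.append((filename, snippet, embed(snippet)))
    return index


def retrieve(index: List[Tuple[str, str, Counter]], question: str, top_k: int = 2) -> List[Tuple[str, str]]:
    """Return the (filename, snippet) pairs most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(filename, snippet) for filename, snippet, _ in ranked[:top_k]]


docs = {
    "cats.txt": "Cats sleep for most of the day and hunt at night.",
    "volcanoes.txt": "A volcano erupts when magma rises through the crust.",
}
index = build_index(docs)
print(retrieve(index, "Why does a volcano erupt?", top_k=1))
```

In a real deployment the retrieved snippets would be passed to a language model as context for the answer, and the linear scan would be replaced by a vector database query.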

Vault-ai

Demo

2 weeks ago Permalink
  • Simple UI for LLM Model Finetuning : Simple LLM Finetuner is a beginner-friendly interface designed to make it easy to fine-tune various language models using LoRA method via the PEFT lib...
  • A list of totally open alternatives to ChatGPT : List of open alternatives to ChatGPT. ChatGPT is GPT-3.5 finetuned with RLHF (Reinforcement Learning with Human Feedback) for human instruction and ch...
  • Chatgpt-demo: A local client based on OpenAI API. : A minimalist node.js local client to use GPT-3.5 or GPT-4.
  • oobabooga/text-generation-webui: A gradio web UI for running Large Language Models : A text generation web UI built on Gradio that can run large language models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. The UI features ...
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations, inspiration.

InstructBLIP
https://github.com/salesforce/LAVIS/tree/main/projects/instructblip

  • image2text
  • llm

The paper focuses on a vision-language model called InstructBLIP and explores the process of instruction tuning. The authors collect 26 datasets and categorize them for instruction tuning and zero-shot evaluation. They also introduce a method called instruction-aware visual feature extraction. The results show that InstructBLIP achieves the best performance among all models, surpassing BLIP-2 and Flamingo. When fine-tuned for specific tasks, InstructBLIP exhibits exceptional accuracy, such as 90.7% on the ScienceQA IMG task. Through qualitative comparisons, the study highlights InstructBLIP's superiority over other multimodal models, demonstrating its importance in the field of vision-language tasks.

Instructblip

Demo

2 weeks ago Permalink
  • OpenFlamingo-9B Demo : OpenFlamingo is a new tool that helps computers learn how to understand pictures and words together. The OpenFlamingo project aims to develop a multi...
  • MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models : MiniGPT-4, a vision-language model that aligns a frozen visual encoder with a frozen large language model (LLM) using one projection layer. The author...
  • LLaVA: Large Language and Vision Assistant : The article presents a novel approach to large multimodal language models using machine-generated instruction-following data, which has shown promise ...
  • Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality : Vicuna-13B is an open-source chatbot based on a fine-tuned LLaMA base model trained on user-shared conversations collected from ShareGPT.com. The enha...
  • Image Mixer : The model described in the text is an extension of the Stable Diffusion Image Variations model that incorporates multiple CLIP image embeddings. Durin...

SD-CN-Animation: Automate video stylization task using StableDiffusion and ControlNet
https://github.com/volotat/SD-CN-Animation

  • stable_diffusion
  • controlnet
  • animation

The SD-CN-Animation project offers automated video stylization and text-to-video generation using StableDiffusion and ControlNet. It provides the ability to stylize videos automatically and generate new videos from text input, using various Stable Diffusion models as backbones. The project incorporates the 'RAFT' optical flow estimation algorithm to maintain animation stability and generate occlusion masks for frame generation. In text-to-video mode, it utilizes the 'FloweR' method for predicting optical flow from previous frames. The ControlNet model is recommended for better results in vid2vid mode.

2 weeks ago Permalink
  • ControlNet v1.1: A complete guide - Stable Diffusion Art : A comprehensive guide to ControlNet v1.1, a Stable Diffusion model that allows users to control image compositions and human poses based on reference ...
  • UnpromptedControl: Remove unwanted objects and restore images without prompts, powered by ControlNet. : UnpromptedControl is a tool used for guiding StableDiffusion models in image restoration and object removal tasks. By leveraging a simple hack, it all...
  • Safe and Stable Ckpt2Safetensors Conversion Tool GUI : Safe & Stable is a user-friendly tool designed to convert stable diffusion checkpoint files (.ckpt) to the safer and more secure .safetensors format f...
  • SD Compendium : Extensive wiki about Stable diffusion.
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations, inspiration.

ImageBind by Meta AI
https://imagebind.metademolab.com/

  • image2text
  • multimodal

ImageBind is presented as the first AI model capable of binding data from six modalities at once, without the need for explicit supervision. By recognizing the relationships between these modalities, namely images and video, audio, text, depth, thermal, and inertial measurement unit (IMU) data, it helps advance AI by enabling machines to better analyze many different forms of information together.

2 weeks ago Permalink
  • CoDi: Any-to-Any Generation via Composable Diffusion : Composable Diffusion (CoDi) is a new generative model that can create different types of outputs (like language, images, videos, or audio) from variou...
  • Image Mixer : The model described in the text is an extension of the Stable Diffusion Image Variations model that incorporates multiple CLIP image embeddings. Durin...
  • Long-form text-to-images generation (GPT-3 and Stable Diffusion) : Long Stable Diffusion is a pipeline of generative models that can be used to illustrate a full story. Currently, Stable Diffusion can only take in a s...
  • Img2prompt : Get an approximate text prompt, with style, matching an image. Optimized for stable-diffusion (CLIP ViT-L/14). The resource is an adapted version of ...
  • OpenFlamingo-9B Demo : OpenFlamingo is a new tool that helps computers learn how to understand pictures and words together. The OpenFlamingo project aims to develop a multi...

UnpromptedControl: Remove unwanted objects and restore images without prompts, powered by ControlNet.
https://github.com/vijishmadhavan/UnpromptedControl

  • stable_diffusion
  • controlnet
  • plugin

UnpromptedControl is a tool used for guiding StableDiffusion models in image restoration and object removal tasks. By leveraging a simple hack, it allows for the restoration or removal of objects without requiring user prompts, leading to enhanced process efficiency. The tool uses ControlNet and StableDiffusionInpaintPipeline models to guide the inpainting process and restore the image to a more natural-looking state. However, the algorithm currently has limitations in processing images of people's faces and bodies.

UnpromptedControl

3 weeks ago Permalink
  • Tagger - Script for AUTOMATIC1111/stable-diffusion-webui : A script for AUTOMATIC1111/stable-diffusion-webui that allows users to quickly add tags from a list to their prompt. The script adds a separate textbo...
  • openOutpaint: Canvas outpainting for stable diffusion webUI : OpenOutpaint is an intuitive and convenient outpainting tool that provides queueable, cancelable dreams, arbitrary dream reticle size, and an effectiv...
  • Prompt translate script for AUTOMATIC1111/stable-diffusion-webui : 'Prompt translate' script for AUTOMATIC1111/stable-diffusion-webui that translates prompts. This script allows you to write a prompt query in your na...
  • ControlNet extension for AUTOMATIC1111's Stable Diffusion web UI : ControlNet, a neural network structure that adds extra conditions to diffusion models to control them. ControlNet copies the weights of neural network...
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations, inspiration.

Promptsandbox.io : user-friendly visual canvas for creating chatbots powered by OpenAI APIs with no-code
https://github.com/eg9y/promptsandbox.io

  • chatgpt
  • prompt
  • visual_programming

Promptsandbox.io is a node-based visual canvas used for creating chatbots powered by OpenAI APIs. The platform has an intuitive drag-and-drop interface for creating dynamic chains of nodes that perform specific operations as part of the workflow. It is built using React and has various features such as integration with OpenAI APIs, document upload and retrieval, support for various block types, debugging tools, a chatbot gallery, and extensibility with additional node types. Promptsandbox.io provides a seamless experience for users to work with OpenAI APIs and build more complex chatbots.

Promptsandbox

3 weeks ago Permalink
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations, inspiration.
  • Stable Diffusion Prompt book : Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. This prompt book by OpenArt ai...
  • Clip retrieval - converting the text query to a CLIP embedding : The clip-retrieval package allows for easy computing of clip embeddings and building of a clip retrieval system. It can be used to quickly compute ima...
  • Chatgpt-demo: A local client based on OpenAI API. : A minimalist node.js local client to use GPT-3.5 or GPT-4.
  • Vault-ai: Give ChatGPT long-term memory using OpenAI + Pinecone Vector Database : OP Vault is a versatile platform that allows users to upload various document types through a simple react frontend, enabling the creation of a custom...

GPT-assistant: An experiment to give an autonomous GPT agent access to a browser and have it accomplish tasks
https://github.com/BuilderIO/gpt-assistant

  • autonomous
  • gpt
  • plugin

An experiment in which an autonomous GPT (Generative Pre-trained Transformer) agent is given access to a browser to perform tasks, for example adding text to a webpage or making restaurant reservations. gpt-assistant requires Node.js, an OpenAI API key, and a Postgres database.

GPT-assistant

3 weeks ago Permalink
  • DreamGPT: Leverage hallucinations from Large Language Models (LLMs) for novelty-driven explorations. : DreamGPT, the first GPT-based solution that uses hallucinations from LLMs for divergent thinking to generate new innovative ideas. Hallucinations are...
  • Auto-GPT: An attempt to make GPT-4 fully autonomous. : Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains tog...
  • Taxy AI : Taxy AI is an organization focused on transforming the role of AI in human-computer interaction. They currently offer a self-hosted browser extension ...
  • Micro-gpt: A minimal generic autonomous agent. : This project aimed to develop a minimal autonomous llm agent using Python code and shell commands, with minimal external dependencies. The agent is ex...
  • MiniChain: A tiny library for coding with large language models. : MiniChain is a library used to link prompts together in a sequence, with the ability to manipulate and visualize them using Gradio. Users can ensure t...

IF by DeepFloyd Lab
https://github.com/deep-floyd/IF

  • stable_diffusion
  • text2image
  • image2text
  • image_generation

DeepFloyd IF is a text-to-image model that uses the large language model T5-XXL-1.1 as a text encoder to generate intelligible and coherent images alongside text. The model can incorporate text into images, achieves a high degree of photorealism, and can generate images with non-standard aspect ratios. It can also modify style, patterns, and details in images without fine-tuning. DeepFloyd IF is modular, cascaded, and works in pixel space, using diffusion models that inject random noise into data and then reverse the process to generate new data samples from the noise.
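
The noise-injection half of that process can be sketched in a few lines. This is a generic DDPM-style forward step, not DeepFloyd IF's implementation: the linear beta schedule, its default values, and the function names `alpha_bar_schedule` and `add_noise` are assumptions chosen for illustration. A trained model would learn to reverse this step; that part is omitted here.

```python
import math
import random
from typing import List


def alpha_bar_schedule(steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / max(steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0: List[float], alpha_bar: float, rng: random.Random) -> List[float]:
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, with eps ~ N(0, 1)."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
pixels = [0.2, 0.5, 0.9]  # a tiny stand-in for image data in pixel space
alpha_bars = alpha_bar_schedule(1000)
slightly_noisy = add_noise(pixels, alpha_bars[0], rng)    # early step: mostly signal
mostly_noise = add_noise(pixels, alpha_bars[-1], rng)     # final step: almost pure noise
print(slightly_noisy, mostly_noise)
```

As the step index grows, the signal weight shrinks toward zero, which is why generation can start from pure noise and denoise step by step.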

deep floyd

Demo

3 weeks ago Permalink
  • AUTOMATIC1111 Stable Diffusion web UI : The Stable Diffusion WebUI offers a range of features for generating and processing images, including original txt2img and img2img modes, outpainting,...
  • Long-form text-to-images generation (GPT-3 and Stable Diffusion) : Long Stable Diffusion is a pipeline of generative models that can be used to illustrate a full story. Currently, Stable Diffusion can only take in a s...
  • Paella: Simple & Efficient Text-To-Image generation : Paella is an easy-to-use text-to-image model that can turn text into pictures. It was inspired by earlier models but has simpler code for training and...
  • Cutting Off Prompt Effect : This stable-diffusion-webui extension aims to limit the influence of certain tokens in language models by rewriting them as padding tokens. This is im...
  • DALL·E 2 : DALL-E and DALL-E 2 are deep learning models developed by OpenAI that generate digital images from natural language descriptions. DALL-E can generate ...

Inpaint anything using Segment Anything and inpainting models.
https://github.com/geekyutao/Inpaint-Anything

  • stable_diffusion
  • llm
  • segmentation

Inpaint Anything is an innovative tool that seamlessly inpaints images, videos, and 3D scenes by allowing users to remove, fill, or replace objects with just a few clicks. It leverages advanced vision models like Segment Anything Model (SAM), LaMa, and Stable Diffusion (SD) to achieve these tasks. With support for multiple aspect ratios and resolutions up to 2K, Inpaint Anything offers a user-friendly interface for various modalities, including images, videos, and 3D scenes. The tool is continuously improving with new features and functionalities, making it an accessible and powerful solution for users seeking advanced inpainting capabilities.

4 weeks ago Permalink
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations, inspiration.
  • Guide to training with optimized dreambooth stable diffusion : Dreambooth on Stable Diffusion is an implementation of Google's Dreambooth with Stable Diffusion. It enables people to fine-tune a text-to-image model...
  • InstructPix2Pix : Learning to Follow Image Editing Instructions : A method for editing images using human instructions, which involves providing an input image and a written instruction that tells the model what to d...
  • Training Stable Diffusion with Dreambooth : Dreambooth is a technique used to teach new concepts to Stable Diffusion using a specialized form of fine-tuning. It is used to place people in fantas...
  • Inpainting & Outpainting in Stable Diffusion : Inpainting is a technique used to regenerate parts of an image. To access the inpainting section of the WebUI, click on the img2img tab. Inpainting ha...

WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
https://github.com/nlpxucan/WizardLM

  • llm
  • models

WizardLM is a pre-trained language model that can follow complex instructions using Evol-Instruct, a method that uses language models instead of humans to automatically produce open-domain instructions of various difficulty levels. WizardLM is still in development and is expected to improve through larger-scale training, more training data, and more advanced large-model training methods. To fine-tune the WizardLM model, the authors used alpaca_evol_instruct_70k.json, which contains 70K instruction-following examples generated with Evol-Instruct. In human evaluation, WizardLM achieved significantly better results than the Alpaca and Vicuna-7b models on diverse user-oriented instructions, including difficult code generation, debugging, math, reasoning, complex formats, academic writing, and a wide range of disciplines. In the high-difficulty section of the human evaluation test set, WizardLM even outperforms ChatGPT, which suggests significant potential for handling complex instructions.

1 month ago Permalink
  • oobabooga/text-generation-webui: A gradio web UI for running Large Language Models : A text generation web UI built on Gradio that can run large language models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. The UI features ...
  • Auto-GPT: An attempt to make GPT-4 fully autonomous. : Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains tog...
  • Inpaint anything using Segment Anything and inpainting models. : Inpaint Anything is an innovative tool that seamlessly inpaints images, videos, and 3D scenes by allowing users to remove, fill, or replace objects wi...
  • A list of totally open alternatives to ChatGPT : List of open alternatives to ChatGPT. ChatGPT is GPT-3.5 finetuned with RLHF (Reinforcement Learning with Human Feedback) for human instruction and ch...
  • Simple UI for LLM Model Finetuning : Simple LLM Finetuner is a beginner-friendly interface designed to make it easy to fine-tune various language models using LoRA method via the PEFT lib...

HuggingChat
https://huggingface.co/chat/

  • llm
  • gradio
  • text

Hugging Face, a company and AI community that provides free open-source tools for machine learning and AI apps, has released HuggingChat, an open-source ChatGPT clone that anyone can use or download. The app is based on the Open Assistant conversational AI model by the Large-scale Artificial Intelligence Open Network (LAION), a global non-profit organization dedicated to democratizing ML research and its applications. HuggingChat was trained on the OpenAssistant Conversations Dataset (OASST1), which was collected up to April 12, 2023, using a reinforcement learning from human feedback (RLHF) methodology to produce a high-quality human-annotated dataset. The dataset is the product of a worldwide crowdsourcing effort by over 13,000 volunteers.

1 month ago Permalink
  • oobabooga/text-generation-webui: A gradio web UI for running Large Language Models : A text generation web UI built on Gradio that can run large language models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. The UI features ...
  • Vault-ai: Give ChatGPT long-term memory using OpenAI + Pinecone Vector Database : OP Vault is a versatile platform that allows users to upload various document types through a simple react frontend, enabling the creation of a custom...
  • A list of totally open alternatives to ChatGPT : List of open alternatives to ChatGPT. ChatGPT is GPT-3.5 finetuned with RLHF (Reinforcement Learning with Human Feedback) for human instruction and ch...
  • OpenFlamingo-9B Demo : OpenFlamingo is a new tool that helps computers learn how to understand pictures and words together. The OpenFlamingo project aims to develop a multi...
  • Chatgpt-demo: A local client based on OpenAI API. : A minimalist node.js local client to use GPT-3.5 or GPT-4.

Token Merging : Speed up Stable Diffusion with this one simple trick!
https://github.com/dbolya/tomesd#installation

  • stable_diffusion
  • prompt

Token Merging (ToMe) speeds up transformers by merging redundant tokens, which reduces the work the transformer has to do with minimal loss of quality. For Stable Diffusion, the technique is applied to the underlying transformer blocks in a way that keeps quality loss small while preserving the speed-up and memory benefits. It requires no training, works with any Stable Diffusion model, and can reduce the workload by up to 60%. ToMe for SD is not another efficient reimplementation of transformer modules; it actually reduces the total work required to generate an image. The results show that ToMe for SD produces images similar to the originals while running faster and using less memory.
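
The idea of merging redundant tokens can be illustrated with a toy sketch in plain Python. This is not the tomesd implementation: the function names (`cosine`, `merge_tokens`), the alternating split into two sets, and the simple averaging rule are assumptions chosen for illustration, loosely following the idea of matching each token to its most similar partner and merging the closest pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def merge_tokens(tokens, ratio):
    """Merge up to `ratio` of the tokens into their most similar partners.

    Hypothetical helper: tokens at even positions are merge candidates
    (src), tokens at odd positions absorb them (dst). The output order is
    unmerged src tokens first, then the dst tokens.
    """
    src = tokens[0::2]
    dst = tokens[1::2]
    if not dst:
        return [list(t) for t in tokens]
    r = min(int(len(tokens) * ratio), len(src))

    # For every src token, find its most similar dst token.
    matches = []
    for i, s in enumerate(src):
        sims = [cosine(s, d) for d in dst]
        j = max(range(len(dst)), key=sims.__getitem__)
        matches.append((sims[j], i, j))

    # Merge only the r most similar (most redundant) pairs.
    matches.sort(reverse=True)
    merged_away = {i: j for _, i, j in matches[:r]}

    sums = [list(d) for d in dst]
    counts = [1] * len(dst)
    for i, j in merged_away.items():
        sums[j] = [a + b for a, b in zip(sums[j], src[i])]
        counts[j] += 1

    kept = [list(s) for i, s in enumerate(src) if i not in merged_away]
    merged = [[x / c for x in s] for s, c in zip(sums, counts)]
    return kept + merged


tokens = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.0, 0.9]]
reduced = merge_tokens(tokens, ratio=0.5)
print(len(tokens), "->", len(reduced))
```

Later transformer layers then process fewer tokens, which is where the reduction in workload comes from. In practice the tomesd repository linked above is the place to look for the actual patching code and its options.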


1 month ago Permalink
  • Stable Diffusion Prompt book : Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. This prompt book by OpenArt ai...
  • The Stability Photoshop Plugin : The Stability Photoshop plugin enables users to generate and edit images using both Stable Diffusion and DALL·E 2 directly within Photoshop. The plugi...
  • Stable Diffusion in Code (AI Image Generation) - Computerphile :
  • openOutpaint: Canvas outpainting for stable diffusion webUI : OpenOutpaint is an intuitive and convenient outpainting tool that provides queueable, cancelable dreams, arbitrary dream reticle size, and an effectiv...
  • ComfyUI: A stable diffusion GUI with a graph/nodes interface : ComfyUI is a powerful and modular stable diffusion GUI and backend that enables users to design and execute advanced stable diffusion pipelines using ...

