Curated knowledge about art and AI
83 results tagged stable_diffusion

PIXART-α
https://pixart-alpha.github.io/

  • stable_diffusion
  • text2image
  • training

PIXART-α is a low-cost, efficient text-to-image (T2I) model that produces high-quality images. Built on a Transformer-based diffusion model, PIXART-α matches state-of-the-art image generation quality and supports high-resolution synthesis. The authors propose three core designs: a decomposed training strategy, an efficient T2I Transformer with cross-attention modules, and highly informative training data produced by an auto-labeling pipeline. As a result, PIXART-α significantly reduces training time, saving costs and cutting CO2 emissions. The model demonstrates excellent image quality, artistry, and semantic control, making it a valuable tool for the AIGC community and for startups seeking to build their own affordable, high-quality generative models.
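For orientation, here is a minimal text-to-image sketch using the Hugging Face diffusers integration. The PixArtAlphaPipeline class ships in recent diffusers releases, but the checkpoint id below is an assumption to verify against the project page.

```python
# Minimal text-to-image sketch with PIXART-alpha via Hugging Face diffusers.
# Assumes diffusers ships PixArtAlphaPipeline and that the checkpoint id below
# is correct -- verify both against the project page before relying on this.
import torch
from diffusers import PixArtAlphaPipeline

pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",  # assumed HF repo id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="an oil painting of a lighthouse at dawn",
    num_inference_steps=20,
).images[0]
image.save("pixart_sample.png")
```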

1 month ago

Related:
  • Paella: Simple & Efficient Text-To-Image generation : Paella is an easy-to-use text-to-image model that can turn text into pictures. It was inspired by earlier models but has simpler code for training and...
  • Erasing Concepts from Diffusion Models : Making sure that diffusion model-generated images are safe from undesirable content and copyrighted material is a serious concern. Previous methods fo...
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • Stable Diffusion Wildcards Collection : A collection of wildcards to use with Stable Diffusion
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...

InstaFlow
https://github.com/gnobitab/InstaFlow

  • stable_diffusion

InstaFlow is an ultra-fast, one-step image generator that overcomes the computational constraints of diffusion models in text-to-image generation. It achieves image quality similar to Stable Diffusion while requiring significantly fewer computational resources, thanks to a Rectified Flow technique that trains probability flows with straight trajectories, so inference takes just a single step. InstaFlow's advantages include ultra-fast inference (up to 90% faster than Stable Diffusion), high-quality image generation, and a simple, efficient training process. It produces results on par with top text-to-image GANs such as StyleGAN-T, and, starting from a pre-trained Stable Diffusion model, InstaFlow-0.9B requires only 199 A100 GPU days to train.
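To make the one-step claim concrete, here is a conceptual PyTorch sketch, not InstaFlow's actual API: VelocityNet is a hypothetical stand-in for the distilled model, illustrating why straight-trajectory flows collapse sampling to a single Euler step.

```python
# Conceptual sketch of why a rectified flow needs only one step: if the learned
# velocity field v(x, t) has (nearly) straight trajectories, the probability-flow
# ODE dx/dt = v(x, t) integrated from t=0 to t=1 collapses to a single Euler step.
# `VelocityNet` is a stand-in for the trained model, not InstaFlow's real API.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):  # placeholder for the distilled one-step model
    def __init__(self, dim: int = 4 * 64 * 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 512), nn.SiLU(), nn.Linear(512, dim))

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, t.expand(x.shape[0], 1)], dim=-1))

@torch.no_grad()
def one_step_sample(velocity_net: nn.Module, shape=(1, 4 * 64 * 64)) -> torch.Tensor:
    x0 = torch.randn(shape)            # latent noise at t = 0
    t0 = torch.zeros(1)
    return x0 + velocity_net(x0, t0)   # single Euler step to t = 1

sample = one_step_sample(VelocityNet())
print(sample.shape)
```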

1 month ago

Related:
  • Fondant : Large-scale data processing made easy and reusable : Fondant is an open-source framework designed to simplify and accelerate large-scale data processing. It allows for the reuse of containerized componen...
  • Creating a Consistent Character as a Textual Inversion Embedding with Stable Diffusion : The tutorial provides a comprehensive guide on creating consistent characters using Stable Diffusion (SD) and a Textual Inversion embedding. It outlin...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • SD Hypertextbook : A hypertextbook (guide) on setting up and using Stable Diffusion, a deep learning model. The hypertextbook is organized into chapters, starting with ...
  • Restart Sampling for Improving Generative Processes : A new algorithm called Restart has been developed to improve the speed and quality of generative processes. These processes use complex differential e...

NeoPrompt Pro - Smart prompt generation tool for AI arts
https://neoprompt.pro/

  • stable_diffusion
  • prompt
  • tool
  • help

NeoPrompt is a tool designed to make the creation of AI art more accessible and less time-consuming by providing a comprehensive framework for generating unique prompts. Catering to both beginners and expert artists, it features a repository of imaginative words and phrases and offers explanatory insights into its prompt generation methods. It's compatible with popular AI art platforms like Midjourney and Stable Diffusion, and the team is continuously working on enhancing its compatibility and accuracy.

1 month ago

Related:
  • Stable Diffusion Prompt book : Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. This prompt book by OpenArt ai...
  • Creating a Consistent Character as a Textual Inversion Embedding with Stable Diffusion : The tutorial provides a comprehensive guide on creating consistent characters using Stable Diffusion (SD) and a Textual Inversion embedding. It outlin...
  • Stable Diffusion in Code (AI Image Generation) - Computerphile :
  • Absolute beginner's guide to Stable Diffusion AI image - Stable Diffusion Art : The article provides a beginner's guide to using Stable Diffusion, an AI model that generates images from text input. It includes an overview of Stabl...
  • Clip retrieval - converting the text query to a CLIP embedding : The clip-retrieval package allows for easy computing of clip embeddings and building of a clip retrieval system. It can be used to quickly compute ima...

Fondant : Large-scale data processing made easy and reusable
https://fondant.ai/en/stable/

  • stable_diffusion
  • training
  • dataset
  • database
  • controlnet

Fondant is an open-source framework designed to simplify and accelerate large-scale data processing. It allows for the reuse of containerized components across pipelines and execution environments, enabling sharing within the community. Fondant provides plug-and-play pipelines for various tasks such as AI image generation model fine-tuning, large language model fine-tuning, and code generation model fine-tuning. It also offers a library of reusable components for tasks such as data extraction, filtering, removal of unwanted content, data transformation, data tuning, and data enrichment. The framework supports multimodal capabilities, standardized Python/Pandas-based custom component creation, and production-ready scalable deployment, and it integrates with multiple cloud platforms. Fondant's main goal is to give users control over their data and to simplify the building of pipelines for large-scale data processing. The documentation covers how to get started with Fondant and showcases example pipelines for tasks such as filtering a Creative Commons image dataset and fine-tuning models like ControlNet and Stable Diffusion.
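As a flavor of the Python/Pandas component model described above, a hedged sketch of a custom filtering component; the PandasTransformComponent base class and import path are recalled from Fondant's documentation and may not match the current API exactly.

```python
# Hedged sketch of a custom Fondant component in the Python/Pandas style the
# summary describes. The base class and import path are assumptions -- check
# the Fondant docs for the current component interface.
import pandas as pd
from fondant.component import PandasTransformComponent  # assumed import path

class FilterShortCaptions(PandasTransformComponent):
    """Drops rows whose caption is shorter than a minimum length."""

    def __init__(self, *, min_length: int = 10):
        self.min_length = min_length

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        mask = dataframe["caption"].str.len() >= self.min_length
        return dataframe[mask]
```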

https://github.com/ml6team/fondant

2 months ago

Related:
  • All 1,833 artists that are represented in the Stable Diffusion 1.4 Model : This page displays a list of 1,833 artists and their associated image tags, ordered by frequency of representation in the Stable Diffusion 1.4 Model. ...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • SD-CN-Animation: Automate video stylization task using StableDiffusion and ControlNet : The SD-CN-Animation project offers automated video stylization and text-to-video generation using StableDiffusion and ControlNet. It provides the abil...
  • ControlNetMediaPipeFace : This dataset is designed to train a ControlNet with human facial expressions. It includes keypoints for pupils to allow gaze direction. Training has b...

IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
https://github.com/tencent-ailab/IP-Adapter/

  • image2image
  • remix
  • stable_diffusion
  • comfyui
  • lora
  • training

IP-Adapter is an effective and lightweight adapter that adds image prompt capability to pre-trained text-to-image diffusion models. With only 22M parameters, an IP-Adapter can achieve performance comparable to, or even better than, a fine-tuned image prompt model. IP-Adapter generalizes not only to other custom models fine-tuned from the same base model, but also to controllable generation using existing controllable tools. Moreover, the image prompt works well together with the text prompt to accomplish multimodal image generation.
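For illustration, a minimal sketch of image prompting through the diffusers IP-Adapter integration (distinct from the ComfyUI plugin linked below); the weight filename follows the published h94/IP-Adapter repo, and the local reference image path is a placeholder.

```python
# Sketch of image-prompting a pretrained SD 1.5 pipeline with IP-Adapter via
# the diffusers integration. Weight name and local file path are assumptions.
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)   # how strongly the image prompt steers generation

reference = load_image("style_reference.png")  # hypothetical local file
image = pipe(
    prompt="a cat in the same style",          # text and image prompts combine
    ip_adapter_image=reference,
).images[0]
```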


https://github.com/cubiq/ComfyUI_IPAdapter_plus

2 months ago

Related:
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • StyleDrop: Text-To-Image Generation in Any Style : StyleDrop is a technology that generates images in any desired style using text-to-image transformer, Muse. The technology captures nuances of user-pr...
  • How to extract small LoRA file from custom dreambooth models . Reduce your model sizes! : StableDiffusion : To extract the LoRA model from the base model using Kohya_ss, users can install the repository from GitHub and navigate to the Utilities tab, then sel...
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing : BLIP-Diffusion is a new model for generating and editing images based on text prompts and subject images. Unlike previous models, it uses a pre-traine...

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
https://github.com/guoyww/animatediff/

  • animation
  • stable_diffusion
  • text2image

With the advance of text-to-image models (e.g., Stable Diffusion) and corresponding personalization techniques such as DreamBooth and LoRA, everyone can manifest their imagination into high-quality images at an affordable cost. Subsequently, there is a great demand for image animation techniques to further combine generated static images with motion dynamics. In this report, we propose a practical framework to animate most of the existing personalized text-to-image models once and for all, saving the effort of model-specific tuning. The core of the proposed framework is to insert a newly initialized motion modeling module into the frozen text-to-image model and train it on video clips to distill reasonable motion priors. Once trained, by simply injecting this motion modeling module, all personalized versions derived from the same base T2I readily become text-driven models that produce diverse and personalized animated images. We conduct our evaluation on several public representative personalized text-to-image models across anime pictures and realistic photographs, and demonstrate that our proposed framework helps these models generate temporally smooth animation clips while preserving the domain and diversity of their outputs.
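A minimal sketch of this inject-a-motion-module workflow via the diffusers AnimateDiff integration; the motion-adapter repo id is an assumption to verify, and any SD 1.5-derived personalized checkpoint can replace the base model.

```python
# Sketch of the "inject a motion module into a frozen T2I model" idea using the
# diffusers AnimateDiff integration. The adapter repo id is assumed; any SD 1.5
# based personalized checkpoint can stand in for the base model.
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # frozen base T2I (or a DreamBooth/LoRA variant)
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")

frames = pipe("a corgi running on the beach", num_frames=16).frames[0]
export_to_gif(frames, "corgi.gif")
```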

2 months ago

Related:
  • AUTOMATIC1111 Stable Diffusion web UI : The Stable Diffusion WebUI offers a range of features for generating and processing images, including original txt2img and img2img modes, outpainting,...
  • Paella: Simple & Efficient Text-To-Image generation : Paella is an easy-to-use text-to-image model that can turn text into pictures. It was inspired by earlier models but has simpler code for training and...
  • Inpainting & Outpainting in Stable Diffusion : Inpainting is a technique used to regenerate parts of an image. To access the inpainting section of the WebUI, click on the img2img tab. Inpainting ha...
  • Multi-Concept Customization of Text-to-Image Diffusion : The paper proposes Custom Diffusion, an efficient method for augmenting existing text-to-image models to quickly acquire new concepts given a few exam...
  • Implementation of Paint-with-words with Stable Diffusion : The author of the article discusses implementing the "painting with word" method proposed by researchers from NVIDIA, called eDiffi, with Stable Diffu...

Key-Locked Rank One Editing for Text-to-Image Personalization
https://research.nvidia.com/labs/par/Perfusion/

  • image2image
  • training
  • stable_diffusion

Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintaining their identity. It introduces a novel mechanism called "Key-Locking" to maintain high visual fidelity and allow creative control, combine personalized concepts, and keep a small model size. Perfusion achieves this through dynamic rank-1 updates to the underlying text-to-image model and locks new concepts' cross-attention Keys to their superordinate category. It also uses a gated rank-1 approach to control the influence of learned concepts during inference and combine multiple concepts, enabling a trade-off between visual fidelity and textual alignment. With just a 100KB trained model, Perfusion covers different operating points across the Pareto front without additional training. It outperforms strong baselines in both qualitative and quantitative terms, enabling personalization even in one-shot settings and displaying personalized object interactions in novel ways. The method also allows for efficient control of visual-textual alignment and demonstrates different variations of key-locking. Additionally, Perfusion concepts trained with a vanilla diffusion model can generalize to fine-tuned variants.
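To ground the "gated rank-1 update" idea, a conceptual PyTorch sketch with illustrative shapes; this is not the repo's code, only the shape of the mechanism the summary describes.

```python
# Conceptual sketch of a gated rank-1 edit of a cross-attention weight matrix,
# the core mechanism the Perfusion summary describes. Shapes and the gate are
# illustrative; see the paper/repo for the actual key-locked formulation.
import torch

def gated_rank_one_update(W: torch.Tensor, u: torch.Tensor, v: torch.Tensor,
                          gate: float) -> torch.Tensor:
    """Return W' = W + gate * u v^T (a rank-1 edit of the frozen weights).

    u encodes the new concept's output direction, v the input direction it
    should respond to; `gate` scales the concept's influence at inference,
    enabling the fidelity/alignment trade-off and multi-concept combination.
    """
    return W + gate * torch.outer(u, v)

d_out, d_in = 320, 768                        # typical SD cross-attention shapes
W = torch.randn(d_out, d_in)                  # frozen pretrained projection
u, v = torch.randn(d_out), torch.randn(d_in)  # learned per-concept directions
W_edited = gated_rank_one_update(W, u, v, gate=0.7)
```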


PyTorch implementation: https://github.com/lucidrains/perfusion-pytorch

2 months ago

Related:
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • Composer: Creative and Controllable Image Synthesis with Composable Conditions : Composer introduces a generative model that addresses the limited controllability of existing large-scale models. By decomposing images into represent...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...
  • DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data : DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position,...

E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models
https://github.com/mkshing/e4t-diffusion

  • stable_diffusion
  • image2image
  • training

A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training times, high storage requirements, and loss of identity. To address these issues, the proposed method utilizes an encoder-based domain-tuning approach. This approach involves underfitting on a large set of concepts from a given domain to improve generalization and enable quick addition of novel concepts from the same domain. The method consists of two components: an encoder that maps a target concept image to a word embedding representing the concept, and regularized weight-offsets for the text-to-image model that effectively incorporate additional concepts. By leveraging these components, the model can be personalized with a single image and as few as 5 training steps, significantly reducing personalization time while maintaining quality.
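A conceptual sketch of the encoder half of this approach, with entirely illustrative shapes and names; see the repo for the real implementation.

```python
# Conceptual sketch of encoder-based personalization as summarized above: an
# image encoder predicts a word embedding for the new concept, which is then
# bound to a placeholder token in the text encoder. Shapes are illustrative.
import torch
import torch.nn as nn

class ConceptEncoder(nn.Module):
    """Maps concept image features to a pseudo-word embedding (e.g. 768-d for SD)."""
    def __init__(self, image_dim: int = 1024, embed_dim: int = 768):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(image_dim, 1024), nn.GELU(),
                                  nn.Linear(1024, embed_dim))

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        return self.proj(image_features)

encoder = ConceptEncoder()
image_features = torch.randn(1, 1024)        # stand-in for a CLIP image feature
concept_embedding = encoder(image_features)  # embedding for the placeholder token
# The embedding is then used for a token like "<s>" in prompts such as
# "a photo of <s> on the beach"; a few tuning steps refine it plus weight offsets.
```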

2 months ago

Related:
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...
  • Composer: Creative and Controllable Image Synthesis with Composable Conditions : Composer introduces a generative model that addresses the limited controllability of existing large-scale models. By decomposing images into represent...
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • StyleDrop: Text-To-Image Generation in Any Style : StyleDrop is a technology that generates images in any desired style using text-to-image transformer, Muse. The technology captures nuances of user-pr...

GLIGEN: Open-Set Grounded Text-to-Image Generation
https://gligen.github.io/

  • stable_diffusion
  • image2image
  • training

GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves impressive performance in generating images from text by including caption and bounding box condition inputs. GLIGEN takes advantage of a pre-trained model by freezing its weights and adding trainable layers to incorporate grounding information. This enables the model to have a better understanding of spatial relationships and concepts. GLIGEN outperforms other models in generating images based on specific instructions and can handle counterfactual scenarios. It can also use reference images for more detailed and stylistic output. Additionally, GLIGEN has the capability to inpaint images based on provided bounding boxes. Overall, GLIGEN improves the controllability and versatility of text-to-image generation models.
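For a concrete picture of box-grounded generation, a sketch using the diffusers GLIGEN pipeline; the pipeline class exists in diffusers, but the checkpoint id and exact argument names should be verified against its documentation.

```python
# Sketch of grounded generation with the diffusers GLIGEN pipeline. The
# checkpoint id is assumed; boxes are normalized [x0, y0, x1, y1] coordinates.
import torch
from diffusers import StableDiffusionGLIGENPipeline

pipe = StableDiffusionGLIGENPipeline.from_pretrained(
    "masterful/gligen-1-4-generation-text-box",  # assumed HF repo id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a dog and a cat sitting on a sofa",
    gligen_phrases=["a dog", "a cat"],            # what to ground
    gligen_boxes=[[0.1, 0.4, 0.45, 0.9],          # where each phrase goes
                  [0.55, 0.4, 0.9, 0.9]],
    gligen_scheduled_sampling_beta=1.0,           # grounding strength schedule
).images[0]
```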

3 months ago

Related:
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • Composer: Creative and Controllable Image Synthesis with Composable Conditions : Composer introduces a generative model that addresses the limited controllability of existing large-scale models. By decomposing images into represent...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • DreamArtist : stable diffusion webui with contrastive prompt tuning : The article discusses the challenges faced by large-scale text-to-image generation models in synthesizing high-quality images with novel concepts. Cur...

Composer: Creative and Controllable Image Synthesis with Composable Conditions
https://damo-vilab.github.io/composer-page/

  • stable_diffusion
  • image2image
  • training

Composer introduces a generative model that addresses the limited controllability of existing large-scale models. By decomposing images into representative factors and utilizing a diffusion model to recompose them with these factors as conditions, Composer allows for flexible control over the output image while maintaining quality and creativity. It supports various levels of conditions, such as text description, depth map, sketch, and color histogram. Furthermore, Composer serves as a general framework for classical generative tasks without requiring retraining. The composition results demonstrate the ability to combine elements like text and depth, masked image and text, sketch, depth, and embedding, and more. Additionally, the model facilitates image variations, interpolations, reconfigurations, and region-specific editing, as well as tasks like image translation, style transfer, pose transfer, and virtual try-on.



3 months ago

Related:
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data : DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position,...

Restart Sampling for Improving Generative Processes
https://github.com/Newbeeer/diffusion_restart_sampling

  • stable_diffusion
  • addon

A new algorithm called Restart has been developed to improve the speed and quality of generative processes. These processes solve complex differential equations, which usually involves a trade-off between speed and accuracy: existing ODE samplers run fast but hit a quality ceiling, while SDE samplers offer better quality at a slower rate. The Restart algorithm outperforms both by better managing accumulated sampling errors. In tests, it produced faster, higher-quality results, surpassing SDE results across benchmarks and maintaining a better quality/speed balance in large-scale model testing.
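A conceptual sketch of the restart loop, assuming an EDM-style denoiser and sigma parameterization; ode_step and the schedule are illustrative stand-ins, not the repo's implementation.

```python
# Conceptual sketch of the Restart idea: run a deterministic (ODE) sampler down
# to t_min, then re-inject noise back up to t_max and repeat, contracting the
# accumulated error. `ode_step` and the schedule are illustrative stand-ins.
import torch

def ode_step(x, t_cur, t_next, denoiser):
    """One Euler step of the probability-flow ODE (EDM parameterization)."""
    d = (x - denoiser(x, t_cur)) / t_cur
    return x + d * (t_next - t_cur)

def restart_interval(x, t_min, t_max, steps, denoiser, n_restarts=2):
    ts = torch.linspace(t_max, t_min, steps)
    for i in range(n_restarts + 1):
        for t_cur, t_next in zip(ts[:-1], ts[1:]):   # ODE pass down to t_min
            x = ode_step(x, t_cur, t_next, denoiser)
        if i < n_restarts:                           # jump back up, except after the last pass
            x = x + torch.randn_like(x) * (t_max**2 - t_min**2) ** 0.5
    return x

denoiser = lambda x, t: x * 0.0                      # dummy D(x, t) for the demo
x = torch.randn(1, 3, 8, 8) * 10.0
x = restart_interval(x, t_min=0.1, t_max=10.0, steps=20, denoiser=denoiser)
```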

5 months ago

Related:
  • DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data : DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position,...
  • UnpromptedControl: Remove unwanted objects and restore images without prompts, powered by ControlNet. : UnpromptedControl is a tool used for guiding StableDiffusion models in image restoration and object removal tasks. By leveraging a simple hack, it all...
  • Composer: Creative and Controllable Image Synthesis with Composable Conditions : Composer introduces a generative model that addresses the limited controllability of existing large-scale models. By decomposing images into represent...
  • Stable-diffusion-webui-chatgpt-utilities: Enables use of ChatGPT directly from the UI : This is an extension for stable-diffusion-webui that enables you to use ChatGPT for prompt variations and inspiration.
  • Civitai | Stable Diffusion models, embeddings, hypernetworks and more : The text introduces Civitai, a platform that allows users to share and discover resources for creating AI art. Civitai provides users with custom mode...

AI or Not : AI Image Classification & Authenticity Detection
https://www.aiornot.com/

  • image
  • detection
  • ai
  • midjourney
  • stable_diffusion
  • dalle

AI company Optic has developed a web tool called AI or Not, which aims to combat misinformation spread through AI-generated images. The tool scans images and quickly determines whether they were generated by artificial intelligence or by humans. Optic claims that its algorithms provide highly accurate results with a precision rate of 95%. However, users may have concerns about privacy when uploading images to the tool. Optic states that uploaded images and URLs are not stored on its servers longer than necessary and that they adhere to data protection regulations. By analyzing images and detecting signs of AI generation, the company aims to improve its algorithms and machine learning techniques.

5 months ago

Related:
  • Word-As-Image for Semantic Typography : The article presents a method for creating word-as-image illustrations automatically, which involves creating a visualization of the meaning of a word...
  • Stable Diffusion v1.4 - Steps, CFG scale, and Seed resize exploration : This grid presents an exploration of Stable Diffusion v1.4 using the Euler ancestral sampler for the same prompt and seed, but different steps and CFG...
  • Declassifier - Humans of AI : Humans of AI is an online exhibition that showcases three works based on the COCO image dataset. The exhibition aims to credit and applaud the photogr...
  • 1111101000 Robots - Ben Barry / A book of 1000 paintings and illustrations of robots created by artificial intelligence. : A book of 1000 paintings and illustrations of robots created by artificial intelligence. The author generated all of the images in this book by writin...
  • This X Does Not Exist : A collection of GAN demos. StyleGAN is a new generator architecture for generative adversarial networks that enables unsupervised separation of high-level attrib...

An Edit Friendly DDPM Noise Space: Inversion and Manipulations
https://arxiv.org/abs/2304.06140

  • stable_diffusion
  • image2image

A new method for inverting Denoising Diffusion Probabilistic Models (DDPMs) by using noise maps to encode image structure and facilitate image editing. The authors demonstrate that this method can be used for text-based image editing tasks and can generate diverse results for any given image and text. The article compares this method to other image editing techniques such as P2P, DDIM inversion, and Plug-and-Play (PnP) and shows how it performs better in terms of fidelity, adherence to target prompts, and diversity among generated outputs.
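A conceptual sketch of the inversion idea under stated assumptions: mu_hat and the schedules stand in for the scheduler's posterior mean and noise levels, and the shapes are toy-sized.

```python
# Conceptual sketch of the edit-friendly inversion idea: build noisy versions
# x_t of the input image independently, then solve the DDPM sampling equation
# for the noise map z_t that makes each transition x_t -> x_{t-1} exact.
import torch

def invert_noise_maps(x0, mu_hat, sigmas, alphas_bar):
    """Return per-step noise maps z_t encoding x0's structure (illustrative)."""
    T = len(sigmas)
    xs = [x0]
    for t in range(1, T + 1):   # independent (not jointly sampled) q(x_t | x_0)
        eps = torch.randn_like(x0)
        xs.append(alphas_bar[t - 1].sqrt() * x0 + (1 - alphas_bar[t - 1]).sqrt() * eps)
    zs = []
    for t in range(T, 0, -1):   # solve x_{t-1} = mu_hat(x_t, t) + sigma_t * z_t
        zs.append((xs[t - 1] - mu_hat(xs[t], t)) / sigmas[t - 1])
    return zs  # fixing these z_t and re-running sampling reproduces/edits x0

# Demo with dummy stand-ins for the scheduler quantities:
T = 4
alphas_bar = torch.linspace(0.9, 0.1, T)
sigmas = torch.linspace(0.1, 0.8, T)
zs = invert_noise_maps(torch.randn(1, 3, 8, 8), lambda x, t: 0.9 * x, sigmas, alphas_bar)
```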

Demo

5 months ago

Related:
  • DreamArtist : stable diffusion webui with contrastive prompt tuning : The article discusses the challenges faced by large-scale text-to-image generation models in synthesizing high-quality images with novel concepts. Cur...
  • DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data : DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position,...
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...

Stable Diffusion Sketch, an Android app used with AUTOMATIC1111's Stable Diffusion Web UI
https://github.com/jordenyt/stable_diffusion_sketch

  • stable_diffusion
  • android
  • drawing
  • controlnet
  • image2image

Stable Diffusion Sketch is an Android app that allows users to create colorful sketches and enhance them using various modes of Stable Diffusion. Users can create new paintings from a blank canvas, a camera capture, Stable Diffusion txt2img, or images shared from other apps. The app also offers preset modes, custom modes, painting tools, undo/redo functionality, and options to adjust prompt prefix, postfix, and negative prompt values. Three canvas aspect ratios are available: landscape, portrait, and square. Users can also enlarge images with the upscaler feature or delete them from the main screen. Grouping related sketches is also possible, and the app supports multiple ControlNets.

5 months ago

Related:
  • ControlNet extension for AUTOMATIC1111's Stable Diffusion web UI : ControlNet, a neural network structure that adds extra conditions to diffusion models to control them. ControlNet copies the weights of neural network...
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • ControlNet v1.1: A complete guide - Stable Diffusion Art : A comprehensive guide to ControlNet v1.1, a Stable Diffusion model that allows users to control image compositions and human poses based on reference ...
  • DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data : DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position,...
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...

DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data
https://dreamsim-nights.github.io/

  • stable_diffusion
  • image2image
  • image_generation

DreamSim is a new algorithm designed to compare images holistically by assessing images on their mid-level attributes such as object layout, position, and semantic context. Unlike standard image comparison algorithms that only compare images based on colors and textures, DreamSim evaluates an image on its overall appearance and composition, giving it a more well-rounded comparison. The algorithm uses synthetic data to train its network but can still perform well on real images. A key point is that DreamSim showed better results than existing comparison algorithms in identifying objects and overall semantic context within an image. In summary, DreamSim is a valuable image comparison tool that captures features beyond low-level colors and textures while providing a stronger foundation for visual analysis.
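A usage sketch based on the dreamsim package's documented quickstart; the entry point and the file paths here are assumptions to verify against the project page.

```python
# Sketch of scoring perceptual similarity with the dreamsim package, assuming
# its `dreamsim(pretrained=True)` entry point. Lower distance = more similar
# under the learned, mid-level-attribute-aware metric.
import torch
from PIL import Image
from dreamsim import dreamsim

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = dreamsim(pretrained=True, device=device)

img_a = preprocess(Image.open("a.png")).to(device)   # placeholder file paths
img_b = preprocess(Image.open("b.png")).to(device)
distance = model(img_a, img_b)   # layout/semantics-aware perceptual distance
print(float(distance))
```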

5 months ago

Related:
  • Key-Locked Rank One Editing for Text-to-Image Personalization : Perfusion is a text-to-image personalization method that creatively portrays personalized objects with significant changes in appearance while maintai...
  • DreamArtist : stable diffusion webui with contrastive prompt tuning : The article discusses the challenges faced by large-scale text-to-image generation models in synthesizing high-quality images with novel concepts. Cur...
  • E4t-diffusion : Implementation of Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models : A novel approach to personalizing text-to-image models using diffusers. The current methods for personalization face challenges like long training tim...
  • Mist - An image preprocessing tool to protect artworks from AI-for-Art based mimicry : Mist is a powerful image preprocessing tool developed to protect images from being mimicked by AI-for-Art applications. It adds watermarks to images, ...
  • Paella: Simple & Efficient Text-To-Image generation : Paella is an easy-to-use text-to-image model that can turn text into pictures. It was inspired by earlier models but has simpler code for training and...

StableStudio: Community interface for generative AI
https://github.com/Stability-AI/StableStudio

  • stable_diffusion
  • ui

Stability AI has released StableStudio, an open-source version of its DreamStudio text-to-image consumer application. The company intends to work with the broader community to create a world-class user interface for generative AI that users fully control. DreamStudio was first conceived as an animation studio and shifted its focus to image generation with the arrival of Stable Diffusion in the summer of 2022. StableStudio enables local-first development through WebGPU and a desktop installation of its Stable Diffusion tooling. It is also compatible with ControlNet tools and with local inference through the AUTOMATIC1111 stable-diffusion-webui tool.

5 months ago

Related:
  • ComfyUI: A stable diffusion GUI with a graph/nodes interface : ComfyUI is a powerful and modular stable diffusion GUI and backend that enables users to design and execute advanced stable diffusion pipelines using ...
  • AUTOMATIC1111 Stable Diffusion web UI : The Stable Diffusion WebUI offers a range of features for generating and processing images, including original txt2img and img2img modes, outpainting,...
  • Ainodes engine : aiNodes is a Python-based AI image/motion picture generator node engine that facilitates creativity in the creation of images and videos. The engine i...
  • Dreambooth-training-guide : Guide on using Dreambooth for dataset collection and basic Dreambooth settings. The author suggests fine-tuning Stable Diffusion 2.0 and recommends us...
  • Cutting Off Prompt Effect : This stable-diffusion-webui extension aims to limit the influence of certain tokens in language models by rewriting them as padding tokens. This is im...

Ainodes engine
https://github.com/XmYx/ainodes-engine

  • visual_programming
  • stable_diffusion
  • ui
  • deforum

aiNodes is a Python-based AI image/motion picture generator node engine that facilitates creativity in the creation of images and videos. The engine is fully modular and can download node packs at runtime. It also features RIFE and FILM interpolation integration, colored background drop, and node creation with IDE annotations. Installation requires Python 3.10, Git, and an NVIDIA GPU with CUDA and drivers installed. The aiNodes engine is an open-source desktop AI-based image/motion generator that supports features such as Deforum, Stable Diffusion, upscalers, Kandinsky, ControlNet, LoRAs, TI embeddings, hypernetworks, background separation, human matting/masking, and compositing, among others.

5 months ago

Related:
  • AUTOMATIC1111 Stable Diffusion web UI : The Stable Diffusion WebUI offers a range of features for generating and processing images, including original txt2img and img2img modes, outpainting,...
  • ComfyUI: A stable diffusion GUI with a graph/nodes interface : ComfyUI is a powerful and modular stable diffusion GUI and backend that enables users to design and execute advanced stable diffusion pipelines using ...
  • StableStudio: Community interface for generative AI : Stability AI has released an open-source version of its DreamStudio text-to-image consumer application called StableStudio. The company intends to wor...
  • Stable Diffusion Prompt book : Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. This prompt book by OpenArt ai...
  • ControlNet v1.1: A complete guide - Stable Diffusion Art : A comprehensive guide to ControlNet v1.1, a Stable Diffusion model that allows users to control image compositions and human poses based on reference ...

Redream: Realtime Diffusion, using Automatic1111 Stable Diffusion API
https://github.com/Fictiverse/Redream

  • stable_diffusion
  • tool
  • api
  • video

(Kind of) Realtime Stable Diffusion from a screen area using Automatic1111's API.
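Since Redream drives AUTOMATIC1111 over HTTP, here is a sketch of the underlying API call: with the web UI launched with --api, /sdapi/v1/img2img accepts a base64 input image (such as a screen capture) and returns base64 results. Host, port, and file names are placeholder defaults.

```python
# Sketch of the AUTOMATIC1111 HTTP API that Redream builds on. Requires the
# web UI to be running locally with the --api flag; file names are placeholders.
import base64
import requests

with open("screen_area.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode()

payload = {
    "init_images": [init_image],
    "prompt": "watercolor painting",
    "denoising_strength": 0.45,  # low values keep the captured frame recognizable
    "steps": 20,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=120)
r.raise_for_status()
result_b64 = r.json()["images"][0]
with open("stylized.png", "wb") as f:
    f.write(base64.b64decode(result_b64))
```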

5 months ago

Related:
  • Defuser : the Stable Diffusion Plugin for Photoshop and Krita : The Stable Diffusion WebUI Plugin is a plugin for Photoshop and Krita that interfaces with AUTOMATIC1111's Stable Diffusion WebUI without the need to ...
  • Deforum Stable Diffusion : Deforum is a community of AI image synthesis developers, enthusiasts, and artists. They have created a notebook using Stable Diffusion and continue to...
  • EveryDream Tools : The EveryDream Tools repository contains data engineering tools for Stable Diffusion, an image project, and other image projects. The tools can be use...
  • Safe and Stable Ckpt2Safetensors Conversion Tool GUI : Safe & Stable is a user-friendly tool designed to convert stable diffusion checkpoint files (.ckpt) to the safer and more secure .safetensors format f...
  • Diffusion interactive explainer : The Diffusion Explainer tool is an interactive webpage that allows users to generate an image from a text prompt. Users have control over various hype...

ProFusion: Code for Enhancing Detail Preservation for Customized Text-to-Image Generation
https://github.com/drboog/ProFusion

  • training
  • stable_diffusion
  • lora
  • dreambooth
  • styledrop
  • profusion

ProFusion, proposed in the accompanying paper, is a new framework for customized text-to-image generation that preserves fine-grained image details without using regularization. ProFusion includes PromptNet, an encoder network, and Fusion Sampling, a method that generates customized images based on a single user-provided image and text requirements. The paper explains how ProFusion works and provides experiments demonstrating its superior performance compared to existing approaches, while still meeting additional user-defined requirements.

5 months ago

Related:
  • StyleDrop: Text-To-Image Generation in Any Style : StyleDrop is a technology that generates images in any desired style using text-to-image transformer, Muse. The technology captures nuances of user-pr...
  • Civitai | Stable Diffusion models, embeddings, hypernetworks and more : The text introduces Civitai, a platform that allows users to share and discover resources for creating AI art. Civitai provides users with custom mode...
  • BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing : BLIP-Diffusion is a new model for generating and editing images based on text prompts and subject images. Unlike previous models, it uses a pre-traine...
  • SD Compendium : Extensive wiki about Stable diffusion.
  • IP-Adapter: The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. : IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. An IP-Adapter ...

VAE / Asymmetric_VQGAN
https://github.com/buxiangzhiren/Asymmetric_VQGAN

  • stable_diffusion
  • vae

We propose Asymmetric VQGAN to preserve the information in the conditional image input. Compared with the original VQGAN, Asymmetric VQGAN involves two core designs. First, we introduce a conditional branch into the decoder of the VQGAN, which handles the conditional input for image manipulation tasks. Second, we design a larger decoder for VQGAN to better recover the details lost in quantization.
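A conceptual sketch of the asymmetric decoder with toy shapes: alongside the quantized codes, a second branch receives the conditional (e.g. masked) image, so manipulation tasks can preserve input details. A real decoder injects the condition at multiple scales while upsampling.

```python
# Conceptual sketch of the asymmetric decoder: besides the quantized codes it
# receives the conditional image through a second branch, so inpainting can
# copy unmasked pixels instead of reconstructing them from codes alone.
import torch
import torch.nn as nn

class AsymmetricDecoder(nn.Module):
    def __init__(self, code_ch=4, cond_ch=3, hidden=64, out_ch=3):
        super().__init__()
        self.code_branch = nn.Conv2d(code_ch, hidden, 3, padding=1)
        self.cond_branch = nn.Conv2d(cond_ch, hidden, 3, padding=1)  # new branch
        self.fuse = nn.Sequential(nn.SiLU(), nn.Conv2d(hidden, out_ch, 3, padding=1))

    def forward(self, z_q, cond_image):
        # cond_image must match the latent resolution in this toy version; a
        # real decoder injects the condition at multiple scales while upsampling.
        return self.fuse(self.code_branch(z_q) + self.cond_branch(cond_image))

dec = AsymmetricDecoder()
out = dec(torch.randn(1, 4, 64, 64), torch.randn(1, 3, 64, 64))
```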

6 months ago

Related:
  • StableStudio: Community interface for generative AI : Stability AI has released an open-source version of its DreamStudio text-to-image consumer application called StableStudio. The company intends to wor...
  • GLIGEN: Open-Set Grounded Text-to-Image Generation : GLIGEN is a new approach that enhances text-to-image generation models by allowing them to be conditioned on additional grounding inputs. It achieves ...
  • Ainodes engine : aiNodes is a Python-based AI image/motion picture generator node engine that facilitates creativity in the creation of images and videos. The engine i...
  • Stable diffusion performance on different GPUs :
  • A dockerized, CPU-only AUTOMATIC1111 Web UI. : The Stable Diffusion Web UI has been dockerized in a self-contained, CPU-only version that includes all necessary dependencies and weighs 9.7GiB. The ...

