sw-yx / prompt-eng
- Wednesday, October 19, 2022 at 00:36:51
notes for prompt engineering
Table of Contents
The more advanced GPT3 reads have been split out to https://github.com/sw-yx/prompt-eng/blob/main/GPT.md
misc
stable diffusion specific notes
Required reading:
Main Stable Diffusion repo: https://github.com/CompVis/stable-diffusion
Name/Link | Stars | Description |
---|---|---|
AUTOMATIC1111 | 9700 | The most well known fork. features: https://github.com/AUTOMATIC1111/stable-diffusion-webui#features launch announcement https://www.reddit.com/r/StableDiffusion/comments/x28a76/stable_diffusion_web_ui/. M1 mac instructions https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Installation-on-Apple-Silicon |
Disco Diffusion | 5600 | A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations. |
sd-webui (formerly hlky fork) | 5100 | A fully-integrated and easy way to work with Stable Diffusion right from a browser window. Long list of UI and SD features (incl textual inversion, alternative samplers, prompt matrix): https://github.com/sd-webui/stable-diffusion-webui#project-features |
InvokeAI (formerly lstein fork) | 3400 | This version of Stable Diffusion features a slick WebGUI, an interactive command-line script that combines text2img and img2img functionality in a "dream bot" style interface, and multiple features and other enhancements. It runs on Windows, Mac and Linux machines, with GPU cards with as little as 4 GB of RAM. |
XavierXiao/Dreambooth-Stable-Diffusion | 2400 | Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion. Dockerized: https://github.com/smy20011/dreambooth-docker |
Basujindal: Optimized Stable Diffusion | 2100 | This repo is a modified version of the Stable Diffusion repo, optimized to use less VRAM than the original by sacrificing inference speed. txt2img, img2img, and inpainting under 2.4 GB VRAM |
stablediffusion-infinity | 1900 | Outpainting with Stable Diffusion on an infinite canvas. This project mainly works as a proof of concept. |
Waifu Diffusion (huggingface, replicate) | 1100 | stable diffusion finetuned on weeb stuff. "A model trained on danbooru (anime/manga drawing site with also lewds and nsfw on it) over 56k images. Produces FAR BETTER results if you're interested in getting manga and anime stuff out of stable diffusion." |
AbdBarho/stable-diffusion-webui-docker | 929 | Easy Docker setup for Stable Diffusion with both Automatic1111 and hlky UI included. HOWEVER - no mac support yet AbdBarho/stable-diffusion-webui-docker#35 |
fast-stable-diffusion | 753 | +25-50% speed increase + memory efficient + DreamBooth |
imaginAIry | 639 | Pythonic generation of stable diffusion images with just pip install imaginairy. "just works" on Linux and macOS (M1) (and maybe Windows). Memory efficiency improvements, prompt-based editing, face enhancement, upscaling, tiled images, img2img, prompt matrices, prompt variables, BLIP image captions, comes with dockerfile/colab. Has unit tests. (Usage sketch just below this table.) |
neonsecret/stable-diffusion | 546 | This repo is a modified version of the Stable Diffusion repo, optimized to use less VRAM than the original by sacrificing inference speed. Also I invented the sliced attention technique, which allows pushing the model's abilities even further. It works by automatically determining the slice size from your VRAM and image size and then allocating attention slice by slice accordingly. You can practically generate any image size; it just depends on the generation speed you are willing to sacrifice. (Generic sketch of the slicing idea below the table.) |
Deforum Stable Diffusion | 347 | Animating prompts with stable diffusion. replicate demo: https://replicate.com/deforum/deforum_stable_diffusion |
Doggettx/stable-diffusion | 137 | Allows using resolutions that would require up to 64x more VRAM than possible on the default CompVis build. |
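For the imaginAIry row above, a minimal usage sketch. The names `ImaginePrompt` and `imagine_image_files` are what I recall from that project's README; treat them as assumptions and verify against the imaginAIry docs before copying.

```python
# Hedged sketch of imaginAIry's Python API (names recalled from its README,
# not verified here).
from imaginairy import ImaginePrompt, imagine_image_files

prompts = [
    ImaginePrompt("a scenic mountain landscape, golden hour", seed=1),
    ImaginePrompt("a bowl of fruit, studio lighting"),
]

# Writes one image per prompt into ./outputs (output directory name is arbitrary).
imagine_image_files(prompts, outdir="./outputs")
```

If memory serves, the package also installs an `imagine` CLI entry point that takes the prompt directly on the command line.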
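The neonsecret row above describes "sliced attention": scoring queries against keys in chunks so the full seq_len x seq_len attention matrix never has to sit in VRAM at once, with the chunk size chosen from available VRAM and image size. Below is a generic PyTorch sketch of that idea, not the fork's actual code; the VRAM-based chunk-size heuristic is omitted and `slice_size` is a hypothetical parameter.

```python
import torch

def sliced_attention(q, k, v, slice_size=1024):
    """Scaled dot-product attention computed over query chunks to cap peak memory.

    q, k, v: tensors of shape (batch, seq_len, dim).
    slice_size: how many query rows to score per step (would normally be
    derived from free VRAM and image size).
    """
    scale = q.shape[-1] ** -0.5
    out = torch.empty_like(q)
    for start in range(0, q.shape[1], slice_size):
        end = start + slice_size
        # Only a (slice_size x seq_len) block of scores exists at any moment,
        # instead of the full (seq_len x seq_len) matrix.
        scores = torch.softmax((q[:, start:end] @ k.transpose(-2, -1)) * scale, dim=-1)
        out[:, start:end] = scores @ v
    return out
```

Peak memory for the score matrix drops by roughly a factor of seq_len / slice_size, paid for with a Python-level loop over the chunks.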
Dormant projects, for historical/research interest:
environment-mac.yaml from https://github.com/fragmede/stable-diffusion/blob/mps_consistent_seed/environment-mac.yaml

UIs that don't come with their own SD distro, just shelling out to one
UI Name/Link | Stars | Self-Description |
---|---|---|
ahrm/UnstableFusion | 815 | UnstableFusion is a desktop frontend for Stable Diffusion which combines image generation, inpainting, img2img and other image editing operations into a seamless workflow. https://www.youtube.com/watch?v=XLOhizAnSfQ&t=1s |
breadthe/sd-buddy | 165 | Companion desktop app for the self-hosted M1 Mac version of Stable Diffusion, with Svelte and Tauri |
leszekhanusz/diffusion-ui | 65 | This is a web interface frontend for the generation of images using diffusion models. The goal is to provide an interface to online and offline backends doing image generation and inpainting like Stable Diffusion. |
GenerationQ | 21 | GenerationQ (for "image generation queue") is a cross-platform desktop application (screens below) designed to provide a general purpose GUI for generating images via text2img and img2img models. Its primary target is Stable Diffusion but since there is such a variety of forked programs with their own particularities, the UI for configuring image generation tasks is designed to be generic enough to accommodate just about any script (even non-SD models). |
See https://github.com/sw-yx/prompt-eng/blob/main/PROMPTS.md for more details and notes
Banned from DALL-E, so switched to SD: https://twitter.com/almost_digital/status/1556216820788609025?s=20&t=GCU5prherJvKebRrv9urdw