# Simple VQGAN CLIP

Author: @ErwannMillon

This is a very simple VQGAN-CLIP implementation that was built as part of the Face Editor project. This simplified version allows you to generate or edit images using text with just three lines of code. For a more full-featured implementation with masking, more advanced losses, and a full GUI, check out the Face Editor project.

By default this uses a CelebA checkpoint (for generating/editing faces), but it also has an ImageNet checkpoint that can be loaded by specifying `vqgan_config` and `vqgan_checkpoint` when instantiating `VQGAN_CLIP`.

The learning rate and number of iterations can be set by modifying `vqgan_clip.lr` and `vqgan_clip.iterations`.
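
For example, here is a minimal sketch of loading the ImageNet checkpoint and tweaking the optimization settings. The config/checkpoint file names below are placeholders; use the actual paths from the downloaded `model_checkpoints` directory:

```python
from VQGAN_CLIP import VQGAN_CLIP

# Placeholder file names: point these at the ImageNet config/checkpoint
# included in the downloaded model_checkpoints repo.
vqgan_clip = VQGAN_CLIP(
    vqgan_config="./model_checkpoints/imagenet.yaml",
    vqgan_checkpoint="./model_checkpoints/imagenet.ckpt",
)

# Tune the optimization settings before generating.
vqgan_clip.lr = 0.1
vqgan_clip.iterations = 25
```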

You can edit images by passing `image_path` to the `generate` function. See the `generate` function's docstring to learn more about how to format prompts.
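
As a hypothetical illustration of weighted prompts (the dict format below is an assumption; the actual accepted schema is documented in the `generate` docstring):

```python
from VQGAN_CLIP import VQGAN_CLIP

vqgan_clip = VQGAN_CLIP()

# Assumed prompts-with-weights format; confirm against the generate docstring.
vqgan_clip.generate(
    pos_prompts={
        "prompts": ["a picture of a smiling woman", "a photorealistic portrait"],
        "weights": [1.0, 0.5],
    },
)
```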

## Usage

The easiest way to test this out is by using the Colab demo.

To install locally:

- Clone this repo
- Install git-lfs (Ubuntu: `sudo apt-get install git-lfs`, macOS: `brew install git-lfs`)

In the root of the repo, run:

```bash
conda create -n vqganclip python=3.8
conda activate vqganclip
git-lfs install
git clone https://huggingface.co/datasets/erwann/face_editor_model_ckpt model_checkpoints
pip install -r requirements.txt
```

## Generate new images

```python
from VQGAN_CLIP import VQGAN_CLIP

vqgan_clip = VQGAN_CLIP()
vqgan_clip.generate("a picture of a smiling woman")
```

## Edit an image

To get a test image, run:

```bash
git clone https://huggingface.co/datasets/erwann/vqgan-clip-pic test_images
```

To edit:

```python
from VQGAN_CLIP import VQGAN_CLIP

vqgan_clip = VQGAN_CLIP()

vqgan_clip.lr = 0.07
vqgan_clip.iterations = 15
vqgan_clip.generate(
    pos_prompts=["a picture of a beautiful asian woman", "a picture of a woman from Japan"],
    neg_prompts=["a picture of an Indian person", "a picture of a white person"],
    image_path="./test_images/face.jpeg",
    show_intermediate=True,
    save_intermediate=True,
)
```

## Make an animation from the most recent generation

```python
vqgan_clip.make_animation()
```
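
Since the animation is assembled from the intermediate frames of the latest run, you will likely need to pass `save_intermediate=True` to `generate` (as in the editing example above) before calling `make_animation`.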

## Features:

- Positive and negative prompts
- Multiple prompts
- Prompt weights
- Creating GIF animations of the transformations
- Wandb logging