InstantID for StableDiffusion 1.5

This is an unofficial implementation of InstantID for StableDiffusion 1.5.
SD1.5 has many fine-tuned models, so you can combine any of them with the InstantID components to get good results.

The official InstantID works only with SDXL and provides only inference code.
This repository contains both training and inference code.
Training used only 10M images from the LAION-FACE 50M dataset (the original InstantID used 50M LAION-Face images plus 10M custom images).
Feel free to adapt it for your own purposes. I will be glad if somebody finds it useful.

Examples

Examples with epiCPhotoGasm model + styles from original InstantID.

Examples with Disney Pixar Cartoon Type A model + styles from original InstantID.

InstantID SD1.5 components are not compatible with InstantID SDXL. In this work, the model was trained with additional facial keypoint information.
Keypoints visualization:

It is also possible to transfer different keypoints from other images.

Links:

How to use:

Clone this repo and install requirements.

git clone https://github.com/TheDenk/InstantID-SD1.5.git

cd InstantID-SD1.5 
pip install -r requirements.txt

Download models

Clone StableDiffusion 1.5 into the models directory:

git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 ./models/stable-diffusion-v1-5

Clone the InstantID-SD1.5 models from HuggingFace:

git clone https://huggingface.co/TheDenk/InstantID-SD1.5 ./models/instantid-components

Download the antelopev2 archive from this post, move it to the models directory and unzip it.

The folder tree should look like this:

  .
  ├── models
  │   ├── stable-diffusion-v1-5/*
  │   ├── antelopev2/*.onnx
  │   ├── instantid-components/*.ckpt
  │   └── additional-unets/*.safetensors (optional)  
  ├── instantid
  ├── gradio
  ├── inference.py
  ├── inference.ipynb
  └── README.md
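
Before running inference, it can help to confirm that the model files are where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the glob patterns are copied from the tree above.

```python
from pathlib import Path

# Expected entries, copied from the folder tree above
# (glob patterns relative to the repository root).
REQUIRED = [
    "models/stable-diffusion-v1-5/*",
    "models/antelopev2/*.onnx",
    "models/instantid-components/*.ckpt",
]
OPTIONAL = ["models/additional-unets/*.safetensors"]


def missing_entries(repo_root, patterns=REQUIRED):
    """Return the patterns that match nothing under repo_root."""
    root = Path(repo_root)
    return [p for p in patterns if not any(root.glob(p))]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required model files found.")
```

Run it from the repository root. Pass `OPTIONAL` as the second argument to check the optional additional UNets as well.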

Run inference.py example

SIMPLE RUN

CUDA_VISIBLE_DEVICES="0" python3 inference.py \
    --image_path=examples/faces/rock.jpg \
    --prompt="the professional high quality photo of the man, high quality, best quality, masterpiece" \
    --style="Film Noir" \
    --height=640 \
    --width=768 \
    --num_inference_steps=25 \
    --guidance_scale=8.0 \
    --num_images_per_prompt=4

SELECT MODELS

CUDA_VISIBLE_DEVICES="0" python3 inference.py --pretrained_model_path=models/stable-diffusion-v1-5 \
    --adapter_ckpt_path=models/instantid-components/ip-state.ckpt \
    --image_proj_ckpt_path=models/instantid-components/image_proj.ckpt \
    --controlnet_ckpt_path=models/instantid-components/controlnet.ckpt \
    --additional_unet_path=models/additional-unets/epicphotogasm_lastUnicorn.safetensors \
    --image_path=examples/faces/rock.jpg \
    --prompt="the professional high quality photo of the man, best quality, masterpiece" \
    --style="Film Noir" \
    --height=640 \
    --width=768 \
    --num_inference_steps=25 \
    --guidance_scale=8.0 \
    --num_images_per_prompt=4
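
If you want to launch several runs from Python (for example, over a list of face images), the command line above can be assembled programmatically. This is only a sketch: `build_command` and the `--run` switch are hypothetical helpers of this example, and it uses only the `inference.py` flags shown above with the same default values.

```python
import subprocess
import sys


def build_command(image_path, prompt, style="Film Noir", **overrides):
    """Build an inference.py command line from the flags shown above."""
    options = {
        "image_path": image_path,
        "prompt": prompt,
        "style": style,
        "height": 640,
        "width": 768,
        "num_inference_steps": 25,
        "guidance_scale": 8.0,
        "num_images_per_prompt": 4,
    }
    options.update(overrides)  # e.g. height=512 or additional flags
    return [sys.executable, "inference.py"] + [f"--{k}={v}" for k, v in options.items()]


if __name__ == "__main__":
    cmd = build_command(
        "examples/faces/rock.jpg",
        "the professional high quality photo of the man",
    )
    print(" ".join(cmd))  # inspect the command first
    # "--run" is a switch of this sketch only, not an inference.py flag.
    if "--run" in sys.argv:
        subprocess.run(cmd, check=True)
```

The sketch does not set `CUDA_VISIBLE_DEVICES`; export it in your shell or pass an `env` argument to `subprocess.run` if you need to pin a GPU.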

Run gradio demo

SIMPLE RUN

CUDA_VISIBLE_DEVICES="0" python3 gradio/app.py

SELECT MODELS

CUDA_VISIBLE_DEVICES="0" python3 gradio/app.py --pretrained_model_path=models/stable-diffusion-v1-5 \
    --adapter_ckpt_path=models/instantid-components/ip-state.ckpt \
    --image_proj_ckpt_path=models/instantid-components/image_proj.ckpt \
    --controlnet_ckpt_path=models/instantid-components/controlnet.ckpt \
    --additional_unet_path=models/additional-unets/epicphotogasm_lastUnicorn.safetensors

Alternatively, use the code in the Jupyter notebook (inference.ipynb).

Training

All models were trained for 780K steps on 3 A6000 GPUs with batch_size=20, resolution=512 and lr=1e-5, using only 10M images from the LAION-FACE dataset.
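
To get a feel for how much data these numbers correspond to, here is a rough back-of-the-envelope calculation. It assumes that batch_size=20 is the per-GPU batch size, which the text does not state explicitly.

```python
# Figures quoted in the text above.
steps = 780_000
num_gpus = 3
batch_size = 20  # ASSUMPTION: per-GPU batch size, not the global one
dataset_size = 10_000_000

samples_seen = steps * num_gpus * batch_size
passes = samples_seen / dataset_size

print(f"samples seen: {samples_seen:,}")                   # samples seen: 46,800,000
print(f"approximate passes over the data: {passes:.2f}")   # approximate passes over the data: 4.68
```

If batch_size=20 is instead the global batch size, the same arithmetic gives 15.6M samples, or about 1.56 passes over the 10M images.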

Steps for training:

1 Download the data from LAION-FACE and prepare the images following the official instructions.

2 Filter the dataset with the `process_laion_dataset.py` script. It uses multiprocessing to speed up processing. Example:

CUDA_VISIBLE_DEVICES="0" python3 process_laion_dataset.py \
    --data_root={DATASET_ROOT} \
    --split_name=split_00000 \
    --n_jobs=4

Replace {DATASET_ROOT} with the path to your LAION-Face dataset, for example ../LAION-Face.
The script creates four directories in {DATASET_ROOT}: extracted_images, extracted_keypoints, embeddings, csv.

extracted_images contains filtered and resized *.jpg images.
extracted_keypoints contains *.jpg images with facial landmarks.
embeddings contains *.pt files with extracted facial embeddings, landmarks, boxes and some other information.
csv contains *.csv files with the paths of the filtered images and their text descriptions.

The folder tree should look like this:

  .
  └──{DATASET_ROOT}
      ├── extracted_images/*.jpg
      ├── extracted_keypoints/*.jpg
      ├── embeddings/*.pt
      └── csv/*.csv
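
After processing a split, a quick file count per output directory shows whether the script produced what you expect. This is a minimal sketch, not part of the repository: `count_outputs` is a hypothetical helper, and it searches recursively because the exact nesting inside each directory is not described here.

```python
from pathlib import Path

# Output directories and file extensions as listed above.
EXPECTED = {
    "extracted_images": ".jpg",
    "extracted_keypoints": ".jpg",
    "embeddings": ".pt",
    "csv": ".csv",
}


def count_outputs(dataset_root):
    """Count files with the expected extension in each output directory."""
    root = Path(dataset_root)
    counts = {}
    for name, ext in EXPECTED.items():
        folder = root / name
        # A missing directory is reported as 0 rather than raising an error.
        counts[name] = sum(1 for _ in folder.rglob(f"*{ext}")) if folder.is_dir() else 0
    return counts


if __name__ == "__main__":
    for name, n in count_outputs("../LAION-Face").items():
        print(f"{name}: {n} files")
```

A large mismatch between the extracted_images, extracted_keypoints and embeddings counts may point to an interrupted run.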

The script also skips images that are too small or that contain faces that are too small.
You can control this with the min_h, min_w and min_head_coef parameters. The defaults are min_head_coef=0.3, min_h=512 and min_w=512.
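
The thresholds above can be pictured with the sketch below. It is only an illustration, not the code from the script: `keep_image` is a hypothetical function, and it assumes that min_head_coef is compared with the face box height divided by the image height. The actual script may define this ratio differently.

```python
def keep_image(img_h, img_w, face_box, min_h=512, min_w=512, min_head_coef=0.3):
    """Return True if the image and its face pass the size thresholds.

    ASSUMPTION: the head coefficient is face-box height / image height;
    the real script may compute it in another way.
    face_box is (x1, y1, x2, y2) in pixels.
    """
    if img_h < min_h or img_w < min_w:
        return False  # image too small
    x1, y1, x2, y2 = face_box
    head_coef = (y2 - y1) / img_h
    return head_coef >= min_head_coef


# A 200 px face in a 512x512 image: 200 / 512 is about 0.39, so it is kept.
print(keep_image(512, 512, (100, 100, 300, 300)))
# The same face in a 1024x1024 image: 200 / 1024 is about 0.20, so it is skipped.
print(keep_image(1024, 1024, (100, 100, 300, 300)))
```

Lowering min_head_coef keeps more images with small faces, at the cost of less facial detail per training sample.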

3 Run train.py:

CUDA_VISIBLE_DEVICES="0" accelerate launch train.py \
 --dataset_root="{DATASET_ROOT}" \
 --pretrained_model_name_or_path="./models/stable-diffusion-v1-5" \
 --output_dir="./output/instant_training" \
 --resolution=512 \
 --learning_rate=1e-5 \
 --validation_prompt "the professional photo of a beautiful girl, high resolution, awesome detailed, 4k, 8k" "beautiful redhead girl, high resolution, awesome detailed, 4k, 8k" \
 --validation_negative_prompt "lowres, worst quality, low quality" "lowres, worst quality, low quality" \
 --validation_image "./examples/valid/valid_keypoints.png" \
 --valid_embeddings "./examples/valid/valid_embeddings.pt" \
 --train_batch_size=20 \
 --dataloader_num_workers=32 \
 --validation_steps=2500 \
 --num_validation_images=4 \
 --num_train_epochs=1 \
 --checkpointing_steps=5000 \
 --mixed_precision=bf16

The validation image was taken from the LAION-Face dataset (a random image together with its extracted data).

More examples

Using only models without special style prompts.

Examples with Aniflatmix model + styles from original InstantID.

Acknowledgements

InstantID and InstantX Team.
IP-Adapter and ControlNet.
