Commit 35d3f0e

Merge branch 'lstein:main' into main
2 parents 84c1034 + 0433b3d commit 35d3f0e

11 files changed

Lines changed: 333 additions & 279 deletions

README-Mac-MPS.md

Lines changed: 61 additions & 60 deletions
Original file line number | Diff line number | Diff line change
@@ -1,20 +1,19 @@
11
# Apple Silicon Mac Users
22

33
Several people have gotten Stable Diffusion to work on Apple Silicon
4-
Macs using Anaconda. I've gathered up most of their instructions and
5-
put them in this fork (and readme). I haven't tested anything besides
6-
Anaconda, and I've read about issues with things like miniforge, so if
7-
you have an issue that isn't dealt with in this fork then head on over
8-
to the [Apple
9-
Silicon](https://github.com/CompVis/stable-diffusion/issues/25) issue
10-
on GitHub (that page is so long that GitHub hides most of it by
11-
default, so you need to find the hidden part and expand it to view the
12-
whole thing). This fork would not have been possible without the work
13-
done by the people on that issue.
4+
Macs using Anaconda, miniforge, etc. I've gathered up most of their instructions and
5+
put them in this fork (and readme). Things have moved really fast and so these
6+
instructions change often. Hopefully things will settle down a little.
7+
8+
There are several places where people are discussing Apple
9+
MPS functionality: [the original CompVis
10+
issue](https://github.com/CompVis/stable-diffusion/issues/25), and generally on
11+
[lstein's fork](https://github.com/lstein/stable-diffusion/).
1412

1513
You have to have macOS 12.3 Monterey or later. Anything earlier than that won't work.
1614

17-
BTW, I haven't tested any of this on Intel Macs.
15+
BTW, I haven't tested any of this on Intel Macs but I have read that one person
16+
got it to work.
1817

1918
How to:
2019

@@ -27,38 +26,41 @@ ln -s /path/to/ckpt/sd-v1-1.ckpt models/ldm/stable-diffusion-v1/model.ckpt
2726
2827
conda env create -f environment-mac.yaml
2928
conda activate ldm
29+
30+
python scripts/preload_models.py
31+
python scripts/orig_scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
3032
```
3133

32-
These instructions are identical to the main repo except I added
33-
environment-mac.yaml because Mac doesn't have cudatoolkit.
34+
We have not gotten lstein's dream.py to work yet.
3435

3536
After you follow all the instructions and run txt2img.py you might get several errors. Here are the errors I've seen and found solutions for.
3637

38+
### Is it slow?
39+
40+
Be sure to specify 1 sample and 1 iteration.
41+
42+
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
43+
3744
### Doesn't work anymore?
3845

39-
We are using PyTorch nightly, which includes support for MPS. I don't
40-
know exactly how Anaconda does updates, but I woke up one morning and
41-
Stable Diffusion crashed and I couldn't think of anything I did that
42-
would've changed anything the night before, when it worked. A day and
43-
a half later I finally got it working again. I don't know what changed
44-
overnight. PyTorch-nightly changes overnight but I'm pretty sure I
45-
didn't manually update it. Either way, things are probably going to be
46-
bumpy on Apple Silicon until PyTorch releases a firm version that we
47-
can lock to.
46+
PyTorch nightly includes support for MPS. Because of this, this setup is
47+
inherently unstable. One morning I woke up and it no longer worked no matter
48+
what I did until I switched to miniforge. However, I have another Mac that works
49+
just fine with Anaconda. If you can't get it to work, please search a little
50+
first because many of the errors will get posted and solved. If you can't find
51+
a solution please [create an issue](https://github.com/lstein/stable-diffusion/issues).
4852

49-
To manually update to the latest version of PyTorch nightly (which could fix issues), run this command.
53+
One debugging step is to update to the latest version of PyTorch nightly.
5054

5155
conda install pytorch torchvision torchaudio -c pytorch-nightly
5256

53-
## Debugging?
57+
Or you can clean everything up.
5458

55-
Tired of waiting for your renders to finish before you can see if it
56-
works? Reduce the steps! The picture wont look like anything but if it
57-
finishes, hey, it works! This could also help you figure out if you've
58-
got a memory problem, because I'm betting 1 step doesn't use much
59-
memory.
59+
conda clean --yes --all
60+
61+
Or you can reset Anaconda.
6062

61-
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 1
63+
conda update --force-reinstall -y -n base -c defaults conda
6264

6365
### "No module named cv2" (or some other module)
6466

@@ -83,6 +85,23 @@ globally.
8385

8486
You might also need to install Rust (I mention this again below).
8587

88+
89+
### Debugging?
90+
91+
Tired of waiting for your renders to finish before you can see if it
92+
works? Reduce the steps! The image quality will be horrible but at least you'll
93+
get quick feedback.
94+
95+
python ./scripts/txt2img.py --prompt "ocean" --ddim_steps 5 --n_samples 1 --n_iter 1
96+
97+
### MAC: torch._C' has no attribute '_cuda_resetPeakMemoryStats' #234
98+
99+
We haven't gotten dream.py to work on Mac yet.
100+
101+
### OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'...
102+
103+
python scripts/preload_models.py
104+
86105
### "The operator [name] is not current implemented for the MPS device." (sic)
87106

88107
Example error.
@@ -92,9 +111,7 @@ Example error.
92111
NotImplementedError: The operator 'aten::index.Tensor' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on [https://github.com/pytorch/pytorch/issues/77764](https://github.com/pytorch/pytorch/issues/77764). As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
93112
```
94113

95-
Just do what it says:
96-
97-
export PYTORCH_ENABLE_MPS_FALLBACK=1
114+
The lstein branch includes this fix in [environment-mac.yaml](https://github.com/lstein/stable-diffusion/blob/main/environment-mac.yaml).
98115

99116
### "Could not build wheels for tokenizers"
100117

@@ -104,14 +121,17 @@ I have not seen this error because I had Rust installed on my computer before I
104121

105122
### How come `--seed` doesn't work?
106123

124+
First this:
125+
107126
> Completely reproducible results are not guaranteed across PyTorch
108127
releases, individual commits, or different platforms. Furthermore,
109128
results may not be reproducible between CPU and GPU executions, even
110129
when using identical seeds.
111130

112131
[PyTorch docs](https://pytorch.org/docs/stable/notes/randomness.html)
113132

114-
There is an [open issue](https://github.com/pytorch/pytorch/issues/78035) (as of August 2022) in pytorch regarding gradient inconsistency. I am guessing that's what is causing this.
133+
Second, we might have a fix that at least makes results somewhat consistent for a given seed. We're
134+
still working on it.
115135

116136
### libiomp5.dylib error?
117137

@@ -137,6 +157,8 @@ sort). [There's more
137157
suggestions](https://stackoverflow.com/questions/53014306/error-15-initializing-libiomp5-dylib-but-found-libiomp5-dylib-already-initial),
138158
like uninstalling tensorflow and reinstalling. I haven't tried them.
139159

160+
Since I switched to miniforge I haven't seen the error.
161+
140162
### Not enough memory.
141163

142164
This seems to be a common problem and is probably the underlying
@@ -174,10 +196,10 @@ Actually, this could be happening because there's not enough RAM. You could try
174196

175197
### My images come out black
176198

177-
I haven't solved this issue. I just throw away my black
178-
images. There's a [similar
179-
issue](https://github.com/CompVis/stable-diffusion/issues/69) on CUDA
180-
GPU's where the images come out green. Maybe it's the same issue?
199+
We might have this fixed; we are still testing.
200+
201+
There's a [similar issue](https://github.com/CompVis/stable-diffusion/issues/69)
202+
on CUDA GPUs where the images come out green. Maybe it's the same issue?
181203
Someone in that issue says to use "--precision full", but this fork
182204
actually disables that flag. I don't know why, someone else provided
183205
that code and I don't know what it does. Maybe the `model.half()`
@@ -204,25 +226,4 @@ What? Intel? On an Apple Silicon?
204226
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
205227
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
206228

207-
This fixed it for me:
208-
209-
conda clean --yes --all
210-
211-
### Still slow?
212-
213-
I changed the defaults of n_samples and n_iter to 1 so that it uses
214-
less RAM and makes less images so it will be faster the first time you
215-
use it. I don't actually know what n_samples does internally, but I
216-
know it consumes a lot more RAM. The n_iter flag just loops around the
217-
image creation code, so it shouldn't consume more RAM (it should be
218-
faster if you're going to do multiple images because the libraries and
219-
model will already be loaded--use a prompt file to get this speed
220-
boost).
221-
222-
These flags are the default sample and iter settings in this fork/branch:
223-
224-
~~~~
225-
python scripts/txt2img.py --prompt "ocean" --n_samples=1 --n_iter=1
226-
~~~
227-
228-
229+
This was actually the issue that I couldn't solve until I switched to miniforge.
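
The `PYTORCH_ENABLE_MPS_FALLBACK=1` setting quoted in the "operator is not current implemented" error above can also be applied from Python instead of the shell. The following is only a minimal sketch, not code from this fork: `enable_mps_fallback` and `pick_device` are hypothetical helper names, and setting the variable before `torch` is imported is assumed here to be the safe order.

```python
import importlib.util
import os


def enable_mps_fallback():
    """Set the CPU-fallback flag unless the user already set it (hypothetical helper)."""
    os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")
    return os.environ["PYTORCH_ENABLE_MPS_FALLBACK"]


def pick_device():
    """Return "mps" when torch reports MPS is available, else "cpu" (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # torch is not installed in this environment
    import torch

    # Older torch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("fallback flag:", enable_mps_fallback())
    print("device:", pick_device())
```

Falling back to `"cpu"` keeps the script usable on machines where MPS is missing, at the cost of speed.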

README.md

Lines changed: 1 addition & 1 deletion
@@ -605,7 +605,7 @@ This will bring your local copy into sync with the remote one.
605605

606606
## Macintosh
607607

608-
See (README-Mac-MPS)[README-Mac-MPS.md] for instructions.
608+
See [README-Mac-MPS](README-Mac-MPS.md) for instructions.
609609

610610
# Simplified API for text to image generation
611611

configs/stable-diffusion/v1-finetune.yaml

Lines changed: 4 additions & 3 deletions
@@ -52,7 +52,7 @@ model:
5252
ddconfig:
5353
double_z: true
5454
z_channels: 4
55-
resolution: 512
55+
resolution: 256
5656
in_channels: 3
5757
out_ch: 3
5858
ch: 128
@@ -74,7 +74,7 @@ data:
7474
target: main.DataModuleFromConfig
7575
params:
7676
batch_size: 1
77-
num_workers: 16
77+
num_workers: 2
7878
wrap: false
7979
train:
8080
target: ldm.data.personalized.PersonalizedBase
@@ -105,4 +105,5 @@ lightning:
105105

106106
trainer:
107107
benchmark: True
108-
max_steps: 6100
108+
max_steps: 4000
109+

ldm/dream/image_util.py

Lines changed: 22 additions & 1 deletion
@@ -1,3 +1,4 @@
1+
from math import sqrt, floor, ceil
12
from PIL import Image
23

34
class InitImageResizer():
@@ -49,6 +50,26 @@ def resize(self,width=None,height=None) -> Image:
4950
new_image = Image.new('RGB',(width,height))
5051
new_image.paste(resized_image,((width-rw)//2,(height-rh)//2))
5152

53+
print(f'>> Resized image size to {width}x{height}')
54+
5255
return new_image
5356

54-
57+
def make_grid(image_list, rows=None, cols=None):
58+
image_cnt = len(image_list)
59+
if None in (rows, cols):
60+
rows = floor(sqrt(image_cnt)) # try to make it square
61+
cols = ceil(image_cnt / rows)
62+
width = image_list[0].width
63+
height = image_list[0].height
64+
65+
grid_img = Image.new('RGB', (width * cols, height * rows))
66+
i = 0
67+
for r in range(0, rows):
68+
for c in range(0, cols):
69+
if i >= len(image_list):
70+
break
71+
grid_img.paste(image_list[i], (c * width, r * height))
72+
i = i + 1
73+
74+
return grid_img
75+
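
As a quick check of the default layout in `make_grid` above: when `rows` or `cols` is omitted, the function takes `rows = floor(sqrt(n))` and `cols = ceil(n / rows)`. The sketch below isolates that arithmetic in a hypothetical helper, `grid_shape`, which is not part of the commit. The guard for an empty list is also an addition, since the committed code would divide by zero in that case.

```python
from math import sqrt, floor, ceil


def grid_shape(image_cnt):
    """Mirror make_grid's default layout; hypothetical helper, not in the commit."""
    if image_cnt < 1:
        # Added guard: floor(sqrt(0)) is 0, so the division below would fail.
        raise ValueError("need at least one image")
    rows = floor(sqrt(image_cnt))  # try to make it square
    cols = ceil(image_cnt / rows)  # enough columns to hold every image
    return rows, cols


if __name__ == "__main__":
    for n in (1, 4, 5, 10):
        print(n, grid_shape(n))
```

Because `rows` is rounded down and `cols` rounded up, the grid always has at least `image_cnt` cells; any cells left over stay black in the pasted image.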

ldm/dream/pngwriter.py

Lines changed: 20 additions & 73 deletions
@@ -2,95 +2,42 @@
22
Two helper classes for dealing with PNG images and their path names.
33
PngWriter -- Converts Images generated by T2I into PNGs, finds
44
appropriate names for them, and writes prompt metadata
5-
into the PNG. Intended to be subclassable in order to
6-
create more complex naming schemes, including using the
7-
prompt for file/directory names.
5+
into the PNG.
86
PromptFormatter -- Utility for converting a Namespace of prompt parameters
97
back into a formatted prompt string with command-line switches.
108
"""
119
import os
1210
import re
13-
from math import sqrt, floor, ceil
14-
from PIL import Image, PngImagePlugin
11+
from PIL import PngImagePlugin
1512

1613
# -------------------image generation utils-----
1714

1815

1916
class PngWriter:
20-
def __init__(self, outdir, prompt=None):
17+
def __init__(self, outdir):
2118
self.outdir = outdir
22-
self.prompt = prompt
23-
self.filepath = None
24-
self.files_written = []
2519
os.makedirs(outdir, exist_ok=True)
2620

27-
def write_image(self, image, seed, upscaled=False):
28-
self.filepath = self.unique_filename(
29-
seed, upscaled, self.filepath
30-
) # will increment name in some sensible way
31-
try:
32-
prompt = f'{self.prompt} -S{seed}'
33-
self.save_image_and_prompt_to_png(image, prompt, self.filepath)
34-
except IOError as e:
35-
print(e)
36-
if not upscaled:
37-
self.files_written.append([self.filepath, seed])
38-
39-
def unique_filename(self, seed, upscaled=False, previouspath=None):
40-
revision = 1
41-
42-
if previouspath is None:
43-
# sort reverse alphabetically until we find max+1
44-
dirlist = sorted(os.listdir(self.outdir), reverse=True)
45-
# find the first filename that matches our pattern or return 000000.0.png
46-
filename = next(
47-
(f for f in dirlist if re.match('^(\d+)\..*\.png', f)),
48-
'0000000.0.png',
49-
)
50-
basecount = int(filename.split('.', 1)[0])
51-
basecount += 1
52-
filename = f'{basecount:06}.{seed}.png'
53-
return os.path.join(self.outdir, filename)
54-
55-
else:
56-
basename = os.path.basename(previouspath)
57-
x = re.match('^(\d+)\..*\.png', basename)
58-
if not x:
59-
return self.unique_filename(seed, upscaled, previouspath)
60-
61-
basecount = int(x.groups()[0])
62-
series = 0
63-
finished = False
64-
while not finished:
65-
series += 1
66-
filename = f'{basecount:06}.{seed}.png'
67-
path = os.path.join(self.outdir, filename)
68-
finished = not os.path.exists(path)
69-
return os.path.join(self.outdir, filename)
70-
71-
def save_image_and_prompt_to_png(self, image, prompt, path):
21+
# gives the next unique prefix in outdir
22+
def unique_prefix(self):
23+
# sort reverse alphabetically until we find max+1
24+
dirlist = sorted(os.listdir(self.outdir), reverse=True)
25+
# find the first filename that matches our pattern, or fall back to '0000000.0.png'
26+
existing_name = next(
27+
(f for f in dirlist if re.match(r'^(\d+)\..*\.png', f)),
28+
'0000000.0.png',
29+
)
30+
basecount = int(existing_name.split('.', 1)[0]) + 1
31+
return f'{basecount:06}'
32+
33+
# saves image named _image_ to outdir/name, writing metadata from prompt
34+
# returns full path of output
35+
def save_image_and_prompt_to_png(self, image, prompt, name):
36+
path = os.path.join(self.outdir, name)
7237
info = PngImagePlugin.PngInfo()
7338
info.add_text('Dream', prompt)
7439
image.save(path, 'PNG', pnginfo=info)
75-
76-
def make_grid(self, image_list, rows=None, cols=None):
77-
image_cnt = len(image_list)
78-
if None in (rows, cols):
79-
rows = floor(sqrt(image_cnt)) # try to make it square
80-
cols = ceil(image_cnt / rows)
81-
width = image_list[0].width
82-
height = image_list[0].height
83-
84-
grid_img = Image.new('RGB', (width * cols, height * rows))
85-
i = 0
86-
for r in range(0, rows):
87-
for c in range(0, cols):
88-
if i>=len(image_list):
89-
break
90-
grid_img.paste(image_list[i], (c * width, r * height))
91-
i = i + 1
92-
93-
return grid_img
40+
return path
9441

9542

9643
class PromptFormatter:
