Commit ed45558
Parent(s): 942ef5e
Update README.md
README.md
CHANGED
@@ -12,14 +12,14 @@ tags:
 
 # Stable Diffusion XL 1.0 TensorRT
 
-
+## Introduction
 
 This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** created in collaboration with [NVIDIA](https://huggingface.co/nvidia). The optimized versions give substantial improvements in speed and efficiency.
 
 
 
 
-
+## Model Description
 
 - **Developed by:** Stability AI
 - **Model type:** Diffusion-based text-to-image generative model
@@ -27,7 +27,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 - **Model Description:** This is a conversion of the [SDXL base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [SDXL refiner 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0) models for [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt) optimized inference
 
 
-
+## Performance Comparison
 
 #### Timings for 30 steps at 1024x1024
 
@@ -37,7 +37,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 | A100 | 3704 ms | 2742 ms | ~26% |
 | H100 | 2496 ms | 1471 ms | ~41% |
 
-#### Image throughput for 30 steps
+#### Image throughput for 30 steps at 1024x1024
 
 | Accelerator | Baseline (non-optimized) | NVIDIA TensorRT (optimized) | Percentage improvement |
 |-------------|--------------------------|-----------------------------|------------------------|
@@ -46,7 +46,7 @@ This repository hosts the TensorRT versions of **Stable Diffusion XL 1.0** creat
 | H100 | 0.40 images/sec | 0.68 images/sec | ~70% |
 
 
-
+## Usage Example
 
 1. Following the [setup instructions](https://github.com/rajeevsrao/TensorRT/blob/release/8.6/demo/Diffusion/README.md) for TensorRT on launching a TensorRT NGC container.
 ```shell
@@ -70,8 +70,7 @@ cd ..
 python3 -m pip install --upgrade pip
 python3 -m pip install --upgrade tensorrt
 
-
-cd $TRT_OSSPATH/demo/Diffusion
+cd demo/Diffusion
 pip3 install -r requirements.txt
 ```
 
@@ -79,7 +78,6 @@ pip3 install -r requirements.txt
 ```
 python3 demo_txt2img_xl.py \
 "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k" \
---hf-token=<Your HF TOKEN> \
 --build-static-batch \
 --use-cuda-graph \
 --num-warmup-runs 1 \
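
The "Percentage improvement" column in the performance tables shown in this diff can be sanity-checked from the raw numbers. The README does not state the formula, so the sketch below assumes the latency figures are a relative reduction against the baseline and the throughput figures are a relative gain over the baseline. The helper names `latency_reduction_pct` and `throughput_gain_pct` are hypothetical and do not come from the repository.

```python
# Minimal sketch: recompute the "Percentage improvement" column from the
# rows visible in this diff. The formulas are assumptions, not taken from
# the README itself.


def latency_reduction_pct(baseline_ms: float, optimized_ms: float) -> int:
    """Relative drop in latency versus the baseline, rounded to a whole percent."""
    return round((baseline_ms - optimized_ms) / baseline_ms * 100)


def throughput_gain_pct(baseline_ips: float, optimized_ips: float) -> int:
    """Relative rise in images/sec versus the baseline, rounded to a whole percent."""
    return round((optimized_ips - baseline_ips) / baseline_ips * 100)


if __name__ == "__main__":
    # Timings for 30 steps at 1024x1024 (values copied from the table rows).
    print("A100 latency:", latency_reduction_pct(3704, 2742), "%")
    print("H100 latency:", latency_reduction_pct(2496, 1471), "%")
    # Image throughput for 30 steps at 1024x1024 (only the H100 row is visible).
    print("H100 throughput:", throughput_gain_pct(0.40, 0.68), "%")
```

Under these assumptions the recomputed values agree with the approximate figures in the tables (~26%, ~41%, ~70%). The two columns are not directly comparable, because one measures a reduction and the other a gain.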