---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-video
tags:
- normal-estimation
- video
- diffusion
- svd
---

# NormalCrafter: Video Normal Map Estimation

Mirror of [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter) hosted by [AEmotionStudio](https://huggingface.co/AEmotionStudio) for use with [ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA).

## Model Description

NormalCrafter generates **temporally consistent surface normal maps** from video using a Stable Video Diffusion (SVD) backbone fine-tuned for normal estimation. Unlike image-based methods (e.g., Marigold), NormalCrafter operates natively on video sequences, producing smooth frame-to-frame normals without flickering.
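
A surface normal map stores one unit vector per pixel. The sketch below illustrates a widely used storage convention in which each component is mapped from [-1, 1] to an 8-bit RGB channel. This convention is an assumption for illustration; the model card does not state which encoding NormalCrafter's output uses, and the helper names `encode_normals` and `decode_normals` are hypothetical.

```python
import numpy as np

def encode_normals(normals: np.ndarray) -> np.ndarray:
    """Map unit normals of shape (H, W, 3) in [-1, 1] to uint8 RGB in [0, 255]."""
    rgb = (normals + 1.0) * 0.5 * 255.0
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8)

def decode_normals(rgb: np.ndarray) -> np.ndarray:
    """Invert encode_normals and re-normalise each vector to unit length."""
    n = rgb.astype(np.float32) / 255.0 * 2.0 - 1.0
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-6, None)

# A flat surface facing the camera: every normal is (0, 0, 1).
flat = np.zeros((2, 2, 3), dtype=np.float32)
flat[..., 2] = 1.0
rgb = encode_normals(flat)          # the familiar light-blue normal-map colour
recovered = decode_normals(rgb)     # close to the input, up to 8-bit quantisation
```

Quantising to 8 bits loses a little precision, which is why `decode_normals` re-normalises the vectors before they are used.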

## Key Features

- **Video-native**: Processes temporal sequences for coherent normals across frames
- **SVD backbone**: Built on `stabilityai/stable-video-diffusion-img2vid-xt`
- **High resolution**: Supports up to 1024px inference
- **Apache-2.0 Licensed**: Free for commercial and personal use
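
One rough way to check the temporal coherence claimed above is to measure how much the normal at each pixel rotates between consecutive frames. The sketch below is only an illustrative diagnostic, not the evaluation metric from the NormalCrafter paper: the function name `mean_angular_change` is hypothetical, and the measure ignores camera and object motion, so it is only meaningful on static or near-static shots.

```python
import numpy as np

def mean_angular_change(frames: np.ndarray) -> np.ndarray:
    """frames: (T, H, W, 3) unit normals.

    Returns an array of length T-1 with the mean angle, in degrees,
    between normals at the same pixel in consecutive frames.
    """
    dots = np.sum(frames[1:] * frames[:-1], axis=-1)
    angles = np.degrees(np.arccos(np.clip(dots, -1.0, 1.0)))
    return angles.mean(axis=(1, 2))

# Three frames of a camera-facing plane; the last frame is tilted by 10 degrees.
z = np.array([0.0, 0.0, 1.0])
tilt = np.array([np.sin(np.radians(10.0)), 0.0, np.cos(np.radians(10.0))])
frames = np.stack([np.broadcast_to(v, (4, 4, 3)) for v in (z, z, tilt)])
change = mean_angular_change(frames)   # no change, then a 10-degree jump
```

Large spikes in this value on footage with little motion would suggest flicker; small, smooth values are what a temporally consistent estimator should produce.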

## Model Files

| File | Size | Description |
|------|------|-------------|
| `unet/diffusion_pytorch_model.safetensors` | 3.05 GB | Fine-tuned UNet for normal estimation |
| `image_encoder/model.fp16.safetensors` | 1.26 GB | CLIP image encoder (fp16) |
| `vae/diffusion_pytorch_model.safetensors` | 196 MB | VAE decoder |
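
The weight files listed above can be fetched individually with `huggingface_hub`. The following is a minimal sketch, assuming the file names in the table match the repository layout exactly and treating 196 MB as 0.196 GB; the helpers `total_size_gb` and `download_weights` are hypothetical names, and any configuration files a pipeline may also need are not covered here.

```python
# File names and sizes (in GB) copied from the Model Files table.
MODEL_FILES = {
    "unet/diffusion_pytorch_model.safetensors": 3.05,
    "image_encoder/model.fp16.safetensors": 1.26,
    "vae/diffusion_pytorch_model.safetensors": 0.196,  # listed as 196 MB
}

def total_size_gb(files: dict[str, float]) -> float:
    """Approximate download size for planning disk space."""
    return round(sum(files.values()), 3)

def download_weights(repo_id: str = "AEmotionStudio/NormalCrafter") -> dict[str, str]:
    """Download each listed file and return {filename: local cache path}."""
    # Imported lazily so the size estimate works without the package installed.
    from huggingface_hub import hf_hub_download
    return {
        name: hf_hub_download(repo_id=repo_id, filename=name)
        for name in MODEL_FILES
    }

approx_total = total_size_gb(MODEL_FILES)  # roughly 4.5 GB in total
```

Calling `download_weights()` stores the files in the local Hugging Face cache, so repeated calls do not download them again.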

## Usage in ComfyUI-FFMPEGA

NormalCrafter is available as:

- **Standalone skill**: `normalcrafter` in the FFMPEGA agent
- **No-LLM mode**: Select `normalcrafter` in the agent node dropdown
- **AI Relighting**: Enable "Use NormalCrafter" in the Video Editor's Relight panel for physically-based relighting
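
To illustrate why per-pixel normals are useful for relighting, the sketch below applies simple Lambertian (diffuse) shading to a normal map. This is a generic textbook model and not ComfyUI-FFMPEGA's actual Relight implementation; the function name `lambert_shade`, the `ambient` parameter, and the assumption that the light direction is expressed in the same coordinate space as the normals are all illustrative choices.

```python
import numpy as np

def lambert_shade(normals: np.ndarray, light_dir, ambient: float = 0.1) -> np.ndarray:
    """normals: (H, W, 3) unit vectors; light_dir points from the surface toward the light.

    Returns an (H, W) shading map in [0, 1].
    """
    l = np.asarray(light_dir, dtype=np.float32)
    l = l / np.linalg.norm(l)
    # Diffuse term: cosine of the angle between normal and light, clamped at zero.
    diffuse = np.clip(normals @ l, 0.0, 1.0)
    return np.clip(ambient + (1.0 - ambient) * diffuse, 0.0, 1.0)

normals = np.zeros((1, 2, 3), dtype=np.float32)
normals[0, 0] = (0.0, 0.0, 1.0)   # faces the light directly
normals[0, 1] = (1.0, 0.0, 0.0)   # perpendicular to the light
shading = lambert_shade(normals, light_dir=(0.0, 0.0, 1.0))
```

Multiplying each video frame's colour by such a shading map, frame by frame, gives a basic relit result; temporally consistent normals keep that shading from flickering.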

## Citation

```bibtex
@article{normalcrafter2024,
  title={NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors},
  author={Yanrui Bin and Wenbo Hu and Haoyuan Wang and Xinya Chen and Bing Wang},
  year={2024}
}
```

## License

- **Model weights** (this repo): **Apache-2.0**, matching the upstream [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter) Hugging Face repo. See [LICENSE](LICENSE).
- **Source code**: **MIT**, as published at [Binyr/NormalCrafter](https://github.com/Binyr/NormalCrafter) on GitHub.

Both licenses are permissive and allow commercial use.

## Links

- **Paper and code**: [Binyr/NormalCrafter](https://github.com/Binyr/NormalCrafter)
- **Upstream weights**: [Yanrui95/NormalCrafter](https://huggingface.co/Yanrui95/NormalCrafter)
- **ComfyUI-FFMPEGA**: [AEmotionStudio/ComfyUI-FFMPEGA](https://github.com/AEmotionStudio/ComfyUI-FFMPEGA)