ModelTC / LightX2V
Light Video Generation Inference Framework
LightX2V is an advanced lightweight video generation inference framework engineered to deliver efficient, high-performance video synthesis solutions. This unified platform integrates multiple state-of-the-art video generation techniques, supporting diverse generation tasks including text-to-video (T2V) and image-to-video (I2V). X2V represents the transformation of different input modalities (X, such as text or images) into video output (V).
🌐 Try it online now! Experience LightX2V without installation: LightX2V Online Service - Free, lightweight, and fast AI digital human video generation platform.
👋 Join us on WeChat.
December 25, 2025: 🚀 Supported deployment on AMD ROCm and Ascend 910B.
December 23, 2025: 🚀 We have supported the Qwen-Image-Edit-2511 image editing model since Day 0. On a single H100 GPU, LightX2V delivers approximately 1.4× speedup, and it supports CFG parallelism, Ulysses parallelism, and efficient offloading. Our HuggingFace page has been updated with CFG/step-distilled LoRA and FP8 weights; usage examples can be found in the Python scripts. Combined with 4-step CFG/step distillation and the FP8 model, LightX2V reaches up to approximately 42× acceleration. Feel free to try the LightX2V Online Service with the Image to Image task and the Qwen-Image-Edit-2511 model.
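The official Qwen-Image-Edit-2511 scripts live in the repository; as a rough orientation only, an edit call might look like the sketch below. It reuses the LightX2VPipeline API from the Wan example later in this README, and the model_cls, task, and generation parameters are assumptions rather than verified values.

```python
# Hypothetical sketch: Qwen-Image-Edit-2511 editing with LightX2V.
# The model_cls / task strings and parameter values are assumptions, not the
# contents of the official example scripts; see examples/ in the repository.
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Qwen-Image-Edit-2511",
    model_cls="qwen_image_edit_2511",  # assumed identifier
    task="i2i",                        # image-to-image editing task (assumed)
)

# With the 4-step CFG/step-distilled LoRA, a short schedule without CFG is assumed
pipe.create_generator(
    infer_steps=4,
    guidance_scale=1.0,
)

pipe.generate(
    seed=42,
    image_path="/path/to/input.jpg",
    prompt="Replace the background with a snowy mountain at dusk.",
    save_result_path="/path/to/save_results/edited.png",
)
```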
December 22, 2025: 🚀 Added Wan2.1 NVFP4 quantization-aware 4-step distilled models; weights are available on HuggingFace: Wan-NVFP4.
December 15, 2025: 🚀 Supported deployment on Hygon DCU.
December 4, 2025: 🚀 Supported GGUF format model inference & deployment on Cambricon MLU590/MetaX C500.
November 24, 2025: 🚀 We released 4-step distilled models for HunyuanVideo-1.5! These models enable ultra-fast 4-step inference without CFG requirements, achieving approximately 25x speedup compared to standard 50-step inference. Both base and FP8 quantized versions are now available: Hy1.5-Distill-Models.
November 21, 2025: 🚀 We have supported the HunyuanVideo-1.5 video generation model since Day 0. With the same number of GPUs, LightX2V achieves a speedup of over 2× and supports deployment on GPUs with less memory (such as the 24GB RTX 4090). It also supports CFG/Ulysses parallelism, efficient offloading, TeaCache/MagCache, and more. We will soon add more models to our HuggingFace page, including step-distillation, VAE-distillation, and other related models. Quantized models and lightweight VAE models are already available: Hy1.5-Quantized-Models for quantized inference, and LightTAE for HunyuanVideo-1.5 for fast VAE decoding. Refer to this for usage tutorials, or check out the examples directory for code examples.
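For orientation, a HunyuanVideo-1.5 run through the same LightX2VPipeline API used in the Wan example below might look like the following sketch; the model_cls string and the generation settings are assumptions, so prefer the scripts in the examples directory and the linked tutorial.

```python
# Hypothetical sketch: HunyuanVideo-1.5 text-to-video with LightX2V.
# model_cls and all generation settings below are assumptions; the examples/
# directory contains the authoritative scripts.
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/HunyuanVideo-1.5",
    model_cls="hunyuan_video_1.5",  # assumed identifier
    task="t2v",
)

# Offloading keeps peak VRAM low enough for 24GB cards such as the RTX 4090
pipe.enable_offload(
    cpu_offload=True,
    offload_granularity="block",
    text_encoder_offload=True,
    vae_offload=False,
)

pipe.create_generator(
    attn_mode="sage_attn2",
    infer_steps=50,
    height=480,
    width=832,
    num_frames=81,
)

pipe.generate(
    seed=42,
    prompt="A corgi sprinting across a sunlit meadow, cinematic lighting.",
    save_result_path="/path/to/save_results/output.mp4",
)
```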
Performance on H100:
| Framework | GPUs | Step Time | Speedup |
|---|---|---|---|
| Diffusers | 1 | 9.77s/it | 1x |
| xDiT | 1 | 8.93s/it | 1.1x |
| FastVideo | 1 | 7.35s/it | 1.3x |
| SGL-Diffusion | 1 | 6.13s/it | 1.6x |
| LightX2V | 1 | 5.18s/it | 1.9x 🚀 |
| FastVideo | 8 | 2.94s/it | 1x |
| xDiT | 8 | 2.70s/it | 1.1x |
| SGL-Diffusion | 8 | 1.19s/it | 2.5x |
| LightX2V | 8 | 0.75s/it | 3.9x 🚀 |
Performance on RTX 4090D:
| Framework | GPUs | Step Time | Speedup |
|---|---|---|---|
| Diffusers | 1 | 30.50s/it | 1x |
| FastVideo | 1 | 22.66s/it | 1.3x |
| xDiT | 1 | OOM | OOM |
| SGL-Diffusion | 1 | OOM | OOM |
| LightX2V | 1 | 20.26s/it | 1.5x 🚀 |
| FastVideo | 8 | 15.48s/it | 1x |
| xDiT | 8 | OOM | OOM |
| SGL-Diffusion | 8 | OOM | OOM |
| LightX2V | 8 | 4.75s/it | 3.3x 🚀 |
Effect of CFG and FP8 configurations:
| Framework | GPU | Configuration | Step Time | Speedup |
|---|---|---|---|---|
| LightX2V | H100 | 8 GPUs + cfg | 0.75s/it | 1x |
| LightX2V | H100 | 8 GPUs + no cfg | 0.39s/it | 1.9x |
| LightX2V | H100 | 8 GPUs + no cfg + fp8 | 0.35s/it | 2.1x 🚀 |
| LightX2V | 4090D | 8 GPUs + cfg | 4.75s/it | 1x |
| LightX2V | 4090D | 8 GPUs + no cfg | 3.13s/it | 1.5x |
| LightX2V | 4090D | 8 GPUs + no cfg + fp8 | 2.35s/it | 2.0x 🚀 |
Note: All the above performance data were measured on Wan2.1-I2V-14B-480P (40 steps, 81 frames). In addition, we also provide 4-step distilled models on our HuggingFace page.
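To connect the configuration table above with the API shown in the example below, here is a hedged sketch of a low-latency run; the distilled checkpoint path and the idea that setting guidance_scale to 1.0 skips the CFG pass are assumptions, so check the examples directory for the exact switches.

```python
# Hypothetical sketch: 4-step distilled Wan2.1 I2V without CFG.
# The checkpoint path and all parameter values are assumptions for illustration.
from lightx2v import LightX2VPipeline

pipe = LightX2VPipeline(
    model_path="/path/to/Wan2.1-I2V-14B-480P-StepDistill",  # assumed distilled weights
    model_cls="wan2.1",
    task="i2v",
)

pipe.create_generator(
    infer_steps=4,       # distilled models need only 4 steps
    guidance_scale=1.0,  # assumed way to drop the extra CFG forward pass
    height=480,
    width=832,
    num_frames=81,
)

pipe.generate(
    seed=42,
    image_path="/path/to/img_0.jpg",
    prompt="A white cat wearing sunglasses surfing at a sunny beach.",
    save_result_path="/path/to/save_results/output.mp4",
)
```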
For comprehensive usage instructions, please refer to our documentation: English Docs | 中文文档
We highly recommend using Docker, as it is the simplest and fastest way to set up the environment. For details, please refer to the Quick Start section in the documentation.
pip install -v git+https://github.com/ModelTC/LightX2V.git

Or install from source:

git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .  # or: pip install -v .

For attention operator installation, please refer to our documentation: English Docs | 中文文档
# examples/wan/wan_i2v.py
"""
Wan2.2 image-to-video generation example.
This example demonstrates how to use LightX2V with Wan2.2 model for I2V generation.
"""
from lightx2v import LightX2VPipeline
# Initialize pipeline for Wan2.2 I2V task
# For wan2.1, use model_cls="wan2.1"
pipe = LightX2VPipeline(
model_path="/path/to/Wan2.2-I2V-A14B",
model_cls="wan2.2_moe",
task="i2v",
)
# Alternative: create generator from config JSON file
# pipe.create_generator(
# config_json="configs/wan22/wan_moe_i2v.json"
# )
# Enable offloading to significantly reduce VRAM usage with minimal speed impact
# Suitable for RTX 30/40/50 consumer GPUs
pipe.enable_offload(
cpu_offload=True,
offload_granularity="block", # For Wan models, supports both "block" and "phase"
text_encoder_offload=True,
image_encoder_offload=False,
vae_offload=False,
)
# Create generator manually with specified parameters
pipe.create_generator(
attn_mode="sage_attn2",
infer_steps=40,
height=480, # Can be set to 720 for higher resolution
width=832, # Can be set to 1280 for higher resolution
num_frames=81,
guidance_scale=[3.5, 3.5], # For wan2.1, guidance_scale is a scalar (e.g., 5.0)
sample_shift=5.0,
)
# Generation parameters
seed = 42
prompt = "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
negative_prompt = "镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
image_path="/path/to/img_0.jpg"
save_result_path = "/path/to/save_results/output.mp4"
# Generate video
pipe.generate(
seed=seed,
image_path=image_path,
prompt=prompt,
negative_prompt=negative_prompt,
save_result_path=save_result_path,
)

NVFP4 (quantization-aware 4-step) resources:
- examples/wan/wan_i2v_nvfp4.py (I2V) and examples/wan/wan_t2v_nvfp4.py (T2V)
- lightx2v_kernel/README.md

💡 More Examples: For more usage examples including quantization, offloading, caching, and other advanced configurations, please refer to the examples directory.
🔔 Follow our HuggingFace page for the latest model releases from our team.
💡 Refer to the Model Structure Documentation to quickly get started with LightX2V.
We provide multiple frontend interface deployment options:
💡 Recommended Solutions:
- w8a8-int8, w8a8-fp8, w4a4-nvfp4, and other quantization strategies

We maintain code quality through automated pre-commit hooks to ensure consistent formatting across the project.
Tip
Setup Instructions:
pip install ruff pre-commit
pre-commit run --all-files

We appreciate your contributions to making LightX2V better!
We extend our gratitude to all the model repositories and research communities that inspired and contributed to the development of LightX2V. This framework builds upon the collective efforts of the open-source community.
If you find LightX2V useful in your research, please consider citing our work:
@misc{lightx2v,
author = {LightX2V Contributors},
title = {LightX2V: Light Video Generation Inference Framework},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ModelTC/lightx2v}},
}

For questions, suggestions, or support, please feel free to reach out through: