VACE
Video AI Creation & Editing


An all-in-one video creation and editing AI model jointly developed by Alibaba, Tongyi Lab, and the Wan team


Introduction to VACE


VACE is an all-in-one AI model jointly developed by Alibaba, Tongyi Lab, and the Wan team, specifically designed for video creation and editing.

It supports multiple tasks, including:

  • Reference-to-Video Generation (R2V)

  • Video-to-Video Editing (V2V)

  • Masked Video-to-Video Editing (MV2V)

What makes VACE unique is that users can freely combine these tasks to explore more creative possibilities and simplify workflows.

Powerful Features

VACE offers a series of "Anything" features to meet various video creation and editing needs


Move-Anything

Freely move objects in videos while maintaining natural visual effects


Swap-Anything

Replace objects in videos while maintaining consistency in motion and context


Reference-Anything

Generate videos based on reference images while maintaining style and content consistency


Expand-Anything

Expand video field of view, adding reasonable and coherent additional content


Animate-Anything

Bring static content to life with vivid animation effects, creating engaging videos

Powerful Underlying Technology

VACE utilizes Diffusion Transformer technology to generate and edit high-quality videos while maintaining consistency between temporal and spatial dynamics.

This unified approach simplifies user workflows, reduces the need for multiple separate tools, and improves overall efficiency in the video creation and editing process.
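VACE's actual sampling code is not reproduced here, but the core idea of diffusion-based generation is an iterative denoising loop: start from noise and repeatedly subtract a predicted noise component. The toy sketch below is purely illustrative (the function name, the stand-in noise prediction, and the 1/t step schedule are all assumptions, not VACE's real implementation, which would invoke a Diffusion Transformer at each step):

```python
def toy_denoise(latent, steps=30):
    """Toy diffusion-style sampling loop (illustrative only).

    Starts from a noisy latent and repeatedly subtracts a fraction of
    the predicted noise. A real sampler would call the Diffusion
    Transformer at each step; here the prediction is faked as the
    current value itself so the example stays self-contained.
    """
    x = list(latent)
    for t in range(steps, 0, -1):
        # Real code: eps = model(x, t, text_condition, ...)
        eps = list(x)                 # stand-in noise prediction
        alpha = 1.0 / t               # toy step-size schedule
        x = [v - alpha * e for v, e in zip(x, eps)]
    return x

noisy = [0.9, -1.7, 2.4]      # pretend latent values for one frame
clean = toy_denoise(noisy)    # magnitudes shrink toward zero each step
```

The point of the loop structure is that temporal and spatial consistency is enforced inside the model call at every step, which is what lets a single unified pipeline cover generation and editing alike.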


Integration with Wan 2.1

VACE's deep integration with Wan 2.1 enhances functionality for specific video editing tasks


Control Workflow

Uses Wan 2.1's precise video control capabilities to implement advanced features such as pose control


Prompt-Based Object Replacement

Replace objects in videos through simple text prompts, such as changing a lemon into an apple


Reference Image Replacement

Use reference images to replace objects in videos, maintaining style consistency and contextual integration


Powerful Workflow Example

In the MimicPC workflow, users can enter a prompt in the "WanVideo TextEncode" node to replace objects in the video.

Additionally, enabling the "WanVideo TeaCache" node can accelerate video generation, though it may reduce video quality.

Users can adjust parameters such as width, height, frame rate, and number of frames to customize video resolution and length. Community discussions suggest setting Step=30 for good 2D video effects and Step=50 for clearer real-person facial textures.
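Those parameters can be gathered into a single settings object before queueing a generation. The sketch below is a hypothetical illustration, not the real workflow's schema: the node names mirror the ones mentioned above, the default size and frame count follow the 1.3B model's 81 x 480 x 832 working range from the table below, and the fps field is an assumption:

```python
# Hypothetical sketch of the generation settings described above.
# Node and field names are placeholders, not the actual workflow schema.

def make_settings(width=832, height=480, fps=16, frames=81, steps=30):
    """Bundle video parameters; steps=30 suits 2D looks, 50 real faces."""
    if width % 8 or height % 8:
        raise ValueError("width/height should be multiples of 8 for most latents")
    return {
        "WanVideo TextEncode": {"prompt": "Change lemon to apple"},
        "WanVideo TeaCache": {"enabled": True},   # faster, may cost quality
        "video": {"width": width, "height": height,
                  "fps": fps, "frames": frames, "steps": steps},
    }

settings = make_settings(steps=50)   # clearer real-person facial textures
```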

Technical Details

VACE supports inputs of any resolution, but optimal results are achieved within specific video size ranges

Available Models

Model                    | Download Link | Video Size        | License
VACE-Wan2.1-1.3B-Preview | —             | ~ 81 x 480 x 832  | Apache-2.0
VACE-LTX-Video-0.9       | —             | ~ 97 x 512 x 768  | RAIL-M
Wan2.1-VACE-1.3B         | Coming Soon   | ~ 81 x 480 x 832  | Apache-2.0
Wan2.1-VACE-14B          | Coming Soon   | ~ 81 x 720 x 1080 | Apache-2.0

CLI Commands

Perform end-to-end inference using the command-line interface provided by the official GitHub repository:

python vace/vace_pipeline.py --base wan --task depth --video assets/videos/test.mp4 --prompt 'xxx'

Output will be saved to the ./results/ directory
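To run the same pipeline over several inputs, a small wrapper can assemble the argument list shown above. This is a sketch built only from the one command documented here; the example prompt is illustrative (the original uses a placeholder), and task values other than depth are not confirmed by this page:

```python
def vace_command(task, video, prompt, base="wan"):
    """Build the vace_pipeline.py invocation shown above as an argv list."""
    return ["python", "vace/vace_pipeline.py",
            "--base", base, "--task", task,
            "--video", video, "--prompt", prompt]

cmd = vace_command("depth", "assets/videos/test.mp4", "a sunny street")
# To actually execute (output lands in ./results/):
#   import subprocess; subprocess.run(cmd, check=True)
```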

Gradio Demos

Launch interactive Gradio demos using the following commands:

python vace/gradios/preprocess_demo.py
python vace/gradios/vace_wan_demo.py
python vace/gradios/vace_ltx_demo.py

Community Discussions and Feedback

Discussions on community platforms (such as Reddit) highlight VACE's advanced features, like Pose Control and ControlNets, which offer unique advantages compared to other models (like Hunyuan).

User comments like "ControlNets for videos? Awesome!" reflect excitement about its potential for precise video editing.

The community is also anticipating its open-source release and drawing comparisons with platforms such as Pika Labs.


"This looks so cool!"

Reddit User


"ControlNets for videos? Awesome!"

Community Member


"At this rate, I'm gonna be the star of all the classics in a year or 2. $1.99 matinee fee!"

Tech Enthusiast

Additional Resources

Explore more VACE-related resources to understand its features and applications


Official VACE Page

Provides examples and demonstrations, such as video re-rendering with content, structure, subject, posture, and motion preservation

Visit Official Page

Hugging Face Collection

Provides additional models and resources, supporting different application scenarios and tasks

Browse Hugging Face Collection

ModelScope Collection

Provides models and resources in a Chinese environment, suitable for Chinese users

Browse ModelScope Collection

VACE Benchmark

Provides datasets and tools for evaluating video generation and editing quality

Explore Benchmark

VACE Annotators

Provides tools for video annotation and data preparation, supporting model training and evaluation

Get Annotation Tools

YouTube Tutorials

Community-driven YouTube tutorials demonstrating how to use VACE with tools like ComfyUI

Watch Tutorial Videos