
Wan 2.7 Image is the standard text-to-image model with faster generation speed. Supports up to 2K resolution, thinking mode, sequential generation, and custom color themes. Does not support 4K.

Wan 2.7 Image Pro is the professional text-to-image model supporting up to 4K resolution, thinking mode for enhanced reasoning, sequential multi-image generation, and custom color themes. Supports Chinese and English prompts up to 5000 characters.

Wan 2.7 Image Edit is the standard image editing model with faster generation speed. Supports multi-image reference, interactive editing, sequential generation, and custom color themes. Maximum output resolution is 2K; 4K is not supported.

Wan 2.7 Image Pro Edit is the professional image editing model supporting multi-image reference generation, interactive bounding-box editing, sequential multi-image generation, and custom color themes. Supports 1-9 input images, max 2K output.
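
The 1-9 input image limit can be checked on the client before a request is sent. A minimal sketch, assuming the input images are held in a Python list; the helper name check_input_images is hypothetical and not part of any Wan SDK.

```python
# Bounds taken from the description above (1-9 input images).
MIN_INPUT_IMAGES, MAX_INPUT_IMAGES = 1, 9

def check_input_images(images: list) -> None:
    """Raise ValueError if the number of input images is outside 1-9."""
    count = len(images)
    if not MIN_INPUT_IMAGES <= count <= MAX_INPUT_IMAGES:
        raise ValueError(f"expected 1-9 input images, got {count}")
```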

The Wan 2.7 Reference-to-Video model generates videos from reference media (images and videos) and a text prompt. Requires at least one media parameter (reference_images, reference_videos, or first_frame_image). Supports multi-subject references, storyboard generation, and custom audio for voice cloning.
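
The "at least one media parameter" rule can be expressed as a small pre-flight check. A minimal sketch, assuming the request is a plain dict keyed by the three parameter names listed above; the helper has_reference_media is hypothetical and not part of any official client.

```python
# Parameter names taken from the description above.
MEDIA_PARAMS = ("reference_images", "reference_videos", "first_frame_image")

def has_reference_media(request: dict) -> bool:
    """Return True if at least one media parameter is present and non-empty."""
    return any(request.get(name) for name in MEDIA_PARAMS)
```

An empty list or a missing key both count as "not supplied" here, which is an assumption about how the service treats empty values.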

The Wan 2.7 Video Editing model supports video style modification and video editing with multimodal inputs (text, image, and video). Processing time is 1-5 minutes.

The Wan 2.7 text-to-video model generates high-quality videos from text prompts using the new protocol, which specifies resolution and ratio instead of size. Supports multi-shot narrative, automatic dubbing, custom audio, 720P and 1080P resolutions, and durations of 2 to 15 seconds.
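
The limits stated above can be sketched as a client-side validator. This is an illustration under assumptions: the field names resolution, ratio, and size follow the wording of the description, duration is an assumed name, and the function validate_t2v_request is hypothetical. Fields are checked only when present, since the description does not say which ones are required.

```python
# Limits taken from the description above.
ALLOWED_RESOLUTIONS = {"720P", "1080P"}
MIN_DURATION_S, MAX_DURATION_S = 2, 15

def validate_t2v_request(request: dict) -> list[str]:
    """Collect problems found in a request; an empty list means none were found."""
    problems = []
    # The new protocol replaces the legacy size field with resolution and ratio.
    if "size" in request:
        problems.append("use resolution and ratio instead of size")
    if "resolution" in request and request["resolution"] not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720P or 1080P")
    if "duration" in request:
        duration = request["duration"]  # assumed field name, in seconds
        if not isinstance(duration, (int, float)) or not (
            MIN_DURATION_S <= duration <= MAX_DURATION_S
        ):
            problems.append("duration must be between 2 and 15 seconds")
    return problems
```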

The Wan 2.7 image-to-video model supports three task modes with flattened media parameters: first-frame generation, first-last-frame generation, and continuation. The combination of media parameters supplied must correspond to one of the allowed modes.
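
One way to picture the mode rule is a lookup from the set of supplied media parameters to a task mode. The sketch below is purely illustrative: the parameter names first_frame, last_frame, and source_video, and the combinations mapped to each mode, are assumptions made for this example and are not taken from the model's actual parameter list.

```python
# Hypothetical parameter names and combinations, for illustration only.
MODE_BY_PARAMS = {
    frozenset({"first_frame"}): "first-frame",
    frozenset({"first_frame", "last_frame"}): "first-last-frame",
    frozenset({"source_video"}): "continuation",
}
# Every media key that appears in at least one allowed combination.
MEDIA_KEYS = frozenset().union(*MODE_BY_PARAMS)

def infer_mode(request: dict) -> str:
    """Map the supplied media parameters to a task mode, or raise ValueError."""
    supplied = frozenset(key for key in MEDIA_KEYS if request.get(key))
    try:
        return MODE_BY_PARAMS[supplied]
    except KeyError:
        raise ValueError(
            f"unsupported media combination: {sorted(supplied)}"
        ) from None
```

Keeping the allowed combinations in a single table means a rejected request can report exactly which parameters were supplied, rather than failing later on the server.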