AI Model Family

alibaba/wan2.7-image

text-to-image

Wan 2.7 Image is the standard text-to-image model, tuned for faster generation. It supports output up to 2K resolution (4K is not supported), thinking mode, sequential generation, and custom color themes.

$0.0260/img
alibaba/wan2.7-image-pro

text-to-image

Wan 2.7 Image Pro is the professional text-to-image model. It supports output up to 4K resolution, thinking mode for enhanced reasoning, sequential multi-image generation, and custom color themes. Prompts may be in Chinese or English, up to 5000 characters.

$0.0700/img
alibaba/wan2.7-image-edit

image-to-image

Wan 2.7 Image Edit is the standard image editing model, tuned for faster generation. It supports multi-image reference, interactive editing, sequential generation, and custom color themes. Output is limited to 2K; 4K is not supported.

$0.0260/img
alibaba/wan2.7-image-pro-edit

image-to-image

Wan 2.7 Image Pro Edit is the professional image editing model. It supports multi-image reference generation, interactive bounding-box editing, sequential multi-image generation, and custom color themes. It accepts 1 to 9 input images and produces output up to 2K.

$0.0700/img
alibaba/wan2.7-r2v

video-to-video

The Wan 2.7 Reference-to-Video model generates videos from reference media (images and videos) guided by a prompt. Each request must include at least one media parameter: reference_images, reference_videos, or first_frame_image. The model supports multi-subject references, storyboard generation, and custom audio for voice cloning.

$0.1000~$0.1400/sec
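
The media requirement for alibaba/wan2.7-r2v can be checked on the client before a request is sent. The following is a minimal sketch, not the provider's own validation: the three parameter names come from the description above, while the dictionary payload shape and the helper names has_required_media and check_r2v_payload are assumptions made for illustration.

```python
# Media parameter names as listed in the model description.
MEDIA_PARAMS = ("reference_images", "reference_videos", "first_frame_image")


def has_required_media(payload: dict) -> bool:
    """Return True if at least one media parameter is present and non-empty."""
    # The dict-shaped payload is a hypothetical request body.
    return any(payload.get(name) for name in MEDIA_PARAMS)


def check_r2v_payload(payload: dict) -> None:
    """Raise ValueError for a payload that carries no reference media."""
    if not has_required_media(payload):
        raise ValueError(
            "wan2.7-r2v needs at least one of: " + ", ".join(MEDIA_PARAMS)
        )
```

An empty list or a missing key both count as "no media" here, so a payload such as {"prompt": "...", "reference_images": []} is rejected.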
alibaba/wan2.7-videoedit

video-to-video

The Wan 2.7 Video Editing model supports video style modification and video editing with multimodal inputs (text, image, and video). Processing time is 1 to 5 minutes.

$0.1000~$0.1400/sec
alibaba/wan2.7-t2v

text-to-video

The Wan 2.7 text-to-video model generates high-quality videos from text prompts. It uses the new request protocol, in which output dimensions are given as resolution and ratio instead of size. It supports multi-shot narrative, automatic dubbing, custom audio, 720P and 1080P resolutions, and durations of 2–15 seconds.

$0.0650~$0.0950/sec
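
The stated limits for alibaba/wan2.7-t2v (720P or 1080P, 2 to 15 seconds) lend themselves to a small pre-flight check. This is a sketch under assumptions: the function name validate_t2v_options and the argument names are hypothetical, and the listing does not say how the real API spells these fields or reports errors.

```python
# Limits taken from the model description.
ALLOWED_RESOLUTIONS = {"720P", "1080P"}
MIN_DURATION_S, MAX_DURATION_S = 2, 15


def validate_t2v_options(resolution: str, duration: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(
            f"resolution {resolution!r} is not one of {sorted(ALLOWED_RESOLUTIONS)}"
        )
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        problems.append(
            f"duration {duration}s is outside {MIN_DURATION_S}-{MAX_DURATION_S}s"
        )
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller show all issues in a single message.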
alibaba/wan2.7-i2v

image-to-video

The Wan 2.7 image-to-video model supports three task modes with flattened media parameters: first-frame generation, first-last-frame generation, and continuation. The parameters supplied must match one of the allowed mode combinations.

$0.0650~$0.0950/sec
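
The listed prices can be turned into a rough cost estimate. The sketch below copies the per-image and per-second figures from this listing; the function names image_cost and video_cost_range are hypothetical. The listing does not state what moves a video price between the low and high ends of its range, so the code only reports both bounds.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the listing.
PER_IMAGE = {
    "alibaba/wan2.7-image": Decimal("0.0260"),
    "alibaba/wan2.7-image-pro": Decimal("0.0700"),
    "alibaba/wan2.7-image-edit": Decimal("0.0260"),
    "alibaba/wan2.7-image-pro-edit": Decimal("0.0700"),
}

# Per-second price ranges (low, high) in USD, copied from the listing.
PER_SECOND = {
    "alibaba/wan2.7-r2v": (Decimal("0.1000"), Decimal("0.1400")),
    "alibaba/wan2.7-videoedit": (Decimal("0.1000"), Decimal("0.1400")),
    "alibaba/wan2.7-t2v": (Decimal("0.0650"), Decimal("0.0950")),
    "alibaba/wan2.7-i2v": (Decimal("0.0650"), Decimal("0.0950")),
}


def image_cost(model: str, count: int) -> Decimal:
    """Total price for `count` images from an image model."""
    return PER_IMAGE[model] * count


def video_cost_range(model: str, seconds: int) -> tuple[Decimal, Decimal]:
    """Lower and upper price bounds for `seconds` of video."""
    low, high = PER_SECOND[model]
    return low * seconds, high * seconds
```

For example, video_cost_range("alibaba/wan2.7-t2v", 10) gives bounds of $0.65 and $0.95, and image_cost("alibaba/wan2.7-image-pro", 4) gives $0.28. Decimal is used instead of float so that the sums stay exact.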