We've created this guide to help creators better "tame" the Kling AI model through effective prompts and write better "spells" for Kling. Of course, as the model continues to iterate and our understanding deepens, this guide will be regularly updated.
We believe in collaborative improvement and invite all creators to join us in refining these guidelines. If you discover new tips or tricks that work well with the model, please share them with us. Together, we can develop an even better guide for mastering the Kling AI video generator and help everyone create higher-quality content.
Kling AI is a video generation AI model developed by Kuaishou's AI team. It supports multiple capabilities including text-to-video, image-to-video, video extension, camera movement control, and first-last frame control, enabling users to create artistic videos easily and efficiently.
Kling AI can generate 5- or 10-second videos from text descriptions. It offers two generation modes: Standard and High Quality.
The model supports three aspect ratios: 16:9, 9:16, and 1:1 to meet diverse video creation needs.
Prompts are the primary language for communicating with text-to-video AI models like Kling, directly determining what kind of video content will be generated. As a next-generation (2.0) AI video model, Kling is constantly evolving through updates and iterations. To maximize its potential and create better AI videos, we need to continually explore and master effective prompt writing techniques.
Kling AI uses the following formula:
Prompt = Subject (Subject Description) + Movement of Subject + Scene (Scene Description) + (Camera Language + Lighting + Atmosphere)
Note: Elements in parentheses are optional
| Component | Description |
|---|---|
| Subject | The main object in the video, representing the core theme |
| Subject Description (optional) | Detailed description of the subject's features and posture; short phrases work well |
| Movement of Subject | Description of the subject's movement state, whether static or dynamic |
| Scene | The environment where the subject is located, including foreground and background |
| Scene Description (optional) | Detailed description of the environment; keep it brief |
| Camera Language (optional) | Camera techniques and transitions between shots that convey the story and create specific visual effects and emotional atmospheres (note: this is different from camera movement control) |
| Lighting (optional) | Light and shadow add depth and emotional resonance, helping create richly layered, expressive shots |
| Atmosphere (optional) | The intended mood of the video |
The core components of the formula are Subject, Movement, and Scene, which form the basic structure of a video. To add more detail, you can list multiple descriptive phrases for the subject and scene; the Kling video model will expand on these prompts to generate the video as expected.
For example: "A panda (subject) wearing a small red scarf (subject description) slowly rides a bicycle (movement) down a forest path (scene), with sunlight filtering through the trees (scene description), tracking shot, warm golden-hour light, cheerful and relaxed atmosphere (optional elements)."
The purpose of the formula is to help describe the desired video. Feel free to unleash your creativity and interact with Kling without being limited by the formula—you might get some amazing results!
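To make the assembly concrete, here is a minimal Python sketch of the same formula. The `build_prompt` helper and all of the component values are illustrative; they are not part of Kling itself:

```python
def build_prompt(subject, movement, scene, subject_desc="",
                 scene_desc="", camera="", lighting="", atmosphere=""):
    """Assemble a text-to-video prompt following the formula above.

    Subject, movement, and scene are the required core; the remaining
    arguments are the optional elements shown in parentheses.
    """
    parts = [subject]
    if subject_desc:
        parts.append(subject_desc)
    parts += [movement, scene]
    if scene_desc:
        parts.append(scene_desc)
    # Optional elements: camera language, lighting, atmosphere.
    parts += [p for p in (camera, lighting, atmosphere) if p]
    return ", ".join(parts)

# Illustrative values only -- substitute your own.
print(build_prompt(
    subject="a panda",
    subject_desc="wearing a small red scarf",
    movement="slowly rides a bicycle",
    scene="down a forest path",
    scene_desc="sunlight filtering through the trees",
    camera="tracking shot",
    lighting="warm golden-hour light",
    atmosphere="cheerful and relaxed",
))
```

This prints a single comma-separated prompt, mirroring how the formula's parts read when written by hand.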
By inputting an image, Kling's model generates a 5-second or 10-second video based on it, turning the still image into a dynamic scene. If you also provide a text description along with the image, Kling will use the text to guide how the image is animated. It supports both "Standard" and "High Quality" generation modes, as well as three aspect ratios: 16:9, 9:16, and 1:1, offering more flexibility for video creation.
Image-to-video is currently the most frequently used feature among creators, and it has enabled a wide range of creative trends.
For image-to-video generation, controlling subject movement is key. Here's the recommended formula:
Prompt = Subject + Movement, Background + Movement
| Component | Description | Notes |
|---|---|---|
| Subject | People, animals, or objects in the image | Main focus of animation |
| Movement | Desired movement trajectory of the subject | Should be physically possible |
| Background | Background elements in the image | Optional movement description |
The core elements are Subject and Movement. Unlike text-to-video, the scene already exists in the image, so you only need to describe the subject's movement and, optionally, any background movement.
For example, if you want "Mona Lisa to wear sunglasses", describe the action rather than the finished state: a prompt like "Mona Lisa puts on a pair of sunglasses" gives the model a clear movement to animate, while "Mona Lisa wearing sunglasses" describes a static state and may yield little motion.
Note: When the model recognizes an image as a painting, it might generate a gallery-style video tour. This is why photos with frames often result in static videos (avoid uploading framed images).
The formula serves as a guide to improve video generation success rates, but creativity comes from exploration. Feel free to experiment and communicate boldly with Kling!
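If the programmatic view helps, here is the matching sketch for the image-to-video formula; the function name and example values are again illustrative only:

```python
def build_i2v_prompt(subject, subject_motion,
                     background=None, background_motion=None):
    """Assemble an image-to-video prompt: Subject + Movement, Background + Movement.

    The scene already exists in the uploaded image, so the prompt only
    describes motion; the background clause is optional.
    """
    prompt = f"{subject} {subject_motion}"
    if background and background_motion:
        prompt += f", {background} {background_motion}"
    return prompt

# Illustrative example: animate the subject, keep the background calm.
print(build_i2v_prompt("Mona Lisa", "puts on a pair of sunglasses",
                       "the background", "remains still"))
```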
Extend your AI-generated videos by 4-5 seconds, with support for multiple extensions (up to 3 minutes total length). Fine-tune the results using customized prompts.
The video extension feature can be found in the lower-left tab after video generation and offers two modes:
1. Automatic Extension: the model continues the video on its own, with no additional prompt required.
2. Custom Creative Extension: you steer the extension with your own prompt.
To ensure the best results with custom extensions, use this formula:
Prompt = Subject + Movement
| Component | Description | Notes |
|---|---|---|
| Subject | The main element you want to keep animating in the video being extended | Select a single subject for best results; choose clear, identifiable elements; focus on prominent objects/characters |
| Movement | The desired movement trajectory for your target subject | Define clear directional movement; keep movements simple and realistic; specify the exact type of motion desired |
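Because each extension adds only a few seconds, it can help to estimate how many rounds a longer cut will take. Here is a rough planning sketch based on the numbers stated above; the 4.5-second midpoint is an assumption, and the helper is not part of Kling:

```python
import math

MAX_TOTAL_SECONDS = 180      # 3-minute total-length cap stated in this guide
SECONDS_PER_EXTENSION = 4.5  # midpoint of the 4-5 s range (assumption)

def extensions_needed(current_seconds, target_seconds):
    """Estimate how many extension rounds reach the target length."""
    target = min(target_seconds, MAX_TOTAL_SECONDS)  # cannot exceed the cap
    if target <= current_seconds:
        return 0
    return math.ceil((target - current_seconds) / SECONDS_PER_EXTENSION)

print(extensions_needed(10, 60))  # a 10 s clip to ~60 s: 12 extensions
```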
Both supported versions of the model offer six fundamental camera motions: Horizontal, Vertical, Zoom, Pan, Tilt, and Roll.
Kling 1.0 (Text-to-Video) exclusively offers four master-level camera movements: Move left and zoom in, Move right and zoom in, Move forward and zoom up, Move down and zoom out.
Camera movement control is an essential element of cinematographic language. To enhance video creation versatility and give creators finer control over camera movements, the platform has implemented the camera movement options described above.
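For quick reference, the motion names can be collected into a small lookup. The Python structure below is purely illustrative; only the motion names come from the feature list above:

```python
from enum import Enum

class CameraMotion(Enum):
    """Six fundamental camera motions supported by both versions."""
    HORIZONTAL = "horizontal"
    VERTICAL = "vertical"
    ZOOM = "zoom"
    PAN = "pan"
    TILT = "tilt"
    ROLL = "roll"

# Master-level movements exclusive to Kling 1.0 text-to-video.
MASTER_MOVEMENTS = [
    "Move left and zoom in",
    "Move right and zoom in",
    "Move forward and zoom up",
    "Move down and zoom out",
]
```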
The start-end frame feature allows users to upload two images that serve as the first and last frames of a generated video. Users can access this feature by clicking "Add End Frame" in the top-right corner of the "Image-to-Video" function.
This feature enables more precise control over video generation, particularly useful for creators who need specific control over their video's opening and closing frames. The system effectively creates dynamic transitions between the specified frames. However, it's important to note that the content of both start and end frames should be relatively similar - significant differences between frames may result in an unintended scene transition.
This feature significantly enhances creative control in AI video generation, allowing creators to precisely define their video's beginning and end points while letting the AI handle the intermediate transition frames effectively.
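Since mismatched frames can trigger an unwanted scene transition, a quick local sanity check before uploading can save a generation. Below is a rough sketch using Pillow; the downscale size and the 60.0 threshold are arbitrary assumptions, not values published by Kling:

```python
import math
from PIL import Image, ImageChops

def frames_are_similar(path_a, path_b, threshold=60.0):
    """Return True if two images are roughly similar (RMS pixel difference)."""
    a = Image.open(path_a).convert("L").resize((64, 64))
    b = Image.open(path_b).convert("L").resize((64, 64))
    diff = ImageChops.difference(a, b)
    rms = math.sqrt(sum(p * p for p in diff.getdata()) / (64 * 64))
    return rms < threshold

if not frames_are_similar("start.png", "end.png"):
    print("Warning: frames differ a lot; the result may contain a scene cut.")
```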
The Motion Brush function lets you upload any image and select a specific area or object in it using either “auto-selection” or “smear” tools. You can then add a motion path and type in a simple prompt (like “subject + motion,” e.g., “dog running on the road”).
After clicking “Generate,” the model creates a video showing the selected object moving as specified. This gives you more control over how things move in advanced image-to-video generation.
It’s a powerful tool for creating custom movements that are hard to pull off in regular image-to-video tools, like “ball sports” or “people/animals turning and walking along a path.”
You can set up to 6 objects and their motion paths at once. Plus, there’s a new “Static Brush” feature: smear it on an area, and the model locks those pixels in place, preventing any camera movement. If you don’t want the motion path to accidentally shift the camera, smear the Static Brush at the bottom of the image.
The Motion Brush gives you next-level control over image-to-video generation. Whether it’s making a ball roll, a person turn, or an animal stroll along a custom route, it’s got you covered.
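To make the moving parts concrete, here is a hypothetical data-structure sketch of a Motion Brush setup. None of these class or field names come from Kling; they simply mirror the options described above (up to 6 motion paths plus Static Brush regions):

```python
from dataclasses import dataclass, field

@dataclass
class BrushStroke:
    """One selected object and its motion (hypothetical structure)."""
    area: str                    # selected via "auto-selection" or "smear"
    path: list[tuple[int, int]]  # motion path as image coordinates
    prompt: str                  # simple "subject + motion" text

@dataclass
class MotionBrushConfig:
    strokes: list[BrushStroke] = field(default_factory=list)  # up to 6 objects
    static_areas: list[str] = field(default_factory=list)     # Static Brush regions

    def add_stroke(self, stroke: BrushStroke) -> None:
        if len(self.strokes) >= 6:  # limit stated in this guide
            raise ValueError("Motion Brush supports at most 6 objects")
        self.strokes.append(stroke)

cfg = MotionBrushConfig()
cfg.add_stroke(BrushStroke(area="dog", path=[(120, 300), (400, 300)],
                           prompt="dog running on the road"))
cfg.static_areas.append("bottom of the image")  # lock pixels against camera drift
```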
Kling AI is a powerful video generation model developed by Kuaishou that transforms text and images into high-quality videos. The platform offers multiple key features including text-to-video generation, image animation, video extension, camera movement control, and first-last frame control. Users can create videos in different aspect ratios (16:9, 9:16, and 1:1) with both standard and high-quality generation modes available.
This comprehensive toolkit makes Kling AI an efficient solution for content creators, marketers, and artists, enabling them to produce professional-quality videos without requiring extensive technical expertise. It can significantly enhance efficiency in creating short films, commercial advertisements, music videos, TikTok short videos, and YouTube videos.