With Stable Animation, Stability AI adds animation capabilities to its image models. The video tool is currently only available through a paid API.
Stability AI has announced a development kit for Stable Animation, a new way to create moving images. The model takes input in three different ways:
- with a classic text prompt, as in Stable Diffusion, Midjourney, or DALL-E 2
- with a text prompt and an image as a starting point for the animation
- with a text prompt and a video
The software still seems to be in the experimental stage. Instead of offering it through one of its browser platforms such as DreamStudio or ClipDrop, Stability AI offers only a software development kit (SDK) and a paid API. Of course, this does not prevent third parties from offering the animation model through their own services.
Python scripting required
Because the videos have to be generated with a Python script, the tool is rather complicated to use. As with the image model, numerous parameters can be set, such as steps, sampler, scale, or seed. Features such as outpainting and prompt interpolation are also available.
The chosen parameters also affect the price, so there is no general answer to the question of how much it costs to create a video. Stability AI quotes prices ranging from 3 to 18 US cents per 100 frames, depending on the settings.
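To get a feel for what the quoted range means for a single clip, the following Python sketch scales the 3 to 18 cents per 100 frames linearly with the frame count. The linear scaling and the function name estimate_cost_cents are assumptions made for illustration; actual billing depends on the chosen settings.

```python
from typing import Tuple

# Price range quoted in the article: 3 to 18 US cents per 100 frames.
LOW_CENTS_PER_100_FRAMES = 3.0
HIGH_CENTS_PER_100_FRAMES = 18.0


def estimate_cost_cents(frames: int) -> Tuple[float, float]:
    """Return a rough (low, high) cost estimate in US cents.

    Assumes the price scales linearly with the number of frames,
    which is a simplification and not a published pricing formula.
    """
    if frames < 0:
        raise ValueError("frames must be non-negative")
    low = frames * LOW_CENTS_PER_100_FRAMES / 100
    high = frames * HIGH_CENTS_PER_100_FRAMES / 100
    return low, high


if __name__ == "__main__":
    low, high = estimate_cost_cents(72)
    print(f"72 frames: roughly {low:.2f} to {high:.2f} US cents")
```

Under this assumption, a 72-frame clip would land somewhere between about 2 and 13 US cents.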
Stable Animation is compatible with Stable Diffusion XL
Stable Animation can be combined with every version of Stable Diffusion. The default version is 1.5, but you can also choose to use the new and improved Stable Diffusion XL. There are also many style presets available, from anime to comic book, low poly to pixel art.
The default resolution is 512 x 512 pixels and can be increased to 1,024 x 1,024 pixels. An upscaler can also be used. The default length is 72 frames at 12 frames per second, but according to the documentation, the number of frames can be increased without limit.
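The default values translate directly into clip length: 72 frames at 12 frames per second is a six-second clip. The short Python sketch below shows the arithmetic in both directions; the helper names are made up for this illustration and are not part of the SDK.

```python
import math


def clip_duration_seconds(frames: int, fps: float) -> float:
    """Length of a clip in seconds for a given frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


def frames_for_duration(seconds: float, fps: float) -> int:
    """Number of frames needed to fill a clip of the given length."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return math.ceil(seconds * fps)


if __name__ == "__main__":
    # Defaults mentioned in the article: 72 frames at 12 frames per second.
    print(clip_duration_seconds(72, 12))  # 6.0
    # A 30-second clip at the same frame rate needs 360 frames.
    print(frames_for_duration(30, 12))  # 360
```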
There are already animation tools based on Stable Diffusion that can, for example, generate a short sequence of moving images by prompt interpolation, i.e., by continuously changing certain properties of the input. But judging from Stability AI’s demonstration, Stable Animation promises to be a much more comprehensive and mature solution.
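The idea behind prompt interpolation can be illustrated with a small Python sketch. It assumes prompts are pinned to keyframes and blended linearly in between, so each frame gets a weighted mix of the two neighboring prompts. This is a generic illustration, not the method used by Stable Animation or any specific tool; the function name and the keyframe dictionary are hypothetical.

```python
from typing import Dict, List, Tuple


def interpolate_prompt_weights(
    keyframes: Dict[int, str], frame: int
) -> List[Tuple[str, float]]:
    """Return the prompts active at `frame` with linear blend weights.

    `keyframes` maps a frame index to the prompt that fully applies there.
    Between two keyframes the weights shift linearly from one prompt to
    the next. A real tool would typically apply such weights to the text
    conditioning; that step is omitted here.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    indices = sorted(keyframes)
    # Before the first or after the last keyframe, one prompt applies fully.
    if frame <= indices[0]:
        return [(keyframes[indices[0]], 1.0)]
    if frame >= indices[-1]:
        return [(keyframes[indices[-1]], 1.0)]
    for start, end in zip(indices, indices[1:]):
        if start <= frame < end:
            t = (frame - start) / (end - start)
            if t == 0:
                return [(keyframes[start], 1.0)]
            return [(keyframes[start], 1.0 - t), (keyframes[end], t)]
    raise RuntimeError("unreachable: frame lies within the keyframe range")


if __name__ == "__main__":
    prompts = {0: "a photo of a cat", 24: "a photo of a dog"}
    for f in (0, 6, 12, 24):
        print(f, interpolate_prompt_weights(prompts, f))
```

Halfway between the two keyframes, both prompts contribute equally, which is what produces the gradual morphing effect seen in such animations.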
While such tools won’t be able to produce motion pictures at the touch of a button anytime soon, projects like Stable Animation, along with the progress of Runway ML and models like Phenaki and Imagen Video, show where visual generative AI is headed in the near future: from still images to GIF-like animations.