3-Step Pipeline
Chain in one workflow
Chain text-to-image + text-to-speech + audio-to-video through three API calls. A complete talking avatar — portrait, voice, and synced animation — generated from a single text description.
Build talking AI avatars by chaining text-to-image (FLUX-2 Klein), text-to-speech (Kokoro, Chatterbox), and audio-to-video (LTX-2.3) into one pipeline. A full avatar starts at ~$0.04 per generation, running on deAPI's decentralized GPUs.
deAPI's avatar pipeline chains three open-source models (FLUX-2 Klein, Kokoro / Chatterbox TTS, and LTX-2.3 audio-to-video) behind one unified API. Running on decentralized GPU infrastructure, deAPI delivers full talking-head avatars from ~$0.04 per generation, up to 20× cheaper than HeyGen / Synthesia-class SaaS. Whether you're building marketing automation, e-learning platforms, customer support flows, or faceless content channels, deAPI makes it simple to ship avatar video at scale.
State-of-the-art video model
LTX-2.3 is a state-of-the-art image-to-video model by Lightricks. In audio-to-video mode it animates a single portrait with natural head movements, blinking, and facial expressions, all driven by the generated speech.
~$0.04 per avatar
Full avatar from ~$0.04 per generation. Decentralized GPUs make talking-head video affordable at any scale — over 120 avatars on the $5 starter credit.
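As a rough check on these figures, the sketch below divides the $5 starter credit by the per-avatar cost. It uses both the rounded ~$0.04 figure and the sum of the image and video prices listed on this page ($0.00141 + $0.0396). The speech step's price is not stated here, so it is left out, and the real total per avatar may be slightly higher.

```python
from decimal import Decimal

starter_credit = Decimal("5.00")     # USD, starter credit quoted on this page

# Rounded headline figure quoted on this page.
headline_cost = Decimal("0.04")

# Sum of the listed image and video prices; the text-to-speech price
# is not given in this section, so it is NOT included (assumption).
itemized_cost = Decimal("0.00141") + Decimal("0.0396")

avatars_headline = int(starter_credit // headline_cost)
avatars_itemized = int(starter_credit // itemized_cost)

print(avatars_headline)  # 125
print(avatars_itemized)  # 121
```

Both estimates land above 120, which is consistent with the claim above.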
No vendor lock-in
Swap between FLUX, LTX-2.3, Kokoro, and Chatterbox at any time, or move to newer models as better ones emerge. One key and one billing account cover every modality of the pipeline.
Chain three API calls to ship a full avatar — portrait, voice, animation
Create a photorealistic or stylized portrait from a text description. Define gender, age, ethnicity, clothing, background — everything through a prompt. FLUX-2 Klein delivers high-quality faces in seconds.
Send a single POST to /txt2img with your prompt and receive a download URL for the generated portrait. Optionally enable prompt enhancement to have the prompt optimized automatically.
Fast, high-quality photorealistic portraits
from $0.00141/img
Alternative model for stylized portraits
from $0.00248/img
Optimize prompts for better face generation
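A minimal sketch of what the portrait request could look like, using only Python's standard library. The /txt2img path comes from this page. The base URL, the bearer-token header, the model identifier, and the JSON field names (prompt, model, enhance_prompt) are assumptions for illustration and should be checked against the deAPI reference. The request is built but not sent.

```python
import json
import urllib.request

# Placeholder base URL: an assumption, not the real deAPI host.
BASE_URL = "https://api.example.invalid"


def build_txt2img_request(prompt: str, api_key: str, enhance: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /txt2img step."""
    body = {
        "prompt": prompt,           # portrait description
        "model": "flux-2-klein",    # hypothetical model identifier
        "enhance_prompt": enhance,  # hypothetical flag for prompt enhancement
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/txt2img",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_txt2img_request(
    "Studio portrait of a woman in her 30s, navy blazer, neutral grey background",
    api_key="YOUR_API_KEY",
)
```

Sending is left to the caller, for example with urllib.request.urlopen(request). The response is then expected to carry the portrait's download URL.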
Generate natural-sounding speech from any text. Choose from multiple voices or clone a custom voice with Chatterbox. The generated audio file will be used in the next step to drive the avatar's animation.
POST to /txt2audio with text content and voice parameters. Receive an audio file URL. This audio will feed directly into LTX-2.3's audio-to-video endpoint.
Fast, natural English voice generation
Clone any voice from a short audio sample
Multilingual speech for global content
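The speech step can be sketched as a small payload builder that picks between a preset voice and a cloned one. Everything about the payload shape is an assumption: the field names, the model identifiers, the voice name, and the idea that preset voices map to Kokoro while cloning maps to Chatterbox. Only the /txt2audio path and the two options (choose a voice or clone one) come from this page.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    text: str
    voice: Optional[str] = None             # hypothetical preset voice name
    clone_sample_url: Optional[str] = None  # hypothetical reference clip for cloning

    def to_json(self) -> str:
        """Serialize the body that would be POSTed to /txt2audio."""
        if not self.text.strip():
            raise ValueError("text must not be empty")
        # Exactly one of the two voice options must be set.
        if (self.voice is None) == (self.clone_sample_url is None):
            raise ValueError("set exactly one of voice or clone_sample_url")
        body = {"text": self.text}
        if self.voice is not None:
            body.update(model="kokoro", voice=self.voice)
        else:
            body.update(model="chatterbox", voice_sample_url=self.clone_sample_url)
        return json.dumps(body)


payload = SpeechRequest(text="Welcome to the onboarding tour.", voice="narrator_en").to_json()
```

Validating the voice choice before sending avoids paying for a request that the audio-to-video step cannot use.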
Combine the portrait and the generated audio in one step. LTX-2.3's audio-to-video mode takes an image and an audio file, then produces a video with lip-synced animation, natural head movements, and facial expressions.
POST to /aud2video with the portrait URL, the generated audio URL, and a motion prompt. Receive a complete talking avatar video — audio and animation combined.
Lip-synced animation driven by speech
from $0.0396/video
Generic image-to-video for background motion
Optimize motion prompts for smoother results
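The three steps above can be sketched as one function that passes each step's output URL to the next. The endpoint paths come from this page. The request and response field names (url, image_url, audio_url, prompt) are assumptions, and the transport is injected as a plain callable so the data flow can be shown offline with a stand-in instead of a real HTTP client.

```python
from typing import Callable, Dict, List, Tuple

# A transport POSTs a JSON body to an endpoint path and returns the
# decoded JSON response. Real code would wrap an HTTP client here.
Post = Callable[[str, Dict[str, object]], Dict[str, object]]


def make_avatar(post: Post, portrait_prompt: str, script: str, motion_prompt: str) -> str:
    """Chain portrait, speech, and animation; return the final video URL."""
    portrait = post("/txt2img", {"prompt": portrait_prompt})
    speech = post("/txt2audio", {"text": script})
    video = post("/aud2video", {
        "image_url": portrait["url"],  # hypothetical field names throughout
        "audio_url": speech["url"],
        "prompt": motion_prompt,
    })
    return str(video["url"])


calls: List[Tuple[str, Dict[str, object]]] = []


def fake_post(path: str, body: Dict[str, object]) -> Dict[str, object]:
    """Offline stand-in that records each call and returns a made-up URL."""
    calls.append((path, body))
    return {"url": f"https://files.example.invalid{path}/result"}


video_url = make_avatar(
    fake_post,
    portrait_prompt="Friendly presenter, soft studio lighting",
    script="Hello and welcome.",
    motion_prompt="Subtle nods, natural blinking",
)
```

Injecting the transport keeps the chaining logic independent of any particular HTTP library and makes it easy to add retries or polling later without touching the pipeline itself.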
Generate personalized video messages at scale. Create product demos, explainer videos, and social media content with AI presenters — without hiring actors or booking studios.
Build course videos with AI instructors. Translate training materials into any language with localized avatars. Update content instantly without re-recording.
Product walkthroughs and onboarding videos at scale
Multilingual FAQ avatars and support flows
Faceless YouTube channels, news avatars, and creator content without on-camera talent
Watch how deAPI chains text-to-image, text-to-speech, and audio-to-video into a single talking-avatar pipeline. From API call to lip-synced video in under a minute.
Chain three API calls and ship a talking-head video — no actors, no studio
Everything you need to know