- $5 free credits when you sign up
Simple, Transparent Pricing
Pay only for what you use. No subscriptions, no hidden fees.
Price Calculator
Estimated cost: $0.00361 per transcript (5 min)
| Use case | Duration | Price |
|---|---|---|
| Short clip (social media clips, ads) | 30 s | $0.0022 |
| Meeting recording (team calls, interviews) | 30 min | $0.0115 |
| Webinar (educational content, presentations) | 60 min | $0.0210 |
| Movie (full-length films, documentaries) | 120 min | $0.0401 |
- Free tier available
- No credit card required
See Whisper Large V3 in action
Real samples, API docs & free $5 credits to start
How it works
Three Steps to Your First API Call
1. Sign Up & Get $5 Free
Create your account in 30 seconds. No credit card required. We'll add $5 in free credits to your balance.
2. Pick a Model & Call the API
Choose from the available open-source models. One unified endpoint, same auth, same format. Test in the Playground or call the REST API.
3. Pay Only for What You Use
No monthly minimums, no pricing tiers, no lock-in. You are charged per request at the rates above. Top up anytime.
Frequently Asked Questions
Everything you need to know about deAPI pricing
Every request is billed dynamically, with the metric chosen to match each task: resolution × steps for images, characters for speech, tokens for embeddings, duration with optional resolution for video, hours for transcription, output characters for OCR, and per-image rates for background removal and upscaling. There are no subscriptions, no monthly minimums, and no hidden fees — you fund a prepaid balance and each successful inference deducts its exact cost. Before any job, you can call the matching /price endpoint to preview the precise cost for the model and parameters you plan to use.
Every new account receives $5 in free credits the moment you sign up, with no credit card and no upfront commitment. That balance is enough to test most models extensively, generate hundreds of images, transcribe several hours of video, or prototype a complete AI pipeline. Your account starts on the Basic tier with conservative rate limits designed for testing; making any payment upgrades you to Premium with no waiting period.
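As a rough illustration of how far the $5 credit goes for transcription, the sketch below divides the balance by the example per-transcript prices quoted earlier on this page. The function name and the idea of a fixed price per transcript are assumptions made for the example; real costs depend on the model and parameters, so treat the /price endpoint as the authoritative source.

```python
from decimal import Decimal

# Example per-transcript prices quoted on this page (USD).
EXAMPLE_PRICES = {
    "Short clip (30 s)": Decimal("0.0022"),
    "Meeting recording (30 min)": Decimal("0.0115"),
    "Webinar (60 min)": Decimal("0.0210"),
    "Movie (120 min)": Decimal("0.0401"),
}

FREE_CREDITS = Decimal("5")  # sign-up credit stated on this page


def transcripts_covered(balance: Decimal, price: Decimal) -> int:
    """Whole number of transcripts a balance pays for at a fixed price.

    Hypothetical helper: it assumes every transcript costs exactly `price`.
    """
    if price <= 0:
        raise ValueError("price must be positive")
    return int(balance // price)


coverage = {
    name: transcripts_covered(FREE_CREDITS, price)
    for name, price in EXAMPLE_PRICES.items()
}

for name, count in coverage.items():
    print(f"{name}: about {count} transcripts")
```

Decimal arithmetic is used so that prices with four decimal places are not distorted by binary floating-point rounding.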
Inference is routed through a globally distributed GPU network rather than concentrated in a few hyperscale data centers, which removes most of the infrastructure markup baked into traditional cloud pricing. We also serve highly optimized open-source models — many of them quantized (INT8, FP8, NF4) and distilled — so each request uses fewer GPU seconds without sacrificing output quality. The combined effect can deliver up to 20× lower inference cost for comparable workloads.
deAPI uses Stripe for secure card payments and supported local methods. You can either make one-off top-ups (with preset amounts of $10, $25, or $50) or enable automatic top-ups, which recharge your balance whenever it drops below $2 — perfect for production workloads that can't afford to fail mid-job. For B2B customers needing custom invoices, larger commitments, or tailored billing terms, our team arranges individual agreements; just reach out to support.
The Basic tier offers conservative rate limits (typically 1–10 RPM depending on the endpoint) ideal for testing. The moment you make any payment via Stripe, your account upgrades to Premium with 300 RPM across all endpoints and unlimited daily requests — instantly, with no application process. For high-volume production needs beyond Premium, dedicated capacity, or enterprise terms, our team can set up bespoke arrangements with volume discounts.
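One simple way to stay under a requests-per-minute limit on the client side is to space requests evenly. The sketch below computes that spacing; the helper names are hypothetical, and it assumes the limit is enforced over any 60-second window, which this page does not specify.

```python
def min_interval_seconds(rpm: int) -> float:
    """Smallest gap between requests that keeps an evenly paced client at or under `rpm`."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


def send_offsets(n_requests: int, rpm: int) -> list[float]:
    """Offsets (seconds from start) at which to send each of `n_requests` requests."""
    interval = min_interval_seconds(rpm)
    return [i * interval for i in range(n_requests)]


# Premium tier (300 RPM, per this page): one request every 0.2 seconds.
premium_gap = min_interval_seconds(300)

# Upper end of the Basic tier range (10 RPM): one request every 6 seconds.
basic_gap = min_interval_seconds(10)

print(f"Premium gap: {premium_gap:.1f} s, Basic gap: {basic_gap:.1f} s")
```

In a real client you would sleep until each offset before sending, and still handle rate-limit error responses, since server-side accounting may differ from this model.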
Video and audio transcription is billed per hour of media processed, starting at $0.021 per hour. As examples, a 5-minute clip costs roughly $0.0036, a 30-minute meeting recording around $0.012, and a 2-hour film about $0.04. These example prices do not scale exactly in proportion to duration, so use the /price endpoint for an exact figure. There's no surcharge for high-quality audio, complex accents, or multi-speaker content; the rate does not vary with input difficulty.
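To compare the example prices on this page with the quoted hourly rate, the sketch below converts each one to an effective per-hour figure. It is plain arithmetic on the published examples, not a description of the billing formula; the helper name is hypothetical.

```python
from decimal import Decimal

# (duration in seconds, example price in USD) as quoted on this page.
EXAMPLES = {
    "Short clip": (30, Decimal("0.0022")),
    "Meeting recording": (30 * 60, Decimal("0.0115")),
    "Webinar": (60 * 60, Decimal("0.0210")),
    "Movie": (120 * 60, Decimal("0.0401")),
}


def effective_hourly_rate(duration_seconds: int, price: Decimal) -> Decimal:
    """Price divided by duration in hours (USD per hour of media)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return price * 3600 / duration_seconds


rates = {
    name: effective_hourly_rate(seconds, price)
    for name, (seconds, price) in EXAMPLES.items()
}

for name, rate in rates.items():
    print(f"{name}: ${rate} per hour of media")
```

The 60-minute example works out to the quoted $0.021 per hour, while shorter inputs come out higher per hour, which is why previewing the cost with the /price endpoint is the safer way to budget.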
URL-based transcription is one of deAPI's most cost-saving features. Paste a URL from YouTube, X (Twitter), Twitch, Kick, TikTok, or X/Twitter Spaces and we handle the download, processing, and transcription server-side. You skip storage costs, bandwidth fees, and the engineering time required to build URL-to-file pipelines yourself. Direct file uploads are also supported (audio: AAC, MP3, OGG, WAV, WebM, FLAC, up to 20 MB; video: MP4, MPEG, MOV, AVI, WMV, OGG, up to 50 MB).
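A client can check a file against the upload limits above before sending it. The following is a minimal sketch: the function name, the mapping from format names to file extensions, and the reading of "MB" as 1,000,000 bytes are all assumptions, since the page lists only format names and sizes.

```python
from pathlib import Path

# Assumed extension mapping for the formats listed on this page.
AUDIO_EXTS = {".aac", ".mp3", ".ogg", ".wav", ".webm", ".flac"}
VIDEO_EXTS = {".mp4", ".mpeg", ".mov", ".avi", ".wmv", ".ogg"}

MB = 1_000_000  # assumption: decimal megabytes; the page does not say
LIMITS = {
    "audio": (AUDIO_EXTS, 20 * MB),
    "video": (VIDEO_EXTS, 50 * MB),
}


def check_upload(filename: str, size_bytes: int, kind: str) -> tuple[bool, str]:
    """Return (ok, reason) for a direct upload of the given kind ('audio' or 'video')."""
    if kind not in LIMITS:
        raise ValueError("kind must be 'audio' or 'video'")
    allowed, limit = LIMITS[kind]
    ext = Path(filename).suffix.lower()
    if ext not in allowed:
        return False, f"unsupported {kind} extension: {ext or '(none)'}"
    if size_bytes > limit:
        return False, f"{kind} file exceeds {limit // MB} MB"
    return True, "ok"


print(check_upload("standup.mp3", 5 * MB, "audio"))
print(check_upload("keynote.mov", 80 * MB, "video"))
```

Files that fail the size check are candidates for the URL-based route, which is limited by duration rather than file size.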
The underlying Whisper Large V3 model offers broad multilingual support with automatic language detection, covering high-resource languages like English, Spanish, French, German, Polish, and Mandarin alongside many lower-resource ones. Word-level accuracy is strong for clean studio audio and remains reliable on noisy real-world content like webinars, podcasts, and meeting recordings. Mixed-language audio is handled gracefully, with the model adapting per segment.
Every transcription request accepts an include_ts flag that adds timestamps to the transcript, ready for direct conversion into subtitle files (SRT, VTT), chapter markers, or clip-anchor links in your app. This makes it straightforward to integrate transcripts with video players, search interfaces, or any downstream tool that needs to navigate by time.
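Converting timestamped output to SRT is mostly a formatting exercise. The sketch below assumes the transcript has already been parsed into (start_seconds, end_seconds, text) tuples; that shape is a hypothetical stand-in, because this page does not describe the response format returned when include_ts is set.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from an iterable of (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Blank line between cues, as the SRT format expects.
    return "\n".join(blocks)


# Hypothetical segments, standing in for parsed API output.
example_segments = [
    (0.0, 2.5, "Welcome to the webinar."),
    (2.5, 4.0, "Let's get started."),
]
print(to_srt(example_segments))
```

WebVTT output would differ mainly in the "WEBVTT" header line and in using a period instead of a comma before the milliseconds.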
Direct file uploads support audio up to 20 MB and video up to 50 MB, covering most podcasts and meeting recordings without compression. URL-based inputs (YouTube, Twitch, X/Twitter Spaces, etc.) support media up to 600 minutes (10 hours) of duration, ideal for long-form content like full conference recordings, audiobooks, and entire show seasons. For larger archives, batch your requests through Premium's 300 RPM with webhooks for asynchronous processing.