$5 free credits when you sign up

Simple, Transparent Pricing

Q: Can I really start for free, without a credit card?

Yes. Every new account receives $5 in free credits the moment you sign up — no credit card, no upfront commitment. That balance is enough to test most models extensively, generate hundreds of images, transcribe several hours of video, or prototype a complete AI pipeline. Your account starts on the Basic tier with conservative rate limits designed for testing; making any payment upgrades you to Premium with no waiting period.

Q: How do payments, top-ups, and B2B invoicing work?

deAPI uses Stripe for secure card payments and supported local methods. You can either make one-off top-ups (with preset amounts of $10, $25, or $50) or enable automatic top-ups, which recharge your balance whenever it drops below $2 — perfect for production workloads that can't afford to fail mid-job. For B2B customers needing custom invoices, larger commitments, or tailored billing terms, our team arranges individual agreements; just reach out to support.
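The automatic top-up rule described above can be sketched as a small simulation. This is a minimal illustration, not deAPI's billing code: the function name `simulate_balance` and the assumption that the recharge fires immediately after a charge leaves the balance below $2 are hypothetical; only the $2 threshold and the $10/$25/$50 presets come from the text.

```python
from decimal import Decimal

# Values taken from the text: recharge when the balance drops below $2,
# using one of the preset amounts ($10, $25, $50).
TOP_UP_THRESHOLD = Decimal("2.00")
PRESET_AMOUNTS = (Decimal("10"), Decimal("25"), Decimal("50"))


def simulate_balance(start, charges, top_up_amount):
    """Apply each charge in order and return (final_balance, top_up_count).

    Hypothetical model: a recharge is added right after a charge leaves the
    balance below the threshold. The real trigger timing may differ.
    """
    if top_up_amount not in PRESET_AMOUNTS:
        raise ValueError("top_up_amount must be one of the preset amounts")
    balance = Decimal(start)
    top_ups = 0
    for charge in charges:
        balance -= Decimal(charge)
        while balance < TOP_UP_THRESHOLD:
            balance += top_up_amount
            top_ups += 1
    return balance, top_ups


# Three $1.50 charges against the $5 starting credit trigger one $10 top-up.
print(simulate_balance("5", ["1.50", "1.50", "1.50"], Decimal("10")))
```

Note that a balance of exactly $2.00 does not trigger a recharge in this sketch, since the text says the top-up happens when the balance drops below $2.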

Q: What happens when I outgrow the standard limits?

The Basic tier offers conservative rate limits (typically 1–10 RPM depending on the endpoint) ideal for testing. The moment you make any payment via Stripe, your account upgrades to Premium with 300 RPM across all endpoints and unlimited daily requests — instantly, with no application process. For high-volume production needs beyond Premium, dedicated capacity, or enterprise terms, our team can set up bespoke arrangements with volume discounts.
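To stay under a requests-per-minute limit from the client side, one option is to space calls evenly. The sketch below assumes nothing about the deAPI client itself: `Pacer` and `min_interval_seconds` are hypothetical helpers, and only the 300 RPM figure comes from the text. At 300 RPM the minimum gap between requests works out to 60 / 300 = 0.2 seconds.

```python
import time

PREMIUM_RPM = 300  # from the text: Premium allows 300 requests per minute


def min_interval_seconds(rpm):
    """Smallest gap between requests that stays at or under `rpm`."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


class Pacer:
    """Client-side throttle that spaces calls evenly (hypothetical helper)."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(rpm)
        self._clock = clock
        self._sleep = sleep
        self._next_slot = None

    def wait(self):
        """Block until the next request slot is free, then reserve it."""
        now = self._clock()
        if self._next_slot is not None and now < self._next_slot:
            self._sleep(self._next_slot - now)
            now = self._next_slot
        self._next_slot = now + self.interval
```

Calling `pacer.wait()` before each request keeps a single process at or below the configured rate. For the Basic tier, the same class can be built with the lower limit that applies to the endpoint in use.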

Q: How is transcription priced?

Video and audio transcription is billed per hour of media processed, starting at $0.021 per hour — meaning a 5-minute clip costs roughly $0.0036, a 30-minute meeting recording around $0.012, and a 2-hour film about $0.04. There's no surcharge for high-quality audio, complex accents, or multi-speaker content; the per-hour rate is flat regardless of input difficulty.

Q: Do I get timestamps in the output?

Yes — every transcription request accepts an include_ts flag that adds timestamps to the transcript, ready for direct conversion into subtitle files (SRT, VTT), chapter markers, or clip-anchor links in your app. This makes it straightforward to integrate transcripts with video players, search interfaces, or any downstream tool that needs to navigate by time.
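As an illustration of the subtitle conversion mentioned above, here is a minimal sketch that turns timestamped segments into SRT text. The shape of the timestamped output is not specified on this page, so the `(start_seconds, end_seconds, text)` tuple layout is an assumption, and both function names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm form used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is an assumed shape, not the documented API output.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


print(segments_to_srt([(0, 2.5, "Hello"), (2.5, 4, "world")]))
```

VTT output would follow the same pattern, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.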

Q: What's the maximum file size or duration I can transcribe?

Direct file uploads support audio up to 20 MB and video up to 50 MB, covering most podcasts and meeting recordings without compression. URL-based inputs (YouTube, Twitch, X/Twitter Spaces, etc.) support media up to 600 minutes (10 hours) of duration, ideal for long-form content like full conference recordings, audiobooks, and entire show seasons. For larger archives, batch your requests through Premium's 300 RPM with webhooks for asynchronous processing.

Pay only for what you use. No subscriptions, no hidden fees.


Price Calculator

Video-to-Text: Whisper Large V3

Duration: 300 s (5 min)

Estimated cost: $0.00361 per transcript

Charged per audio minute. Accepts YouTube, X, TikTok, Twitch, and Kick URLs.

| Use case | Duration | Price |
|---|---|---|
| Short clip (social media clips, ads) | 30 s | $0.0022 |
| Meeting recording (team calls, interviews) | 30 min | $0.0115 |
| Webinar (educational content, presentations) | 60 min | $0.0210 |
| Movie (full-length films, documentaries) | 120 min | $0.0401 |
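To put the $5 sign-up credit in perspective, the sketch below divides it by each example price listed above. The listed examples are not strictly proportional to duration, so the code uses the prices as given rather than extrapolating a per-minute rate. The helper name `transcripts_per_budget` is hypothetical.

```python
from decimal import Decimal

FREE_CREDITS = Decimal("5.00")

# Example prices copied from the Whisper Large V3 examples listed above.
EXAMPLE_PRICES = {
    "Short clip (30 s)": Decimal("0.0022"),
    "Meeting recording (30 min)": Decimal("0.0115"),
    "Webinar (60 min)": Decimal("0.0210"),
    "Movie (120 min)": Decimal("0.0401"),
}


def transcripts_per_budget(budget, price):
    """Whole number of transcripts a budget covers at a fixed price each."""
    return int(budget // price)


for label, price in EXAMPLE_PRICES.items():
    print(f"{label}: {transcripts_per_budget(FREE_CREDITS, price)} transcripts")
```

At the listed prices, $5 covers 2272 short clips, 434 meeting recordings, 238 one-hour webinars, or 124 two-hour films. For an exact quote on a specific job, the /price endpoint mentioned in the FAQ is the authoritative source.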

Free tier available
No credit card required

See Whisper Large V3 in action

Real samples, API docs & free $5 credits to start


How it works

Three Steps to Your First API Call

Sign Up & Get $5 Free

Create your account in 30 seconds. No credit card required. We'll add $5 in free credits to your balance.

Pick a Model & Call the API

Choose from available open-source models. One unified endpoint, same auth, same format. Test in Playground or hit the REST API.

Pay Only for What You Use

No monthly minimums, no pricing tiers, no lock-in. You are charged per request at the rates above. Top up anytime.

Frequently Asked Questions

Everything you need to know about deAPI pricing

Q: How does deAPI's pay-as-you-go pricing actually work?

Every request is billed dynamically, with the metric chosen to match each task: resolution × steps for images, characters for speech, tokens for embeddings, duration with optional resolution for video, hours for transcription, output characters for OCR, and per-image rates for background removal and upscaling. There are no subscriptions, no monthly minimums, and no hidden fees: you fund a prepaid balance and each successful inference deducts its exact cost. Before any job, you can call the matching /price endpoint to preview the precise cost for the model and parameters you plan to use.

Q: Why is deAPI typically more affordable than running models on traditional cloud GPUs?

Inference is routed through a globally distributed GPU network rather than concentrated in a few hyperscale data centers, which removes most of the infrastructure markup baked into traditional cloud pricing. We also serve highly optimized open-source models, many of them quantized (INT8, FP8, NF4) and distilled, so each request uses fewer GPU seconds without sacrificing output quality. The combined effect can deliver up to 20× lower inference cost for comparable workloads.

Q: Can I really transcribe directly from a YouTube or social media URL?

Yes, and it's one of deAPI's most cost-saving features. Just paste a URL from YouTube, X (Twitter), Twitch, Kick, TikTok, or X/Twitter Spaces and we handle the download, processing, and transcription server-side. You skip storage costs, bandwidth fees, and the engineering time required to build URL-to-file pipelines yourself. Direct file uploads are also supported (audio: AAC, MP3, OGG, WAV, WebM, FLAC, up to 20 MB; video: MP4, MPEG, MOV, AVI, WMV, OGG, up to 50 MB).
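Before sending a direct upload, a client can check the file against the format and size limits listed above. The following is a rough pre-flight sketch under stated assumptions: `check_upload` is a hypothetical helper, "MB" is treated as decimal megabytes (the page does not define it), and each format name is assumed to match the file extension.

```python
from pathlib import Path

MB = 1_000_000  # assumption: decimal megabytes; the page does not define "MB"

# Formats and size caps as listed in the text; extension matching is assumed.
AUDIO_EXTS = {"aac", "mp3", "ogg", "wav", "webm", "flac"}
VIDEO_EXTS = {"mp4", "mpeg", "mov", "avi", "wmv", "ogg"}
LIMITS = {"audio": 20 * MB, "video": 50 * MB}


def check_upload(filename, size_bytes, kind):
    """Return (ok, reason) for a direct upload of the given kind."""
    if kind not in LIMITS:
        raise ValueError("kind must be 'audio' or 'video'")
    ext = Path(filename).suffix.lower().lstrip(".")
    allowed = AUDIO_EXTS if kind == "audio" else VIDEO_EXTS
    if ext not in allowed:
        return False, f"unsupported {kind} format: {ext or 'none'}"
    if size_bytes > LIMITS[kind]:
        return False, f"{kind} file exceeds {LIMITS[kind] // MB} MB"
    return True, "ok"


print(check_upload("call.mp3", 5 * MB, "audio"))
print(check_upload("talk.mp4", 60 * MB, "video"))
```

Because OGG appears in both lists, the caller states whether the file is audio or video instead of having the helper guess. Files that fail the size check are candidates for the URL-based input path instead.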

Q: How accurate is the transcription, and which languages are supported?

The underlying Whisper Large V3 model offers broad multilingual support with automatic language detection, covering high-resource languages like English, Spanish, French, German, Polish, and Mandarin alongside many lower-resource ones. Word-level accuracy is strong for clean studio audio and remains reliable on noisy real-world content like webinars, podcasts, and meeting recordings. Mixed-language audio is handled gracefully, with the model adapting per segment.
