WhisperDirect

High-accuracy speech-to-text and summarization, low cost with a free trial

Product Features

App Store Description

WhisperDirect is a high-accuracy speech-to-text and summarization app that works with your own API key.
No subscription is required. You pay only OpenAI’s usage fees for what you actually use, which can make it more cost-effective than subscription-based apps.
Pricing & Trial
• Free trial: 5 sessions included
• After the trial: one-time in-app purchase unlocks unlimited use of current features
• API usage billed directly by OpenAI (the app does not charge for API usage)
Cost Guide
• With $5, you can transcribe about 14 hours of audio
• Whisper API ≈ $0.006 per minute (≈ $0.36 per hour)
• OpenAI API pricing → https://openai.com/ja-JP/api/pricing/
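As a rough check on the figures above, the arithmetic can be sketched in Python. The $0.006 per minute rate is the one quoted in this listing and may change. The function names are illustrative only and are not part of the app.

```python
# Rate quoted in this listing (an assumption for this sketch);
# check OpenAI's pricing page for the current value.
WHISPER_USD_PER_MINUTE = 0.006


def transcription_cost_usd(minutes: float, rate: float = WHISPER_USD_PER_MINUTE) -> float:
    """Estimated cost in USD to transcribe `minutes` of audio."""
    return minutes * rate


def hours_for_budget(budget_usd: float, rate: float = WHISPER_USD_PER_MINUTE) -> float:
    """Hours of audio that `budget_usd` covers at the given per-minute rate."""
    return budget_usd / rate / 60


if __name__ == "__main__":
    print(f"1 hour costs about ${transcription_cost_usd(60):.2f}")
    print(f"$5 covers about {hours_for_budget(5):.1f} hours")
```

At the quoted rate, one hour comes to about $0.36 and $5 covers a little under 14 hours, which matches the guide above. Summary and minutes generation is billed separately by the chosen model and is not included in this estimate.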
Models for summaries and meeting minutes
Choose from compact, low-cost models:
• GPT-4.1-nano
• GPT-4.1-mini
• GPT-5-nano
• GPT-5-mini
Even long texts (1,000–2,000 words) can usually be processed for just a few cents per run.
Main Features
• Record with the microphone button and instantly convert to text
• Import audio files, including directly from the share sheet
• Import video files (audio extracted and compressed automatically)
• Playback-synced highlighting of transcript segments
• Insert timeline markers (configurable in 5-second steps)
• Generate summaries and meeting minutes (prompts editable in Settings)
• OCR transcription from images (supports multiple images, all processed locally with no extra API cost)
• Export audio, text, summaries, minutes, or subtitles (VTT / SRT)
• Automatically post transcripts/summaries/minutes to Slack
• Estimate costs in Settings (based on audio length and character count)
• Other customization options (LLM model, timeline interval, prompts, etc.)
Supported formats
Audio: mp3, m4a, aac, wav, flac, ogg, opus, wma, amr, mpga, webm, aiff, caf
Video: mp4, mov, m4v, webm, mkv, avi, mpeg, mpg
Notes
• An API key (such as an OpenAI API key) is required
• Pricing and available models may change as OpenAI updates its offerings