This task can be performed using GetTxt.AI
Text Extraction API for Any File
Best product for this task

GetTxt.AI
dev-tools
Extract text and Markdown from any document, image, video, or audio file with a single API call. A foundation for any AI or LLM application.

What to expect from an ideal product
- Upload your audio or video file to GetTxt.AI and let it turn speech into text in minutes
- Handles multiple formats like MP3, MP4, WAV, and other popular media files
- Pick your output format: plain text, Markdown, or subtitles for video content
- Converts both spoken words and on-screen text from videos into readable content
- Works through a simple API call, saving you hours of manual transcription work
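
As a rough illustration of the "one API call" workflow described above, the sketch below prepares a single upload request for a media file and a chosen output format. Everything specific in it is an assumption: the endpoint URL is a placeholder, and the `output_format` field name, the format identifiers, and the `build_extraction_request` helper are hypothetical, since this listing does not document the actual GetTxt.AI API. Check the official documentation before relying on any of these names.

```python
from pathlib import Path

# Hypothetical values: the real GetTxt.AI endpoint, field names, and format
# identifiers are not given in this listing and may differ.
API_URL = "https://api.example.com/v1/extract"  # placeholder, not the real endpoint

# Only the formats named in the list above; the listing says other popular
# media formats work too, so extend this set as needed.
MEDIA_EXTENSIONS = {".mp3", ".mp4", ".wav"}

# Assumed identifiers for the three output choices named in the list above.
OUTPUT_FORMATS = {"text", "markdown", "subtitles"}


def build_extraction_request(file_path: str, output_format: str = "text") -> dict:
    """Validate the inputs and describe the single upload request to send."""
    suffix = Path(file_path).suffix.lower()
    if suffix not in MEDIA_EXTENSIONS:
        raise ValueError(f"unsupported media type: {suffix or 'none'}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    return {
        "method": "POST",
        "url": API_URL,
        "file_name": Path(file_path).name,
        "data": {"output_format": output_format},  # hypothetical field name
    }


if __name__ == "__main__":
    # Describe a request for subtitles from a video file (nothing is sent).
    print(build_extraction_request("talk.mp4", "subtitles"))
```

The helper only builds a description of the request and never touches the network, so it can be run as is. To actually send it, the dictionary's values could be handed to an HTTP client, for example by passing the opened file and the `data` mapping to a multipart upload call, once the real endpoint and authentication scheme are known.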