Speak AI CLI: Transcribe, Analyze & Search from Your Terminal
The Speak AI CLI gives you 26 commands for transcription, NLP analysis, media management, and AI chat directly from your terminal. Every command supports --json for scripting and piping.
Install in one command
The CLI ships with the same npm package as the MCP server. Install globally, run the init wizard, and start using all 26 commands immediately.
# Install the package globally
npm install -g @speakai/mcp-server
# Initialize and set your API key
speakai-mcp init
26 commands across 3 categories
Upload, transcribe, analyze, search, organize, and export media from your terminal. Every command supports --json output for piping to other tools.
11 commands
Media Management
upload, list-media, get-transcript, get-insights, status, export, update, delete, favorites, captions, reanalyze. Upload local files or URLs, pull transcripts and NLP insights, export in any format, and manage your entire media library.
3 commands
AI and Search
ask lets you query any media, folder, or your entire workspace with AI. chat-history lists past AI conversations. search does full-text search across all transcripts and insights. Pipe results to jq, grep, or your own scripts.
12 commands
Organization and Automation
Commands include list-folders, create-folder, clips, clip, stats, languages, schedule-meeting, create-text, config, and init. Organize files into folders, create highlight clips, schedule meeting bots, and manage configuration.
Real commands, real workflows
Six commands that show what the CLI can do. Upload recordings, pull transcripts, query with AI, search across your library, export to PDF, and pipe JSON output to other tools.
# Upload a recording and wait for transcription
speakai-mcp upload ./interview.mp3 -n "Q1 Interview" --wait
# Get plain-text transcript and save to file
speakai-mcp transcript abc123 --plain > meeting.txt
# Ask AI about a specific recording
speakai-mcp ask "What were the action items?" -m abc123
# Search all transcripts dated from January 1, 2026 onward
speakai-mcp search "pricing concerns" --from 2026-01-01
# Export as PDF with speaker names
speakai-mcp export abc123 -f pdf --speakers
# List videos as JSON and pipe to jq
speakai-mcp ls --type video --json | jq '.mediaList[].name'
Built for automation and scale
The CLI turns Speak AI into a scriptable media intelligence engine. Here is how teams use it.
Batch transcription
Upload an entire folder of recordings and process them overnight. Use a shell loop with upload --wait to transcribe hundreds of files one after another, or run several uploads as background jobs to process them in parallel. Pull transcripts and insights when processing completes.
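A minimal sketch of the sequential loop described above, assuming the recordings sit in one directory and end in .mp3, and assuming --wait blocks until processing finishes. The function name upload_dir and the SPEAKAI override variable (handy for dry runs) are illustrative and not part of the CLI. The upload, -n, and --wait arguments are the ones shown elsewhere on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: upload every .mp3 in a directory, one at a time.
# SPEAKAI is an illustrative override for dry runs, not a CLI feature.
SPEAKAI="${SPEAKAI:-speakai-mcp}"

upload_dir() {
  local dir="$1" file name failed=0
  for file in "$dir"/*.mp3; do
    # Skip the unexpanded pattern when the directory has no .mp3 files.
    [ -e "$file" ] || continue
    # Use the file name without its extension as the media name.
    name="$(basename "$file" .mp3)"
    # Assumption: --wait returns only after processing completes,
    # so uploads run strictly one after another.
    if ! "$SPEAKAI" upload "$file" -n "$name" --wait; then
      echo "upload failed: $file" >&2
      failed=$((failed + 1))
    fi
  done
  # Exit status is non-zero if any upload failed.
  [ "$failed" -eq 0 ]
}
```

Source the file and call upload_dir ./recordings to start a run. Setting SPEAKAI=echo first prints each command instead of executing it, which is a cheap way to check the file list before an overnight batch.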
CI/CD integration
Add transcription and analysis steps to your build pipeline. Transcribe product demo recordings on every release. Run NLP analysis on customer call recordings as part of your data pipeline. All output is JSON-native.
Research workflows
Search across hundreds of interviews with search. Ask questions across your entire library with ask. Export findings as PDF or CSV. Build reproducible research pipelines that run from a single script.
Automated reporting
Set up cron jobs to pull weekly meeting summaries. Use stats to track workspace activity. Pipe JSON output to Python scripts that generate custom reports and dashboards for your team.
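One way the scheduled reporting idea could look, as a sketch rather than a reference implementation: a function that saves the output of stats --json to a dated file that a Python script or dashboard can read later. The function name snapshot_stats, the SPEAKAI override variable, the output file naming, and the paths in the crontab comment are all assumptions made for illustration. The shape of the JSON that stats returns is not documented on this page, so the sketch stores it without parsing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save a dated JSON snapshot of workspace stats.
# SPEAKAI is an illustrative override, not a CLI feature.
SPEAKAI="${SPEAKAI:-speakai-mcp}"

snapshot_stats() {
  local out_dir="$1" out_file
  mkdir -p "$out_dir"
  out_file="$out_dir/stats-$(date +%Y-%m-%d).json"
  # Write to a temporary file first so a failed run never leaves
  # a partial report behind.
  if "$SPEAKAI" stats --json > "$out_file.tmp"; then
    mv "$out_file.tmp" "$out_file"
    echo "$out_file"
  else
    rm -f "$out_file.tmp"
    return 1
  fi
}

# A wrapper script would end with a call such as:
#   snapshot_stats "$HOME/speak-reports"
# Example crontab entry for every Monday at 08:00 (placeholder path):
#   0 8 * * 1 /path/to/weekly-stats.sh
```

The function prints the path of the file it wrote, so a follow-up step can pass that path straight to a report generator.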
What the Speak AI CLI does and who it is for
The Speak AI CLI is a command-line interface that gives developers, researchers, and power users direct terminal access to the full Speak AI platform. Instead of uploading files through a web browser, navigating dashboards, and clicking through menus, you run a single command. Upload a recording, get a transcript, search across your library, ask AI questions about your data, and export results in any format. All from your terminal, all scriptable, all with JSON output for piping to other tools.
The CLI ships as part of the @speakai/mcp-server npm package. Install it globally with npm install -g @speakai/mcp-server, run speakai-mcp init to set your API key, and you have 26 commands ready to use. The same package also includes the MCP server with 81 tools for AI assistants like Claude, ChatGPT, Cursor, and Windsurf. Both the CLI and MCP server share the same API key and access the same workspace data.
How the CLI differs from the web interface
The Speak AI web interface at app.speakai.co is designed for interactive use: browse your library, play recordings, read transcripts, and explore insights visually. The CLI is designed for automation and efficiency. It excels at batch operations, scripting, and integration with other tools. Upload 200 files in a loop. Search across your entire library and pipe results to grep. Export every recording in a folder as PDF. These workflows are either impractical or impossible through a web interface but straightforward from the command line.
Every command supports --json output, making it easy to integrate with jq, Python, Node.js, or any other tool in your stack. The CLI also supports --plain output for human-readable results when you are working interactively. You can mix both approaches: use the web interface for visual exploration and the CLI for automation and batch work.
How the CLI relates to the MCP server
The CLI and the MCP server are complementary tools packaged together. The MCP server provides 81 tools that AI assistants call during conversation. You describe what you want in natural language, and your AI assistant orchestrates the right tool calls. The CLI provides 26 commands that you invoke directly. You type the exact command, pass the exact arguments, and get deterministic results. Use the MCP server when you want your AI to figure out the workflow. Use the CLI when you already know exactly what you want to do and need it to happen the same way every time.
Language support and transcription
The CLI supports transcription in over 70 languages with automatic language detection. Speaker diarization identifies who said what. Word-level timestamps enable precise alignment. When you upload a file with speakai-mcp upload, the platform automatically transcribes it and runs NLP analysis: sentiment, keywords, themes, and named entities. You can retrieve these results with get-transcript and get-insights, or query them with ask using AI Chat powered by Claude, Gemini, or GPT.
Scripting and automation examples
Teams use the CLI to build automated workflows that would be impractical through a GUI. A research team might write a bash script that uploads all interview recordings from a shared drive, waits for processing, then searches across transcripts for specific themes and exports the results as a CSV. A DevOps team might add a step to their CI/CD pipeline that transcribes product demo recordings and pushes summaries to Slack. A consulting firm might run a cron job every Monday that pulls the previous week's meeting recordings, generates summaries via ask, and emails a consolidated brief to the team. The --json flag on every command means the CLI integrates cleanly into any scripting language or automation tool.
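The transcript-pulling step of such a script might look like the sketch below. It assumes you already have the media IDs (for example, copied from the media list) and uses the transcript <id> --plain form shown earlier on this page. The function name save_transcripts, the SPEAKAI override variable, and the one-file-per-ID layout are illustrative choices, not CLI features.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save a plain-text transcript for each media ID.
# SPEAKAI is an illustrative override, not a CLI feature.
SPEAKAI="${SPEAKAI:-speakai-mcp}"

save_transcripts() {
  local out_dir="$1" id
  shift
  mkdir -p "$out_dir"
  for id in "$@"; do
    # One file per recording, named after its media ID.
    "$SPEAKAI" transcript "$id" --plain > "$out_dir/$id.txt" \
      || echo "transcript failed: $id" >&2
  done
}
```

A call such as save_transcripts ./transcripts abc123 def456 writes ./transcripts/abc123.txt and ./transcripts/def456.txt, which can then be searched with grep or fed to other tools.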
Getting started
Install the CLI with npm install -g @speakai/mcp-server. Run speakai-mcp init to enter your API key. Then try speakai-mcp ls to see your media library, speakai-mcp upload ./file.mp3 --wait to transcribe a recording, and speakai-mcp ask "Summarize this recording" -m <media-id> to query that recording with AI. The CLI is free and open source under the MIT license, and you need a Speak AI account to authenticate. Full documentation and source code are on GitHub, and full API documentation is at docs.speakai.co. See the developers page for the complete platform integration story, including the REST API, webhooks, embeddable widgets, and white-label options.
Frequently asked questions
How do I install the Speak AI CLI?
Install globally from npm with npm install -g @speakai/mcp-server. Then run speakai-mcp init to set your API key. The CLI is included in the same package as the MCP server. You need Node.js 18 or later. The package is free and open source under the MIT license. View it on npm or GitHub.
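If you script the install, a small preflight check for the Node.js 18 requirement can save a confusing failure later. The helper below is a sketch. The name node_version_ok is hypothetical, and the function only parses a version string of the kind node --version prints (for example v20.11.1).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when a Node.js version string is 18 or later.
node_version_ok() {
  local version="${1#v}"        # strip the leading "v" from e.g. "v20.11.1"
  local major="${version%%.*}"  # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;    # reject empty or non-numeric input
  esac
  [ "$major" -ge 18 ]
}

# Usage:
#   node_version_ok "$(node --version)" || echo "Node.js 18+ required" >&2
```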
What is the difference between the CLI and the MCP server?
The CLI provides 26 commands you run directly in your terminal. The MCP server provides 81 tools that AI assistants like Claude, ChatGPT, Cursor, and Windsurf call during conversation. Both ship in the same npm package and share the same API key. Use the CLI for scripting, automation, and deterministic workflows. Use the MCP server when you want your AI assistant to orchestrate complex, multi-step tasks through natural language.
Can I use the CLI in scripts and automation?
Yes. Every command supports --json output for piping to other tools like jq, Python scripts, or CI/CD pipelines. You can use the CLI in bash scripts, cron jobs, build pipelines, and any automation workflow. Common patterns include batch uploading folders of recordings, scheduled reporting, and automated transcript exports.
What audio and video formats does the CLI support?
The CLI supports all major audio and video formats including MP3, MP4, WAV, M4A, FLAC, OGG, WebM, MOV, AVI, and MKV. You can upload local files or provide URLs. The platform handles format conversion and processing automatically. There is no need to pre-convert files before uploading.
Is the CLI free?
The CLI itself is free and open source under the MIT license. You need a Speak AI account to authenticate and use the commands. API access is available on all paid plans, and you get full access during the free 7-day trial with no credit card required. See pricing for plan details.
How do I authenticate?
Sign up at app.speakai.co and copy your API key from account settings. Run speakai-mcp config set-key or speakai-mcp init to store it locally. The key is saved in your user config directory and used for all subsequent commands. You can rotate your key at any time from account settings.
Start using Speak AI from your terminal
26 commands for transcription, NLP analysis, AI chat, and media management. Install in one command, script everything, pipe JSON output anywhere.
Try Speak Free
Create an account, grab your API key, and start running commands. Full access during the 7-day trial. No credit card required.
View Documentation
Full README with setup guide, command reference, and examples. Open source under MIT. Inspect the code, file issues, and contribute.
Transcribe, analyze, and search from your terminal
Join 250,000+ people and teams using Speak AI. Install the CLI and start running commands in under 2 minutes.





