Tell an agent.
Get a finished video.
VideoPilot is an open-source MCP server that gives any LLM 20 tools to author voiceover, cut highlights, compose timelines, and render finished MP4s — straight from your screen recordings.
pip install --user videopilot
Generated end-to-end by VideoPilot itself.
The whole pipeline, exposed as typed tools.
Each stage in VideoPilot is an MCP tool with a JSON-schema contract. That means agents can author voiceovers, draft cut plans, and compose timelines deterministically — no prompt-engineering the editor.
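To make "JSON-schema contract" concrete, here is a minimal sketch of what a tool descriptor and an argument check could look like. The `name`, `description`, and `inputSchema` keys follow the MCP tool shape, but the tool name `author_voiceover` and its fields are hypothetical, not VideoPilot's real schema, and the checker covers only required keys and basic types rather than full JSON Schema.

```python
# Hypothetical descriptor: the tool name and fields are illustrative only.
TOOL = {
    "name": "author_voiceover",
    "description": "Write voiceover text for one segment.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "voice": {"type": "string"},
        },
        "required": ["text"],
    },
}

# Only the JSON types this sketch needs.
_JSON_TYPES = {"string": str, "boolean": bool, "object": dict, "array": list}


def check_arguments(schema, arguments):
    """Return a list of problems; an empty list means the minimal check passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required field: {key}")
    for key, value in arguments.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            problems.append(f"unknown field: {key}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{key}: expected {spec['type']}")
    return problems
```

Because the contract is data, an agent can read it before calling and a server can reject malformed calls with a precise message instead of guessing at intent.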
Agent-driven by design
Wired for MCP. Any agent — GitHub Copilot CLI, Claude Desktop, Cursor — drives the whole pipeline through 20 typed tool calls.
"command": "uvx",
"args": ["--from", "videopilot", "videopilot-mcp"]
400+ neural voices
Free Microsoft Edge TTS by default across 100+ locales. Drop in an Azure key for premium neural voices.
Word-level transcription
Local faster-whisper produces precise word timings and SRT, ready for highlight selection or burn-in captions.
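As an illustration of how word timings turn into captions, the sketch below groups words into SRT cues. It assumes each word arrives as a `(text, start_seconds, end_seconds)` tuple and uses a fixed words-per-cue limit; both are assumptions for this example, not VideoPilot's actual data model or grouping rule.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group (text, start, end) word tuples into numbered SRT cues."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(word[0] for word in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

The same word list can drive highlight selection, since every word carries its own start and end time.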
ffmpeg under the hood
Filter graphs you don't have to author. Slides, picture-in-picture, ducking, and music underlay just compose.
Subpixel Ken Burns
Zoom and pan over still images, rendered with Lanczos oversampling so the motion stays buttery, not jittery.
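The "subpixel" part can be pictured as a crop rectangle whose position and size are kept as floats on every frame instead of being rounded to whole pixels. The sketch below covers only that camera path, using plain linear interpolation, which is an assumption; it does not perform the Lanczos resampling, and VideoPilot's real easing curve may differ.

```python
def ken_burns_rects(start, end, frames):
    """Interpolate a crop rectangle (x, y, w, h) from start to end.

    Coordinates stay fractional so a renderer can resample at subpixel
    positions; rounding here is what would cause visible stepping.
    """
    if frames < 2:
        raise ValueError("need at least two frames to animate")
    rects = []
    for i in range(frames):
        t = i / (frames - 1)  # 0.0 on the first frame, 1.0 on the last
        rects.append(tuple(a + (b - a) * t for a, b in zip(start, end)))
    return rects
```

For example, moving from a full 1920x1080 frame to a 960x540 window over three frames yields a midpoint rectangle of `(50.0, 25.0, 1440.0, 810.0)` when the end offset is `(100, 50)`.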
Hand off to any NLE
Export the same timeline as EDL (CMX 3600) and FCPXML — open it in Premiere, Resolve, or Final Cut.
Composable timeline
Voiceover segments, clips, slides, motion, music, and ducking all live in a single declarative compose-plan.json.
Idempotent re-runs
Probe whether each stage's outputs are stale, then regenerate only what changed. CI-friendly.
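One simple way to probe staleness is to compare file modification times, as in the sketch below. This is an assumption about how such a check could work; VideoPilot may use content hashes or another signal, and the function name `is_stale` is hypothetical.

```python
from pathlib import Path


def is_stale(inputs, outputs):
    """Return True when a stage should be regenerated.

    A stage is stale if any output is missing, or if any input was
    modified more recently than the oldest output.
    """
    outs = [Path(p) for p in outputs]
    if not outs or any(not p.exists() for p in outs):
        return True
    oldest_output = min(p.stat().st_mtime_ns for p in outs)
    newest_input = max((Path(p).stat().st_mtime_ns for p in inputs), default=0)
    return newest_input > oldest_output
```

In CI, a check like this lets a job skip TTS or rendering when only a later stage's plan has changed.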
Six stages. One pipeline.
Each stage reads and writes a JSON state file. Agents author the state, VideoPilot does the rendering — and any stage can be re-run on its own.
1. Script: script.json
2. TTS: voiceover MP3s
3. Cut: cut-plan.json
4. Compose: compose-plan.json
5. Final: final.mp4
6. Export: EDL · FCPXML
20 tools for the calling LLM.
The Model Context Protocol is the open standard for connecting LLM clients to external tools. Wire VideoPilot in once and every MCP-aware agent gets the same 20 tools.
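Under the hood, MCP messages are JSON-RPC 2.0. The sketch below builds the two requests a client sends to discover and invoke tools. The `tools/list` and `tools/call` methods come from the MCP specification; the tool name `example_tool` and its arguments are placeholders, not real VideoPilot tools. A real session also begins with an `initialize` handshake, which MCP clients normally handle for you.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as used by MCP."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Invoke one tool by name; name and arguments here are placeholders.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {"name": "example_tool", "arguments": {"path": "recording.mp4"}},
)
```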
Two minutes to your first MCP call.
Install once from PyPI, or wire VideoPilot straight into your MCP client with uvx — no global install needed.
Installs the videopilot CLI and the videopilot-mcp server entry point. You'll also need ffmpeg on PATH.
{
  "mcpServers": {
    "videopilot": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "videopilot", "videopilot-mcp"],
      "tools": ["*"]
    }
  }
}
uvx fetches videopilot from PyPI into an ephemeral env and runs the MCP server over stdio. Restart your MCP client and the 20 tools appear.
$ videopilot doctor
Checking environment...
[OK] ffmpeg 7.1
[OK] ffprobe 7.1
[OK] edge-tts ready
[OK] whisper base
[skip] azure no key (optional)
All required checks passed.
videopilot doctor exits 0 when every required dep is in place, and prints exactly what's missing otherwise.
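Since the exit code is the contract, a script or CI job can gate on it. The sketch below runs a command and reports whether it exited 0. The default command is `videopilot doctor` as described above; the helper name `required_checks_pass` is hypothetical, and the sketch assumes nothing about the output format beyond passing it through.

```python
import subprocess


def required_checks_pass(command=("videopilot", "doctor")):
    """Run a command and return (passed, output); passed means exit code 0."""
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except FileNotFoundError:
        # The executable itself is not installed or not on PATH.
        return False, "command not found"
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    passed, output = required_checks_pass()
    print(output)
    raise SystemExit(0 if passed else 1)
```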