MCP · 20 tools · MIT

Tell an agent. Get a finished video.

VideoPilot is an open-source MCP server that gives any LLM 20 tools to author voiceover, cut highlights, compose timelines, and render finished MP4s — straight from your screen recordings.

PyPI: pip install --user videopilot

Generated end-to-end by VideoPilot itself.

Features

The whole pipeline, exposed as typed tools.

Each stage in VideoPilot is an MCP tool with a JSON-schema contract. That means agents can author voiceovers, draft cut plans, and compose timelines deterministically — no prompt-engineering the editor.

Agent-driven by design

Wired for MCP. Any agent — GitHub Copilot CLI, Claude Desktop, Cursor — drives the whole pipeline through 20 typed tool calls.

"command": "uvx",
"args": ["--from", "videopilot",
         "videopilot-mcp"]
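
The fragment above is only the launch command for the server. As a rough sketch of where it might sit in a client config, the Python below wraps it in a full entry. It assumes a top-level "mcpServers" key (the convention Claude Desktop-style configs use; other clients may differ) and a server name of "videopilot", which is just a label chosen for this example.

```python
import json


def videopilot_server_entry() -> dict:
    """Return the launch command shown on this page as a dict."""
    return {
        "command": "uvx",
        "args": ["--from", "videopilot", "videopilot-mcp"],
    }


def build_client_config(server_name: str = "videopilot") -> dict:
    """Wrap the launch command in a client config document.

    ASSUMPTION: "mcpServers" is the top-level key your client expects,
    and `server_name` is an arbitrary label. Check your client's docs.
    """
    return {"mcpServers": {server_name: videopilot_server_entry()}}


if __name__ == "__main__":
    # Print JSON you could paste into the client's config file.
    print(json.dumps(build_client_config(), indent=2))
```

Generating the entry from code rather than by hand keeps the command and args identical across every client you wire up.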

400+ neural voices

Free Microsoft Edge TTS by default across 100+ locales. Drop in an Azure key for premium neural voices.

Word-level transcription

Local faster-whisper produces precise word timings and SRT, ready for highlight selection or burn-in captions.
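
To make the link between word timings and SRT concrete, here is a minimal sketch that groups timed words into SRT cues. It is not VideoPilot's transcriber: the (text, start, end) tuple shape and the max_words grouping rule are assumptions made for illustration. Only the SRT timestamp layout (HH:MM:SS,mmm) is standard.

```python
from typing import List, Tuple

# ASSUMPTION: each word is (text, start_seconds, end_seconds).
Word = Tuple[str, float, float]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group consecutive words into numbered SRT cues of up to max_words."""
    cues = []
    for index, start in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[start:start + max_words]
        text = " ".join(word[0] for word in chunk)
        # A cue spans from its first word's start to its last word's end.
        span = f"{srt_timestamp(chunk[0][1])} --> {srt_timestamp(chunk[-1][2])}"
        cues.append(f"{index}\n{span}\n{text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    print(words_to_srt([("Hello", 0.0, 0.4), ("world", 0.5, 1.0)]))
```

Because every cue boundary comes from a word boundary, the same timings can drive highlight selection without re-transcribing.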

ffmpeg under the hood

Filter graphs you don't have to author. Slides, picture-in-picture, ducking, and music underlay just compose.

Subpixel Ken Burns

Zoom and pan over still images, rendered with Lanczos oversampling so the motion stays buttery, not jittery.

Hand off to any NLE

Export the same timeline as EDL (CMX 3600) and FCPXML — open it in Premiere, Resolve, or Final Cut.
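
CMX 3600 EDLs express edit points as HH:MM:SS:FF timecodes, so any exporter has to turn timeline seconds into frame-accurate timecode. The sketch below shows that conversion for non-drop-frame timecode at an integer frame rate. It is an illustration only, not VideoPilot's exporter, and the default of 30 fps is an assumption.

```python
def seconds_to_timecode(seconds: float, fps: int = 30) -> str:
    """Convert seconds to non-drop-frame HH:MM:SS:FF timecode.

    ASSUMPTION: integer frame rate. Drop-frame rates such as 29.97
    need extra handling that this sketch does not attempt.
    """
    if seconds < 0 or fps <= 0:
        raise ValueError("seconds must be >= 0 and fps must be > 0")
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, rem = divmod(total_seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


if __name__ == "__main__":
    # One hour, one minute, one and a half seconds at 30 fps.
    print(seconds_to_timecode(3661.5, fps=30))
```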

Composable timeline

Voiceover segments, clips, slides, motion, music, and ducking all live in a single declarative compose-plan.json.

Idempotent re-runs

Probe whether each stage's outputs are stale, then regenerate only what changed. CI-friendly.
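
One common way to implement this kind of staleness probe is to compare file modification times, the way make does. The sketch below takes that approach. The page does not say how VideoPilot decides staleness, so treat the mtime rule and the is_stale name as assumptions.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def is_stale(inputs: Iterable[Path], outputs: Iterable[Path]) -> bool:
    """Return True if any output is missing or older than the newest input.

    ASSUMPTION: mtime comparison stands in for whatever check the
    real tool performs (it could just as well hash file contents).
    """
    outputs = list(outputs)
    if not outputs or any(not path.exists() for path in outputs):
        return True
    oldest_output = min(path.stat().st_mtime for path in outputs)
    newest_input = max(
        (path.stat().st_mtime for path in inputs if path.exists()),
        default=0.0,
    )
    return newest_input > oldest_output


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "script.json"
        rendered = Path(tmp) / "final.mp4"
        source.write_text("{}")
        rendered.write_text("placeholder")
        os.utime(source, (100, 100))    # input written first
        os.utime(rendered, (200, 200))  # output written later
        print(is_stale([source], [rendered]))
```

In CI, a check like this lets a job skip rendering when nothing upstream has changed.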

Pipeline

Six stages. One pipeline.

Each stage reads and writes a JSON state file. Agents author the state, VideoPilot does the rendering — and any stage can be re-run on its own.

  1. Script
    script.json
  2. TTS
    voiceover MP3s
  3. Cut
    cut-plan.json
  4. Compose
    compose-plan.json
  5. Final
    final.mp4
  6. Export
    EDL · FCPXML
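
The six stages above can be written down as plain data, which makes it easy to work out what to re-run after a change. This is a sketch only. The lowercase stage ids are labels invented here, and treating every later stage as dependent on the changed one is an assumption about a linear pipeline, not documented VideoPilot behavior.

```python
from typing import List, Tuple

# Stage order and artifacts as listed on this page.
# ASSUMPTION: the lowercase ids are labels for this sketch only.
PIPELINE: List[Tuple[str, str]] = [
    ("script", "script.json"),
    ("tts", "voiceover MP3s"),
    ("cut", "cut-plan.json"),
    ("compose", "compose-plan.json"),
    ("final", "final.mp4"),
    ("export", "EDL, FCPXML"),
]


def stages_from(stage: str) -> List[str]:
    """Return the given stage and every stage after it, in order."""
    names = [name for name, _ in PIPELINE]
    if stage not in names:
        raise ValueError(f"unknown stage: {stage!r}")
    return names[names.index(stage):]


if __name__ == "__main__":
    # Editing the cut plan would touch these stages under the assumption above.
    print(stages_from("cut"))
```
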
MCP tools

20 tools for the calling LLM.

doctor · voices · list_projects · project_status · init · import_source · read_state · write_state · tts · transcribe · silence · cut · compose · export · schema · add_vo_segment · add_slide · set_compose_output · preview_slide · is_up_to_date

The Model Context Protocol is the open standard for connecting LLM clients to external tools. Wire VideoPilot in once and every MCP-aware agent gets the same 20 tools.

Install

Two minutes to your first MCP call.

Install once from PyPI, or wire VideoPilot straight into your MCP client with uvx — no global install needed.

$ pip install --user videopilot

Installs the videopilot CLI and the videopilot-mcp server entry point. You'll also need ffmpeg on PATH.

Verify
$ videopilot doctor
Checking environment...
[OK] ffmpeg 7.1
[OK] ffprobe 7.1
[OK] edge-tts ready
[OK] whisper base
[skip] azure no key (optional)
All required checks passed.

videopilot doctor exits 0 when every required dependency is in place; otherwise it prints exactly what's missing.
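
Because the check reports its result through the exit code, a script can gate on it before starting a render. The sketch below wraps that idea in a small helper. The deps_ok name is made up for this example, and the only behavior it relies on is the exit code described above.

```python
import subprocess
import sys
from typing import Sequence


def deps_ok(cmd: Sequence[str] = ("videopilot", "doctor")) -> bool:
    """Run a check command and report whether it exited with code 0.

    A missing executable counts as a failed check rather than an error.
    """
    try:
        result = subprocess.run(list(cmd), capture_output=True, text=True)
    except FileNotFoundError:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    status = "ready" if deps_ok() else "missing dependencies"
    print(f"videopilot environment: {status}", file=sys.stderr)
```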