I Built an Open Source MCP Server to Unlock YouTube for AI Agents

SYS.BLOG

An open-source MCP server that lets Claude and other AI agents analyze YouTube videos using Google's Gemini API — summaries, Q&A, transcripts, and frame extraction, no downloading required.

Aditya Bawankule
MCP · TypeScript · Gemini · Open Source · AI Agents

I just built and open sourced an MCP server that lets Claude and other AI agents analyze YouTube videos using Google's Gemini API. YouTube holds an enormous amount of knowledge — tutorials, talks, interviews, walkthroughs — most of it inaccessible to AI agents that can only read text. This server bridges that gap.


What It Does

Pass any YouTube URL and you can:

  • Summarize videos — brief, medium, or detailed with timestamps
  • Ask specific questions about the video content
  • Extract screenshots and frames at specific moments
  • Get a full transcript for any video
  • Search within a video for specific topics or moments

It works with both Claude Code and Claude Desktop — anywhere that supports MCP.
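For Claude Desktop, wiring it in looks like any other MCP server entry in `claude_desktop_config.json`. The package name and env var below are placeholders — check the repo README for the actual install command:

```json
{
  "mcpServers": {
    "youtube": {
      "command": "npx",
      "args": ["-y", "<package-name-from-the-repo>"],
      "env": { "GEMINI_API_KEY": "your-key-here" }
    }
  }
}
```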


The Interesting Technical Detail

Gemini can analyze YouTube URLs directly. No downloading. No transcription APIs. No chunking video into audio files and piping them through Whisper. You just pass the URL and Gemini handles the rest.

Most other AI models can't do this — they're text-only and have no native way to consume video content. This MCP server essentially proxies that capability, making it available to Claude and any other MCP-compatible agent.

The result is that Claude Code can now answer questions like "what does this conference talk say about distributed systems?" or "summarize this tutorial and give me the key steps" — without you having to copy-paste a transcript first.
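Under the hood, the trick comes down to a single request: the YouTube URL rides along as a `file_data` part next to the text prompt. Here's a minimal sketch against the Gemini REST API — the model name and endpoint version are my assumptions, and the actual server may use the official SDK instead of raw `fetch`:

```typescript
// Sketch: ask Gemini a question about a YouTube video by URL.
// Model name and API version are assumptions; adjust to match the repo.
const ENDPOINT =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent";

// Build the request body: the YouTube URL goes in a file_data part,
// so there's no download or transcription step.
function buildRequest(videoUrl: string, question: string) {
  return {
    contents: [
      {
        parts: [
          { file_data: { file_uri: videoUrl } },
          { text: question },
        ],
      },
    ],
  };
}

async function askGemini(
  videoUrl: string,
  question: string,
  apiKey: string
): Promise<string> {
  const res = await fetch(`${ENDPOINT}?key=${apiKey}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(videoUrl, question)),
  });
  if (!res.ok) throw new Error(`Gemini API error: ${res.status}`);
  const data = await res.json();
  // Pull the first candidate's text out of the response.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
}
```

Every tool the server exposes can then be a thin wrapper around a call like this, swapping in a different prompt ("Give me a full transcript with timestamps", "List the moments where the speaker discusses X").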


Tech Stack

  • TypeScript — the whole server is typed end to end
  • Anthropic's Model Context Protocol (MCP) — the standard for giving AI agents access to external tools
  • Google Gemini API — handles the multimodal video understanding

Get It

The code is on GitHub. PRs and issues welcome — if you find it useful or have ideas for what else it should support, I'd love to hear from you.