YouTube Transcript Downloader
Extract clean, readable transcripts from YouTube videos in seconds. Perfect for studying tutorials, converting talks into notes, or creating written content from video sources.
Essential for learners, researchers, and content creators who want to reference video content in text form without manual transcription.
Core Functionality
Two-mode operation:
- Subtitle Download (fast) - Grabs existing captions
- Whisper Transcription (slow but comprehensive) - Generates transcript from audio
Automatic deduplication removes the repetitive lines common in VTT subtitle formatting.
How It Works
Intelligent Fallback System
Primary Method:
- Downloads official subtitles via yt-dlp
- Checks for multiple language options
- Preserves timing and structure
- Lightning fast (seconds)
Fallback Method:
- Downloads audio with yt-dlp
- Transcribes with OpenAI Whisper
- Works on any video (even without captions)
- Slower (minutes depending on video length)
Post-Processing:
- Removes duplicate lines from VTT format
- Cleans up formatting artifacts
- Creates readable paragraph structure
- Names file using video title
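The fallback order described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: get_transcript, fetch_subtitles and transcribe_audio are hypothetical names, and the two callables stand in for the yt-dlp and Whisper steps.

```python
from typing import Callable, Optional, Tuple


def get_transcript(
    url: str,
    fetch_subtitles: Callable[[str], Optional[str]],  # hypothetical: wraps the yt-dlp subtitle step
    transcribe_audio: Callable[[str], str],           # hypothetical: wraps the Whisper step
) -> Tuple[str, str]:
    """Return (method, text), trying existing captions before Whisper."""
    text = fetch_subtitles(url)
    if text and text.strip():
        return "subtitles", text
    # No usable captions: fall back to the slower audio transcription.
    return "whisper", transcribe_audio(url)
```

Passing the two steps in as functions keeps the decision logic separate from the tools, so the slow path only runs when the fast one returns nothing.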
Usage Examples
Basic download:
Download the transcript for https://youtube.com/watch?v=dQw4w9WgXcQ
Specify language:
Get the Spanish transcript from this video: [URL]
Batch processing:
Download transcripts for these tutorial videos:
- https://youtube.com/watch?v=tutorial1
- https://youtube.com/watch?v=tutorial2
- https://youtube.com/watch?v=tutorial3
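For batch runs it can help to drop repeated links before downloading anything. The sketch below is one possible approach using only the standard library; video_id and unique_videos are hypothetical helper names, and the handling of youtu.be short links is an assumption that goes beyond the URLs shown above.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def video_id(url: str) -> Optional[str]:
    """Extract the video ID from a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def unique_videos(urls: List[str]) -> List[str]:
    """Keep the first URL seen for each video ID, preserving order."""
    seen = set()
    result = []
    for url in urls:
        vid = video_id(url)
        if vid and vid not in seen:
            seen.add(vid)
            result.append(url)
    return result
```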
Output Format
Generated filename: [Video Title] - Transcript.txt
Content structure:
[Clean paragraph-style transcript with duplicate lines removed]
Natural reading flow without VTT timestamp artifacts.
Preserves sentence structure and logical breaks.
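Video titles often contain characters that are not valid in filenames, so the naming step needs some cleanup. The following is a rough sketch of how the "[Video Title] - Transcript.txt" pattern could be produced; transcript_filename is a hypothetical helper, and the set of replaced characters and the "Untitled" fallback are assumptions.

```python
import re


def transcript_filename(title: str) -> str:
    """Build a '<title> - Transcript.txt' name that is safe on common filesystems."""
    # Replace characters that Windows, macOS or Linux reject in filenames.
    safe = re.sub(r'[\\/:*?"<>|]', "_", title)
    # Collapse runs of whitespace and trim the ends.
    safe = re.sub(r"\s+", " ", safe).strip()
    return f"{safe or 'Untitled'} - Transcript.txt"
```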
Installation
Required (core functionality):
# macOS
brew install yt-dlp
# Ubuntu/Debian
sudo apt install yt-dlp
# pip (cross-platform)
pip3 install yt-dlp
Optional (for videos without subtitles):
pip3 install openai-whisper
Whisper Model Sizes:
- tiny - 1GB, fast but less accurate
- base - 1GB, good balance
- small - 2GB, better accuracy
- medium - 5GB, high quality
- large - 10GB, best quality (slow)
For most use cases, base or small models provide excellent results.
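The size list above can be turned into a simple selection rule. This sketch assumes the listed figures are approximate memory requirements and picks the largest model that fits, capped at small by default to match the recommendation; pick_model and its cap parameter are hypothetical.

```python
from typing import List, Tuple

# (model name, approximate memory in GB), taken from the list above.
MODELS: List[Tuple[str, float]] = [
    ("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10),
]


def pick_model(available_gb: float, cap: str = "small") -> str:
    """Return the largest model that fits in memory, never going past `cap`."""
    if cap not in [name for name, _ in MODELS]:
        raise ValueError(f"unknown model: {cap}")
    chosen = None
    for name, needed_gb in MODELS:
        if needed_gb <= available_gb:
            chosen = name
        if name == cap:
            break
    if chosen is None:
        raise ValueError("not enough memory for any model")
    return chosen
```

Raising the cap (for example cap="large") only makes sense when quality matters more than speed.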
Technical Details
Subtitle Detection:
yt-dlp --list-subs [URL]
Shows available caption languages before download.
Download Process:
yt-dlp --write-sub --write-auto-sub --skip-download [URL]
Grabs auto-generated or manual subtitles without downloading video.
Whisper Fallback:
yt-dlp -x --audio-format mp3 -o "audio.%(ext)s" [URL]
whisper audio.mp3 --model base
Extracts audio and generates transcript using AI.
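When these commands are driven from a script, building them as argument lists avoids shell quoting problems with URLs. The sketch below only constructs the commands; subtitle_command and whisper_commands are hypothetical helpers, and the -o output template is an addition so that the audio file has a predictable name for Whisper to read.

```python
from typing import List, Optional


def subtitle_command(url: str, lang: Optional[str] = None) -> List[str]:
    """yt-dlp call that fetches manual or auto-generated subtitles only."""
    cmd = ["yt-dlp", "--write-sub", "--write-auto-sub", "--skip-download"]
    if lang:
        cmd += ["--sub-lang", lang]
    return cmd + [url]


def whisper_commands(url: str, model: str = "base", stem: str = "audio") -> List[List[str]]:
    """Two steps: extract audio as MP3, then transcribe it with Whisper."""
    return [
        # -o fixes the output name so the second step knows which file to read.
        ["yt-dlp", "-x", "--audio-format", "mp3", "-o", f"{stem}.%(ext)s", url],
        ["whisper", f"{stem}.mp3", "--model", model],
    ]
```

Each list can be passed to subprocess.run(cmd, check=True), which raises an error if the tool exits with a failure code.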
Deduplication Algorithm
Problem: VTT subtitle files, especially YouTube's auto-generated captions, often repeat the same line across consecutive cues
00:00:01 --> 00:00:03
Welcome to this tutorial
00:00:01 --> 00:00:03
Welcome to this tutorial
00:00:03 --> 00:00:05
Today we'll learn about React
Solution: Python script removes consecutive duplicates
Welcome to this tutorial
Today we'll learn about React
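A minimal sketch of this cleanup is shown below. It is not the skill's actual script: vtt_to_lines is a hypothetical name, and the code assumes that everything before the first timing line is header material and that inline markup such as <c> tags should be dropped.

```python
import re
from typing import List

TAG = re.compile(r"<[^>]+>")  # inline VTT markup such as <c> or word-level timestamps

SAMPLE = """WEBVTT

00:00:01.000 --> 00:00:03.000
Welcome to this tutorial

00:00:01.000 --> 00:00:03.000
Welcome to this tutorial

00:00:03.000 --> 00:00:05.000
Today we'll learn about React
"""


def vtt_to_lines(vtt_text: str) -> List[str]:
    """Return caption text with timing lines and consecutive duplicates removed."""
    lines: List[str] = []
    seen_cue = False
    for raw in vtt_text.splitlines():
        if "-->" in raw:      # timing line: marks the start of a cue
            seen_cue = True
            continue
        if not seen_cue:      # still inside the file header
            continue
        line = TAG.sub("", raw).strip()
        if not line:
            continue
        if lines and lines[-1] == line:  # same text as the previous kept line
            continue
        lines.append(line)
    return lines
```

Only consecutive repeats are removed, so a phrase the speaker genuinely says again later in the video is kept.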
Best Practices
Do:
- Use subtitle download when available (faster)
- Choose appropriate Whisper model for speed/quality tradeoff
- Review transcript for technical terms (AI may misinterpret jargon)
- Respect video creator's copyright
Don't:
- Download transcripts from copyrighted content for commercial redistribution
- Assume 100% accuracy (especially with auto-generated subtitles)
- Use largest Whisper model unless quality is critical (very slow)
- Skip checking if subtitles already exist before using Whisper
Common Use Cases
For Students:
- Tutorial videos → Study notes
- Lecture recordings → Reference material
- Conference talks → Written summaries
For Researchers:
- Interview videos → Analysis data
- Documentary content → Citations
- Expert talks → Quote extraction
For Content Creators:
- Competitor analysis → Written breakdowns
- Video scripts → Blog post foundations
- Podcast episodes → Show notes
Language Support
Subtitle download supports:
- All languages offered by YouTube creators
- Auto-generated captions in major languages
- Manual captions in creator-specified languages
Whisper transcription supports:
- 99+ languages with varying accuracy
- Best performance on English
- Automatic language detection
- Translation into English available with --task translate
Integration with Tapestry
Complete Learning Workflow
YouTube Transcript is part of the Tapestry ecosystem:
Standalone usage: Extract transcripts for any purpose
With Tapestry orchestration:
- YouTube Transcript - Get clean text ← You are here
- Ship-Learn-Next - Convert to action plan
- Ship - Build something concrete
One command for full workflow:
tapestry https://youtube.com/watch?v=example
Performance
Subtitle download:
- Speed: 5-15 seconds (network dependent)
- Accuracy: Varies (auto-captions vs. manual)
- Cost: Free
Whisper transcription:
- Speed:
  - tiny model: 2-5x real-time
  - base model: 1-2x real-time
  - large model: 0.5x real-time (slower than video)
- Accuracy: Excellent (especially medium+)
- Cost: Free (runs locally)
Recommendation: Try subtitles first, use Whisper only when needed.
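To make the speed figures concrete, the sketch below converts them into a rough processing-time range. It assumes "Nx real-time" means the audio is processed N times faster than it plays; estimate_seconds is a hypothetical helper, and actual speed depends heavily on hardware.

```python
from typing import Dict, Tuple

# (slowest, fastest) speed factor relative to playback, from the figures above.
SPEED_FACTORS: Dict[str, Tuple[float, float]] = {
    "tiny": (2.0, 5.0),
    "base": (1.0, 2.0),
    "large": (0.5, 0.5),
}


def estimate_seconds(video_seconds: float, model: str) -> Tuple[float, float]:
    """Return (best case, worst case) transcription time in seconds."""
    slowest, fastest = SPEED_FACTORS[model]
    return video_seconds / fastest, video_seconds / slowest
```

Under these assumptions a 10-minute video (600 seconds) would take roughly 5 to 10 minutes with the base model and about 20 minutes with the large model.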
Troubleshooting
"No subtitles available" error:
- Check yt-dlp --list-subs [URL] to verify
- Install Whisper for fallback transcription
- Some videos genuinely have no captions
Whisper fails with memory error:
- Use smaller model (tiny or base)
- Close other applications
- Process shorter video segments
Duplicate lines not removed:
- Check Python is installed (needed for deduplication script)
- Manually clean with find/replace in text editor
Wrong language downloaded:
- Specify language: yt-dlp --write-sub --sub-lang es [URL]
- List available languages: yt-dlp --list-subs [URL]
Advanced Options
Download specific subtitle format:
yt-dlp --write-sub --sub-format vtt/srt [URL]
Translate to English:
whisper audio.mp3 --model base --task translate
Keep timestamps:
yt-dlp --write-sub --skip-download [URL]
# Use raw VTT file with timestamps preserved
About This Skill
This skill was created by michalparkola as part of the Tapestry Skills for Claude Code collection.
Philosophy: Transform passive video watching into active learning by extracting transcripts that can be turned into action plans, study guides, or reference materials.
Tools used: yt-dlp (subtitle download), OpenAI Whisper (transcription), Python (text processing)