Transcription, subtitles, podcast workflows, chaptering, localization, loudness cleanup, and final-mile publishing prep.
- Live page: https://agentskillexchange.com/industry-skills/#media-publishing-systems
- Homepage access: Curated Collections on https://agentskillexchange.com/
- Podcast, video, course, and newsroom teams that need repeatable post-production handoffs.
- Operations leads who want transcripts, subtitles, chapters, and loudness checks before content ships.
- Transcribe long-form audio and video into reviewable text.
- Align subtitles and narration timing after edits.
- Normalize audio loudness and prepare publishing metadata.
- Create searchable media archives for downstream agents.
- Podcast release packet: Transcribe → chapter → summarize → normalize loudness → publish feed notes
- Subtitle repair: Extract transcript → force-align timing → correct drift → export captions
- Source-to-repurpose pipeline: Acquire source media → extract transcript → chapter clips → normalize audio → prepare republishing notes
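
To make the drift-correction step of the subtitle repair workflow concrete, here is a minimal Python sketch that shifts every cue in an SRT file by a known offset. It assumes the offset has already been measured and is constant across the file; real drift is often non-constant, which is the harder case the sync and alignment skills in this collection are meant to handle. The function names `shift_timestamp` and `shift_srt` are hypothetical and do not come from any listed tool.

```python
import re

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return one timestamp moved by offset_ms, clamped so it never goes below zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift only the timing lines (those containing '-->'); cue text is left untouched."""
    out = []
    for line in text.splitlines():
        if "-->" in line:
            line = TIMESTAMP.sub(lambda mt: shift_timestamp(mt, offset_ms), line)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:02,500\nHello"
    print(shift_srt(sample, 1500))
```

Restricting the substitution to timing lines keeps any timestamp-like text inside a caption from being rewritten, so an editor can diff the output against the original and see only timing changes.
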
| Skill | What it does here | Persona | Install effort | Stars |
|---|---|---|---|---|
| Podcast RSS Feed Transcriber | Starts a release workflow from the source feed by fetching episodes and producing transcript artifacts. | Podcast producer / content ops | Medium | 97.4k |
| Realign drifting subtitles against finished video audio | Fixes subtitle drift after final video edits without manually retiming every cue. | Video editor / localization ops | Medium | 504 |
| Force-align narration and transcript text into subtitle or SMIL timing maps | Turns existing narration plus script text into timed caption or SMIL maps for accessible publishing. | Course producer / accessibility lead | Medium | 2.8k |
| ffsubsync Subtitle Synchronization Tool | Gives teams a fast, low-friction subtitle sync step before uploading captions. | Video producer / publishing ops | Low | 7.7k |
| Normalize loudness across podcast, lesson, or video batches before publishing | Prevents inconsistent perceived volume across a batch of episodes, lessons, or social clips. | Audio editor / content QA | Low | 1.5k |
| WhisperX Speech Recognition with Word-Level Timestamps and Diarization | Adds word timestamps and speaker diarization for searchable transcripts, clips, and chaptering. | Media engineer / transcript ops | High | 21k |
| Capture YouTube transcripts without browser automation using YouTube Transcript API | Fetches available YouTube captions without brittle browser automation or video-page scraping. | Researcher / content repurposing lead | Low | 7.4k |
| faster-whisper High-Performance Speech Transcription Engine | Handles high-volume transcription runs where speed and local control matter more than a hosted API. | Media automation engineer | Medium | 21.9k |
| Deepgram Real-Time Transcription Connector | Covers live captions and streaming call transcription where batch Whisper-style workflows would deliver results too late. | Live production / call intelligence engineer | High | 260 |
| AssemblyAI Summarization & Chapters Skill | Converts transcripts into chapter summaries for show notes, lesson outlines, and archive navigation. | Podcast producer / editorial ops | Medium | — |
| YouTube Chapters Generator with Whisper | Turns YouTube source audio into timestamped chapters so editors can build navigable videos, show notes, and repurposing briefs. | Video producer / publishing ops | Medium | 97.8k |
| Summarize URLs, files, podcasts, and YouTube sources into agent-ready briefs with Summarize | Creates compact briefs from URLs, files, podcasts, and YouTube sources before downstream scripting, clipping, or editorial review. | Content strategist / research producer | Medium | 5.6k |
| Self-host an OpenAI-compatible speech API for local transcription, translation, and TTS with Speaches | Lets teams keep OpenAI-compatible speech workflows while self-hosting audio processing. | Platform engineer / privacy-conscious media team | High | 3.2k |
| yt-dlp Feature-Rich Audio and Video Downloader CLI | Adds the standard audio/video source acquisition layer before transcription, captioning, chaptering, or repurposing workflows. | Media ops / content producer | Medium | 154.3k |
| Analyze videos with frame extraction and audio context in Claude Code | Extracts frames and audio context from videos so editors and researchers can build reviewable media summaries before publishing. | Video producer / media researcher | Medium | 698 |
| Whisper Subtitle Generator | Creates subtitle files from speech audio for teams that need caption artifacts before video or course release. | Video editor / accessibility ops | Medium | 97.8k |
| Whishper Self-Hosted Speech-to-Text and Audio Workflow Skill | Adds a self-hosted speech-to-text workflow for privacy-conscious media teams handling interviews, lessons, or internal recordings. | Media platform engineer / transcript ops | High | 3k |
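
As a rough illustration of what a batch loudness check involves, the Python sketch below plans a per-file gain from loudness values that are assumed to have been measured already. The measurement itself is out of scope here, and the -16 LUFS target, the 1 LU tolerance, and the name `plan_gains` are assumptions chosen for illustration, not settings taken from any skill in the table.

```python
TARGET_LUFS = -16.0   # assumed target; pick the value your platform requires
TOLERANCE_LU = 1.0    # assumed tolerance before a file is flagged


def plan_gains(measured, target=TARGET_LUFS, tolerance=TOLERANCE_LU):
    """Map each file to the gain (in dB) that would bring it to the target loudness.

    `measured` maps a file name to its already-measured integrated loudness in LUFS.
    """
    plan = {}
    for name, lufs in measured.items():
        gain = round(target - lufs, 2)
        plan[name] = {
            "gain_db": gain,
            # Flag only files that sit outside the tolerance window.
            "needs_adjustment": abs(gain) > tolerance,
        }
    return plan


if __name__ == "__main__":
    batch = {"episode-01.wav": -20.0, "episode-02.wav": -15.5}
    for name, entry in plan_gains(batch).items():
        print(name, entry)
```

The sketch only produces a plan an editor can review. Applying a positive gain can push peaks into clipping, so a real normalizer would also check peak levels before writing any audio.
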
- Prioritize tools that create durable artifacts an editor can inspect.
- Gate automated publishing behind human review of final titles, clips, and claims.
- Prefer workflow-shaped entries over bare libraries when possible.
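
One way to picture "durable artifacts" plus a human review gate is a small release manifest, sketched below in Python. Everything in it is hypothetical: the `ReleasePacket` class, the list of required artifacts, and the file paths are illustrative assumptions rather than a format any listed skill produces.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional

# Assumed set of artifacts a podcast release packet should contain.
REQUIRED_ARTIFACTS = ("transcript", "chapters", "summary", "loudness_report")


@dataclass
class ReleasePacket:
    episode_id: str
    artifacts: dict = field(default_factory=dict)  # artifact name -> file path
    approved_by: Optional[str] = None              # set only by a human reviewer

    def missing(self):
        """List required artifacts that have not been attached yet."""
        return [name for name in REQUIRED_ARTIFACTS if name not in self.artifacts]

    def ready_to_publish(self):
        """Publishing is allowed only when nothing is missing and a reviewer signed off."""
        return not self.missing() and self.approved_by is not None

    def to_json(self):
        """Serialize the packet so it can be stored next to the media files."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    packet = ReleasePacket("ep-001")
    for name in REQUIRED_ARTIFACTS:
        packet.artifacts[name] = f"out/ep-001/{name}.json"  # hypothetical paths
    print(packet.ready_to_publish())  # still blocked until a reviewer approves
    packet.approved_by = "editor"
    print(packet.to_json())
```

Keeping the approval field separate from the artifact list means an automated pipeline can fill in every artifact and still be unable to publish on its own.
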