1 change: 1 addition & 0 deletions README.md
@@ -16,6 +16,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
| `ios-application-dev` | iOS application development guide covering UIKit, SnapKit, and SwiftUI. Touch targets, safe areas, navigation patterns, Dynamic Type, Dark Mode, accessibility, collection views, and Apple HIG compliance. |
| `shader-dev` | Comprehensive GLSL shader techniques for creating stunning visual effects — ray marching, SDF modeling, fluid simulation, particle systems, procedural generation, lighting, post-processing, and more. ShaderToy-compatible. |
| `gif-sticker-maker` | Convert photos (people, pets, objects, logos) into 4 animated GIF stickers with captions. Funko Pop / Pop Mart style, powered by MiniMax Image & Video Generation API. |
| `podcast-creator` | Convert text scripts into podcast episodes with narration and music. Supports plain text, Markdown, or structured JSON input. Generates narration via MiniMax TTS API and intro/outro music via MiniMax Music API, then assembles with ffmpeg. |
| `minimax-pdf` | Generate, fill, and reformat PDF documents with a token-based design system. CREATE polished PDFs from scratch (15 cover styles), FILL existing form fields, or REFORMAT documents into a new design. Print-ready output with typography and color derived from document type. |
| `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. |
| `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Covers creating new xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation, validation, and professional financial formatting. |
1 change: 1 addition & 0 deletions README_zh.md
@@ -16,6 +16,7 @@
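| `ios-application-dev` | iOS application development guide covering UIKit, SnapKit, and SwiftUI. Touch targets, safe areas, navigation patterns, Dynamic Type, Dark Mode, accessibility, and collection views, in line with the Apple HIG. |
| `shader-dev` | Comprehensive GLSL shader techniques for creating stunning visual effects: ray marching, SDF modeling, fluid simulation, particle systems, procedural generation, lighting, post-processing, and more. ShaderToy-compatible. |
| `gif-sticker-maker` | Convert photos (people, pets, objects, logos) into 4 animated GIF stickers with captions. Funko Pop / Pop Mart blind-box style, based on the MiniMax Image & Video Generation API. |
| `podcast-creator` | Convert text scripts into podcast episodes. Supports plain text, Markdown, or JSON input. Generates narration via the MiniMax TTS API, creates intro/outro music via the MiniMax Music Generation API, and assembles the final audio with ffmpeg. |
| `minimax-pdf` | Generate, fill, and reformat PDF documents with a token-based design system. Supports three modes: CREATE (generate from scratch, 15 cover styles), FILL (fill existing form fields), and REFORMAT (reflow an existing document into a new design). Typography and color are derived automatically from the document type, and the output is print-ready. |
| `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. |
| `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Supports creating xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation and validation, and professional financial formatting. |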
153 changes: 153 additions & 0 deletions skills/podcast-creator/SKILL.md
@@ -0,0 +1,153 @@
---
name: podcast-creator
description: |
  Convert text scripts into podcast episodes using MiniMax TTS and Music APIs.
  Use when: creating podcasts from text, generating audio narration with background music,
  converting articles or blog posts to audio, producing voiceover content with intro/outro music.
  Triggers: podcast, audio episode, narration, text-to-speech with music, voiceover, audio content.
license: MIT
metadata:
  version: "1.0"
  category: creative-tools
  output_format: mp3
  sources:
    - MiniMax Text-to-Speech API (speech-2.8-hd)
    - MiniMax Music Generation API (music-2.5+)
---

# Podcast Creator

Convert text scripts into polished podcast episodes with narration and music.

## Prerequisites

Before starting, ensure:

1. **Python venv** is activated with dependencies from [requirements.txt](references/requirements.txt) installed
2. **`MINIMAX_API_KEY`** is exported (e.g. `export MINIMAX_API_KEY='your-key'`)
3. **`ffmpeg`** is available on PATH (for audio assembly)

If any prerequisite is missing, set it up first. Do NOT proceed without all three.
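
The three prerequisites can be checked up front. The following is a minimal sketch, not part of the skill's scripts: the function name `missing_prerequisites` is hypothetical, and it assumes that `requests` (the only entry in requirements.txt) is a good proxy for "dependencies installed".

```python
import importlib.util
import os
import shutil
import sys


def missing_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper, not one of the skill's own scripts.
    """
    env = os.environ if env is None else env
    problems = []
    # A virtual environment is active when sys.prefix differs from sys.base_prefix.
    if sys.prefix == sys.base_prefix:
        problems.append("Python venv is not activated")
    # requests is the dependency listed in references/requirements.txt.
    if importlib.util.find_spec("requests") is None:
        problems.append("requests is not installed")
    if not env.get("MINIMAX_API_KEY"):
        problems.append("MINIMAX_API_KEY is not exported")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"Missing prerequisite: {problem}")
```

An empty result means all checks passed; any other result should be resolved before Step 0.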

## Workflow

### Step 0: Collect Script

Accept the podcast script in one of three formats:

1. **Plain text** - Split into segments by blank lines or headings
2. **Markdown** - Use headings as chapter markers
3. **Structured JSON** - See [script-format.md](references/script-format.md) for the schema

If the user provides plain text or markdown, convert it to the internal JSON structure before proceeding.

Ask the user:
> "Do you have a podcast script ready, or would you like me to help write one?"

If the user wants help writing a script, ask for the topic and target length (short ~2min, medium ~5min, long ~10min), then draft chapters.

### Step 1: Voice Selection

Present voice options based on content type. Reference the [MiniMax Voice Catalog](../frontend-dev/references/minimax-voice-catalog.md) for the full list.

**Quick picks by content type:**

| Content type | Recommended voice | Voice ID |
|---|---|---|
| Tech / tutorial | Male, clear, neutral | `male-qn-qingse` |
| Storytelling | Male, warm, narrative | `male-qn-jingying` |
| News / formal | Female, professional | `female-shaonv` |
| Conversational | Female, friendly | `female-yujie` |

Ask the user:
> "Which voice style fits your podcast? Pick from the suggestions above or describe what you want."

Record the selected `voice_id` for Step 2.
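
As an illustration, the quick-pick table can be held as a small lookup. This is a sketch only: the short keys (`tech`, `news`, and so on) and the `pick_voice` helper are hypothetical, and the fallback assumes the default `voice_id` listed in script-format.md.

```python
# Voice IDs copied from the quick-pick table; the dictionary keys are hypothetical.
QUICK_PICKS = {
    "tech": "male-qn-qingse",
    "storytelling": "male-qn-jingying",
    "news": "female-shaonv",
    "conversational": "female-yujie",
}
# Default voice_id as documented in references/script-format.md.
DEFAULT_VOICE = "male-qn-qingse"


def pick_voice(content_type):
    """Return the suggested voice ID, falling back to the default for unknown types."""
    return QUICK_PICKS.get(content_type.strip().lower(), DEFAULT_VOICE)
```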

### Step 2: Generate Narration

**Tool**: `scripts/minimax_tts.py`

For each chapter/segment in the script:

```bash
python3 scripts/minimax_tts.py "CHAPTER_TEXT" -o output/chapter_01.mp3 -v VOICE_ID --speed 0.95
```

**Tips:**
- Use `--speed 0.95` for narration (slightly slower than default for clarity)
- Keep each TTS call under 10,000 characters. Split longer chapters.
- Generate chapters sequentially to maintain consistent pacing.
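
One way to respect the character limit is to split long chapters at sentence boundaries before calling the TTS script. The sketch below is an assumption about how that could be done, not the behavior of `minimax_tts.py`. The helper `split_for_tts` is hypothetical, it only recognizes sentence ends followed by whitespace, and it collapses the whitespace between sentences to a single space.

```python
import re

# Stay strictly under the 10,000-character ceiling mentioned in the tips.
MAX_TTS_CHARS = 9_999


def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries and hard-splits
    any single sentence that is longer than the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # A sentence longer than the limit cannot fit anywhere, so cut it.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) > limit:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed to its own TTS call and the resulting files listed in order during assembly.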

After generation, verify each file exists:
```bash
ls -la output/chapter_*.mp3
```

### Step 3: Generate Music

**Tool**: `scripts/minimax_music.py`

Generate intro and outro music based on the podcast tone.

```bash
# Intro music (instrumental, with a 15-30 second feel)
python3 scripts/minimax_music.py --prompt "MUSIC_STYLE, short intro jingle" --instrumental -o output/intro_music.mp3

# Outro music (instrumental, fade-out feel)
python3 scripts/minimax_music.py --prompt "MUSIC_STYLE, gentle outro, fade out" --instrumental -o output/outro_music.mp3
```

**Music style examples:**
- Tech podcast: "Electronic ambient, minimal, modern, professional"
- Story podcast: "Acoustic guitar, warm, intimate, indie folk"
- News podcast: "Clean piano, confident, broadcast quality"
- Casual podcast: "Lo-fi beats, relaxed, warm, coffee shop"

### Step 4: Assemble

**Tool**: `scripts/podcast_create.py`

Stitch all audio segments into a final episode:

```bash
python3 scripts/podcast_create.py \
  --intro output/intro_music.mp3 \
  --chapters output/chapter_01.mp3 output/chapter_02.mp3 output/chapter_03.mp3 \
  --outro output/outro_music.mp3 \
  --title "Episode Title" \
  -o output/episode.mp3
```

This handles:
- Crossfading intro music into first chapter (2s overlap)
- Adding 1s silence between chapters
- Crossfading last chapter into outro music (2s overlap)
- Writing ID3 tags (title, artist) to the final mp3
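
To make the assembly steps concrete, here is a sketch of an ffmpeg invocation that would produce the same layout. It is an assumption, not the actual contents of `podcast_create.py`: the function `build_ffmpeg_command` is hypothetical, and it relies on ffmpeg's `apad`, `concat`, and `acrossfade` audio filters plus the `-metadata` option.

```python
def build_ffmpeg_command(intro, chapters, outro, output, title=None,
                         crossfade=2, gap=1):
    """Build an ffmpeg argument list: intro -> chapters (with gaps) -> outro.

    Hypothetical sketch; the real podcast_create.py may work differently.
    """
    if not chapters:
        raise ValueError("at least one chapter is required")

    args = ["ffmpeg", "-y"]
    for path in [intro, *chapters, outro]:
        args += ["-i", path]

    last = len(chapters)  # input index of the final chapter (intro is input 0)
    filters, labels = [], []
    for idx in range(1, last + 1):
        if idx < last:
            # Append `gap` seconds of silence to every chapter except the last.
            filters.append(f"[{idx}:a]apad=pad_dur={gap}[c{idx}]")
        else:
            filters.append(f"[{idx}:a]anull[c{idx}]")
        labels.append(f"[c{idx}]")

    # Join the chapters, then crossfade the intro in and the outro out.
    filters.append(f"{''.join(labels)}concat=n={last}:v=0:a=1[body]")
    filters.append(f"[0:a][body]acrossfade=d={crossfade}[head]")
    filters.append(f"[head][{last + 1}:a]acrossfade=d={crossfade}[out]")

    args += ["-filter_complex", ";".join(filters), "-map", "[out]"]
    if title:
        # For mp3 output, ffmpeg writes this as an ID3 title tag.
        args += ["-metadata", f"title={title}"]
    args.append(output)
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)`. Building a list instead of a shell string avoids quoting problems with file names and titles.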

### Step 5: Deliver

Output format:
1. Summary line: "Podcast episode created: {title}"
2. File path and duration
3. Chapter breakdown with timestamps

```
Podcast episode created: "Episode Title"
File: output/episode.mp3
Duration: 5m 23s
Chapters:
  00:00 - Intro
  00:08 - Chapter 1: Introduction
  01:45 - Chapter 2: Main Topic
  04:10 - Chapter 3: Wrap-up
  05:05 - Outro
```
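
The chapter timestamps follow from the segment durations and the assembly settings in Step 4 (2s crossfades, 1s silence between chapters). The sketch below shows that arithmetic under a simple model in which each crossfade overlaps the two clips by its full length. The helpers `chapter_timestamps` and `format_mmss` are hypothetical, and the durations must come from the actual audio files.

```python
def chapter_timestamps(intro_s, chapter_durations, crossfade_s=2.0, gap_s=1.0):
    """Return (label, start_seconds) pairs for the intro, each chapter, and the outro.

    Hypothetical helper; assumes a crossfade overlaps both clips by `crossfade_s`.
    """
    if not chapter_durations:
        raise ValueError("at least one chapter is required")
    marks = [("Intro", 0.0)]
    # The first chapter begins where the intro crossfade starts.
    cursor = max(intro_s - crossfade_s, 0.0)
    for number, duration in enumerate(chapter_durations, start=1):
        marks.append((f"Chapter {number}", cursor))
        cursor += duration + gap_s
    # No silence follows the last chapter, and the outro overlaps its final seconds.
    marks.append(("Outro", max(cursor - gap_s - crossfade_s, 0.0)))
    return marks


def format_mmss(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"
```

For example, a 10-second intro followed by chapters of 97, 145, and 55 seconds places Chapter 1 at 00:08 and the outro at 05:05 under this model.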

## Rules

- Always generate intro and outro music. A podcast without music sounds unfinished.
- Use `--instrumental` for all music generation. Vocals in background music compete with narration.
- Keep intro music short (the generated clip will be ~30s, but crossfading trims it naturally).
- Detect the user's language and choose a TTS voice that matches it.
- All music prompts must be in **English** regardless of user language.
1 change: 1 addition & 0 deletions skills/podcast-creator/references/requirements.txt
@@ -0,0 +1 @@
requests>=2.28.0
81 changes: 81 additions & 0 deletions skills/podcast-creator/references/script-format.md
@@ -0,0 +1,81 @@
# Podcast Script Format

The podcast-creator skill accepts scripts in three formats.

## Format 1: Plain Text

Separate segments with blank lines. The first line becomes the title.

```
My Podcast Episode

Welcome to the show. Today we talk about AI-generated audio content.

The main topic is how text-to-speech and music generation APIs
can work together to produce complete podcast episodes.

Thanks for listening. See you next time.
```
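
A plain-text script of this shape could be converted to the structured form roughly as follows. This is a sketch under assumptions: `parse_plain_text` is a hypothetical helper, every block is labeled as a `segment`, and the generated chapter titles (`Segment 1`, ...) are placeholders because plain text carries no chapter names.

```python
import re


def parse_plain_text(raw):
    """Turn a plain-text script into a dict with "title" and "chapters".

    Hypothetical helper: the first line is the title, and blank lines
    separate the remaining segments.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", raw.strip()) if b.strip()]
    if not blocks:
        raise ValueError("script is empty")
    first, *rest = blocks
    # If the first block spans several lines, only its first line is the title.
    head = first.split("\n", 1)
    title = head[0].strip()
    if len(head) > 1:
        rest.insert(0, head[1].strip())
    chapters = [
        # Collapse line wraps inside a segment into single spaces.
        {"type": "segment", "title": f"Segment {i}", "text": " ".join(block.split())}
        for i, block in enumerate(rest, start=1)
    ]
    return {"title": title, "chapters": chapters}
```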

## Format 2: Markdown

Use headings as chapter markers.

```markdown
# My Podcast Episode

## Introduction
Welcome to the show. Today we talk about AI-generated audio content.

## Main Topic
The main topic is how text-to-speech and music generation APIs
can work together to produce complete podcast episodes.

## Wrap-up
Thanks for listening. See you next time.
```
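
A Markdown script can be mapped to chapters by walking its headings. The sketch below is illustrative only: `parse_markdown` is a hypothetical helper, it treats the first level-1 heading as the title and every other heading as a chapter, it labels all chapters as `segment`, and it ignores any text that appears before the first chapter heading.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def parse_markdown(raw):
    """Turn a Markdown script into a dict with "title" and "chapters".

    Hypothetical helper; does not handle headings inside fenced code blocks.
    """
    title = None
    chapters = []
    for line in raw.splitlines():
        match = HEADING.match(line)
        if match:
            level, heading = len(match.group(1)), match.group(2)
            if level == 1 and title is None:
                title = heading
            else:
                chapters.append({"title": heading, "lines": []})
        elif line.strip() and chapters:
            # Body text belongs to the most recent chapter heading.
            chapters[-1]["lines"].append(line.strip())
    return {
        "title": title or "Untitled",
        "chapters": [
            {"type": "segment", "title": c["title"], "text": " ".join(c["lines"])}
            for c in chapters
        ],
    }
```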

## Format 3: Structured JSON

Full control over chapters, voice, and music style.

```json
{
  "title": "My Podcast Episode",
  "voice_id": "male-qn-qingse",
  "music_style": "Lo-fi beats, relaxed, warm",
  "chapters": [
    {
      "type": "intro",
      "title": "Introduction",
      "text": "Welcome to the show. Today we talk about AI-generated audio content."
    },
    {
      "type": "segment",
      "title": "Main Topic",
      "text": "The main topic is how text-to-speech and music generation APIs can work together to produce complete podcast episodes."
    },
    {
      "type": "outro",
      "title": "Wrap-up",
      "text": "Thanks for listening. See you next time."
    }
  ]
}
```
```

### Chapter types

| Type | Description |
|------|-------------|
| `intro` | Opening segment. Narrated over intro music fade. |
| `segment` | Main content chapter. |
| `outro` | Closing segment. Fades into outro music. |

### Optional fields

| Field | Default | Description |
|-------|---------|-------------|
| `voice_id` | `male-qn-qingse` | MiniMax voice ID for narration |
| `music_style` | `"Ambient, professional, modern"` | Prompt for music generation |
| `speed` | `0.95` | Narration speed (0.5-2.0) |
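
The defaults and constraints in the tables above could be applied with a small validation step before generation. This is a sketch, not the skill's own loader: `normalize_script` is a hypothetical helper, and treating `title`, `chapters`, and per-chapter `text` as required is an assumption.

```python
# Defaults copied from the "Optional fields" table.
DEFAULTS = {
    "voice_id": "male-qn-qingse",
    "music_style": "Ambient, professional, modern",
    "speed": 0.95,
}
CHAPTER_TYPES = {"intro", "segment", "outro"}


def normalize_script(script):
    """Return a copy of the script with defaults applied, or raise ValueError.

    Hypothetical helper; required fields are an assumption.
    """
    if not script.get("title"):
        raise ValueError("missing title")
    chapters = script.get("chapters")
    if not chapters:
        raise ValueError("script needs at least one chapter")
    for index, chapter in enumerate(chapters, start=1):
        if chapter.get("type") not in CHAPTER_TYPES:
            raise ValueError(f"chapter {index}: unknown type {chapter.get('type')!r}")
        if not str(chapter.get("text", "")).strip():
            raise ValueError(f"chapter {index}: empty text")
    # Values supplied by the user override the defaults.
    merged = {**DEFAULTS, **script}
    if not 0.5 <= merged["speed"] <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    return merged
```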