MiniMax-AI · HeteroCat · Mar 22, 2026 · Mar 24, 2026
diff --git a/README.md b/README.md
@@ -20,6 +20,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
 | `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. | Official |
 | `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Covers creating new xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation, validation, and professional financial formatting. | Official |
 | `minimax-docx` | Professional DOCX document creation, editing, and formatting using OpenXML SDK (.NET). Three pipelines: create new documents from scratch, fill/edit content in existing documents, or apply template formatting with XSD validation gate-check. | Official |
+| `minimax-voice` | MiniMax voice synthesis and music generation toolkit. Text-to-speech (sync/async), voice management (query/clone/design), and music generation from lyrics. | Official |
 
 ## Installation
 

diff --git a/README_zh.md b/README_zh.md
@@ -20,6 +20,7 @@
 | `pptx-generator` | 生成、编辑和读取 PowerPoint 演示文稿。支持用 PptxGenJS 从零创建（封面、目录、内容、分节页、总结页），通过 XML 工作流编辑现有 PPTX，或用 markitdown 提取文本。 | Official |
 | `minimax-xlsx` | 打开、创建、读取、分析、编辑或验证 Excel/电子表格文件（.xlsx、.xlsm、.csv、.tsv）。支持通过 XML 模板从零创建 xlsx、使用 pandas 读取分析、零格式损失编辑现有文件、公式重算与验证、专业财务格式化。 | Official |
 | `minimax-docx` | 基于 OpenXML SDK（.NET）的专业 DOCX 文档创建、编辑与排版。三条流水线：从零创建新文档、填写/编辑现有文档内容、应用模板格式并通过 XSD 验证门控检查。 | Official |
+| `minimax-voice` | MiniMax 语音合成与音乐生成工具集。支持文本转语音（同步/异步）、音色管理（查询/复刻/设计）、根据歌词生成音乐。 | Official |
 
 ## 安装
 

diff --git a/skills/minimax-voice/SKILL.md b/skills/minimax-voice/SKILL.md
@@ -0,0 +1,92 @@
+---
+name: minimax-voice
+description: MiniMax voice synthesis and music generation API toolkit. Supports text-to-speech (sync/async), voice management (query/clone/design), and music generation. Use this skill when users need voice synthesis, voice cloning, or music generation.
+license: MIT
+metadata:
+  version: "1.0"
+  category: api-integration
+---
+
+# MiniMax Voice Toolkit
+
+Python client toolkit for MiniMax voice synthesis and music generation APIs.
+
+## Environment Variables
+
+**⚠️ Important: Before running any script, check that the `MINIMAX_API_KEY` environment variable is set. If it is not, set it first.**
+
+```bash
+export MINIMAX_API_KEY="your_api_key_here"
+```
+
+**Default Output Directory**: Generated audio files are saved to `./assets/audios/`, relative to the current working directory. The directory is created if it does not exist.
+
+## Scripts
+
+| Script | Function | API |
+|-----|------|-----|
+| `scripts/text_to_audio.py` | Synchronous TTS | `/v1/t2a_v2` |
+| `scripts/text_to_audio_async.py` | Asynchronous TTS | `/v1/t2a_async_v2` |
+| `scripts/voice_manager.py` | Voice Management | `/v1/get_voice`, `/v1/voice_clone`, `/v1/voice_design` |
+| `scripts/music_generation.py` | Music Generation | `/v1/music_generation` |
+
+## Character Limits
+
+| Script | Character Limit | Use Case |
+|------|---------|---------|
+| `text_to_audio.py` (sync) | ≤ 10,000 chars | Short text, real-time synthesis |
+| `text_to_audio_async.py` (async) | 10,001 - 50,000 chars | Long text, audiobooks |
+
+**Note**: Texts exceeding 50,000 characters need to be split into multiple requests.
+
+## Usage Examples
+
+```bash
+# Synchronous TTS (≤ 10000 chars)
+python3 scripts/text_to_audio.py -t "Hello" -v male-qn-qingse -o output.mp3
+
+# Asynchronous TTS (10001-50000 chars)
+python3 scripts/text_to_audio_async.py -t "Long text..." -v audiobook_male_1 -w -o output.mp3
+
+# List voices
+python3 scripts/voice_manager.py list
+
+# Clone voice
+python3 scripts/voice_manager.py clone --file voice.mp3 --voice-id MyVoice001
+
+# Design voice
+python3 scripts/voice_manager.py design --prompt "Warm female voice" --preview "Preview text" -o trial.mp3
+
+# Generate music
+python3 scripts/music_generation.py -l lyrics.txt -p "Pop music, upbeat" -o song.mp3
+```
+
+## Supported Models
+
+### Text-to-Speech
+- `speech-2.8-hd` - Latest HD model, supports interjection tags
+- `speech-2.8-turbo` - Latest high-speed model
+
+### Music Generation
+- `music-2.5` - Latest music generation model
+
+## Common Voice IDs
+
+- `male-qn-qingse` - Male-Youth-Innocent
+- `female-shaonv` - Female-Young
+- `tianxin_xiaoling` - Female-Sweet Ling
+- `audiobook_male_1` - Audiobook Male
+- `Chinese (Mandarin)_News_Anchor` - News Anchor
+
+Full list available via `voice_manager.list_voices()`.
+
+## Error Codes
+
+- `0` - Success
+- `1000` - Unknown error
+- `1001` - Timeout
+- `1002` - Rate limit triggered
+- `1004` - Authentication failed
+- `1008` - Insufficient balance
+- `2013` - Parameter error
+- `2038` - No cloning permission
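Review note on the "Character Limits" section of SKILL.md: the routing between the sync and async scripts, and the splitting of texts over 50,000 characters, might be sketched as below. This is a minimal illustration, not part of the PR. The names `pick_script`, `split_text`, `SYNC_LIMIT`, and `ASYNC_LIMIT` are hypothetical, and splitting on newlines is an assumed strategy that SKILL.md does not prescribe.

```python
from typing import List

SYNC_LIMIT = 10_000   # per SKILL.md: text_to_audio.py handles up to 10,000 chars
ASYNC_LIMIT = 50_000  # per SKILL.md: text_to_audio_async.py handles up to 50,000 chars


def pick_script(text: str) -> str:
    """Return the script suited to a single request of this length (hypothetical helper)."""
    if len(text) > ASYNC_LIMIT:
        raise ValueError("text exceeds 50,000 chars; split it first")
    if len(text) <= SYNC_LIMIT:
        return "scripts/text_to_audio.py"
    return "scripts/text_to_audio_async.py"


def split_text(text: str, limit: int = ASYNC_LIMIT) -> List[str]:
    """Split text into chunks of at most `limit` chars, preferring newline boundaries."""
    chunks: List[str] = []
    rest = text
    while len(rest) > limit:
        cut = rest.rfind("\n", 0, limit)  # last newline inside the window
        if cut <= 0:
            cut = limit  # no usable newline: hard cut at the limit
        chunks.append(rest[:cut])
        rest = rest[cut:].lstrip("\n")
    if rest:
        chunks.append(rest)
    return chunks
```

Each chunk returned by `split_text` can then be passed to `pick_script` to decide which script to invoke for it.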
diff --git a/skills/minimax-voice/scripts/music_generation.py b/skills/minimax-voice/scripts/music_generation.py
@@ -0,0 +1,279 @@
+#!/usr/bin/env python3
+"""
+MiniMax Music Generation API Client
+Supports generating music from lyrics and style descriptions
+API: POST /v1/music_generation
+"""
+
+import os
+import json
+import base64
+import requests
+from typing import Optional, Dict, Any
+from pathlib import Path
+
+
+def _get_default_output_dir() -> Path:
+    """Get default audio output directory"""
+    return Path.cwd() / "assets" / "audios"
+
+
+class MiniMaxMusicGenerator:
+    """MiniMax Music Generation Client"""
+
+    BASE_URL = "https://api.minimaxi.com/v1/music_generation"
+
+    # Supported models
+    MODELS = ["music-2.5"]
+
+    # Supported audio formats
+    FORMATS = ["mp3", "wav", "pcm"]
+
+    # Supported sample rates
+    SAMPLE_RATES = [16000, 24000, 32000, 44100]
+
+    # Supported bitrates
+    BITRATES = [32000, 64000, 128000, 256000]
+
+    def __init__(self, api_key: Optional[str] = None, group_id: Optional[str] = None):
+        """
+        Initialize music generation client
+
+        Args:
+            api_key: MiniMax API Key
+            group_id: MiniMax Group ID
+        """
+        raw_key = api_key or os.getenv("MINIMAX_API_KEY")
+        self.group_id = group_id or os.getenv("MINIMAX_GROUP_ID")
+
+        if not raw_key:
+            raise ValueError(
+                "API key is required.\n"
+                "Please set MINIMAX_API_KEY environment variable:\n"
+                "  export MINIMAX_API_KEY='your_api_key_here'\n"
+                "Or pass api_key parameter to MiniMaxMusicGenerator()."
+            )
+
+        # Auto-add Bearer prefix if not present
+        self.api_key = raw_key if raw_key.startswith("Bearer ") else f"Bearer {raw_key}"
+
+    def _get_headers(self) -> Dict[str, str]:
+        """Get request headers"""
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": self.api_key
+        }
+        if self.group_id:
+            headers["X-Minimax-Group-Id"] = self.group_id
+        return headers
+
+    def generate(
+        self,
+        lyrics: str,
+        prompt: Optional[str] = None,
+        model: str = "music-2.5",
+        stream: bool = False,
+        output_format: str = "hex",
+        sample_rate: int = 44100,
+        bitrate: int = 256000,
+        format: str = "mp3",
+        aigc_watermark: bool = False,
+    ) -> Dict[str, Any]:
+        """
+        Generate music
+
+        Args:
+            lyrics: Lyrics content, use \\n to separate lines, supports [Verse], [Chorus] structure tags
+            prompt: Music style description (optional for music-2.5, required for other models)
+            model: Model version, default music-2.5
+            stream: Stream transmission, default False
+            output_format: Output format, hex or url, default hex
+            sample_rate: Sample rate, default 44100
+            bitrate: Bitrate, default 256000
+            format: Audio format, default mp3
+            aigc_watermark: Add watermark, default False
+
+        Returns:
+            Dictionary containing audio data and metadata
+        """
+        if model not in self.MODELS:
+            raise ValueError(f"Unsupported model: {model}. Choose from {self.MODELS}")
+
+        if len(lyrics) < 1 or len(lyrics) > 3500:
+            raise ValueError("Lyrics length must be between 1 and 3500 characters")
+
+        if prompt and len(prompt) > 2000:
+            raise ValueError("Prompt length must be <= 2000 characters")
+
+        if stream and output_format != "hex":
+            raise ValueError("Streaming mode only supports hex output format")
+
+        payload: Dict[str, Any] = {
+            "model": model,
+            "lyrics": lyrics,
+            "stream": stream,
+            "output_format": output_format,
+            "audio_setting": {
+                "sample_rate": sample_rate,
+                "bitrate": bitrate,
+                "format": format,
+            },
+            "aigc_watermark": aigc_watermark,
+        }
+
+        if prompt:
+            payload["prompt"] = prompt
+
+        response = requests.post(
+            self.BASE_URL,
+            headers=self._get_headers(),
+            json=payload
+        )
+        response.raise_for_status()
+
+        result = response.json()
+        base_resp = result.get("base_resp") or {}  # tolerate a missing or null base_resp
+        if base_resp.get("status_code") != 0:
+            raise APIError(
+                f"API Error: {base_resp.get('status_msg', 'unknown error')} "
+                f"(code: {base_resp.get('status_code')})"
+            )
+
+        return result
+
+    def save_audio(
+        self,
+        result: Dict[str, Any],
+        filename: Optional[str] = None,
+        output_dir: Optional[str] = None
+    ) -> str:
+        """
+        Save generated music to file
+
+        Args:
+            result: API response dictionary from generate(); data.audio must be a hex string (output_format="hex")
+            filename: Filename (without path); defaults to music_{timestamp}.{audio_format}
+            output_dir: Output directory, default ./assets/audios
+
+        Returns:
+            Full path of saved file
+        """
+        if "data" not in result or "audio" not in result["data"]:
+            raise ValueError("Invalid result: missing audio data")
+
+        # Determine output directory
+        if output_dir is None:
+            output_dir = _get_default_output_dir()
+        else:
+            output_dir = Path(output_dir)
+
+        # Ensure directory exists
+        output_dir.mkdir(parents=True, exist_ok=True)
+
+        # Determine filename
+        if filename is None:
+            import time
+            ext = result.get("extra_info", {}).get("audio_format", "mp3")
+            filename = f"music_{int(time.time())}.{ext}"
+
+        output_path = output_dir / filename
+
+        audio_hex = result["data"]["audio"]
+        audio_bytes = bytes.fromhex(audio_hex)
+
+        with open(output_path, "wb") as f:
+            f.write(audio_bytes)
+
+        extra_info = result.get("extra_info", {})
+        print(f"Music saved to: {output_path}")
+        print(f"  Duration: {extra_info.get('music_duration', 'N/A')} ms")
+        print(f"  Sample rate: {extra_info.get('music_sample_rate', 'N/A')} Hz")
+        print(f"  Size: {extra_info.get('music_size', 'N/A')} bytes")
+        return str(output_path)
+
+    def generate_with_structure(
+        self,
+        verses: list[str],
+        choruses: list[str],
+        prompt: str,
+        bridge: Optional[str] = None,
+        outro: Optional[str] = None,
+        **kwargs
+    ) -> Dict[str, Any]:
+        """
+        Generate music using structured lyrics
+
+        Args:
+            verses: List of verse lyrics
+            choruses: List of chorus lyrics
+            prompt: Music style description
+            bridge: Bridge lyrics (optional)
+            outro: Outro lyrics (optional)
+            **kwargs: Other generate parameters
+
+        Returns:
+            API response result
+        """
+        lyrics_parts = []
+
+        # Build structured lyrics
+        for i, verse in enumerate(verses):
+            lyrics_parts.append(f"[Verse {i+1}]")
+            lyrics_parts.append(verse)
+
+        for i, chorus in enumerate(choruses):
+            lyrics_parts.append(f"[Chorus {i+1}]")
+            lyrics_parts.append(chorus)
+
+        if bridge:
+            lyrics_parts.append("[Bridge]")
+            lyrics_parts.append(bridge)
+
+        if outro:
+            lyrics_parts.append("[Outro]")
+            lyrics_parts.append(outro)
+
+        lyrics = "\n".join(lyrics_parts)
+
+        return self.generate(lyrics=lyrics, prompt=prompt, **kwargs)
+
+
+class APIError(Exception):
+    """API Error Exception"""
+    pass
+
+
+def main():
+    """Command-line usage example"""
+    import argparse
+
+    parser = argparse.ArgumentParser(description="MiniMax Music Generation")
+    parser.add_argument("--lyrics", "-l", required=True, help="Lyrics file path or text")
+    parser.add_argument("--prompt", "-p", help="Music style prompt")
+    parser.add_argument("--model", "-m", default="music-2.5", help="Model name")
+    parser.add_argument("--output", "-o", default="music.mp3", help="Output filename; relative names are saved under ./assets/audios/")
+
+    args = parser.parse_args()
+
+    # Read lyrics
+    if os.path.isfile(args.lyrics):
+        with open(args.lyrics, "r", encoding="utf-8") as f:
+            lyrics = f.read()
+    else:
+        lyrics = args.lyrics
+
+    generator = MiniMaxMusicGenerator()
+
+    print("Generating music...")
+    result = generator.generate(
+        lyrics=lyrics,
+        prompt=args.prompt,
+        model=args.model
+    )
+
+    generator.save_audio(result, args.output)
+    print("Done!")
+
+
+if __name__ == "__main__":
+    main()
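Review note on `generate_with_structure()`: the lyric assembly can be exercised without calling the API. The sketch below mirrors the ordering used in the diff (all verses, then all choruses, then the optional bridge and outro) and applies the same 1 to 3500 character bound that `generate()` enforces. The names `build_structured_lyrics` and `MAX_LYRICS_CHARS` are hypothetical and not part of the PR, and the sample lyrics are made up for illustration.

```python
from typing import List, Optional

MAX_LYRICS_CHARS = 3500  # same bound that generate() checks in the diff above


def build_structured_lyrics(
    verses: List[str],
    choruses: List[str],
    bridge: Optional[str] = None,
    outro: Optional[str] = None,
) -> str:
    """Assemble tagged lyrics in the same order as generate_with_structure()."""
    parts: List[str] = []
    for i, verse in enumerate(verses, start=1):
        parts += [f"[Verse {i}]", verse]
    for i, chorus in enumerate(choruses, start=1):
        parts += [f"[Chorus {i}]", chorus]
    if bridge:
        parts += ["[Bridge]", bridge]
    if outro:
        parts += ["[Outro]", outro]
    lyrics = "\n".join(parts)
    if not 1 <= len(lyrics) <= MAX_LYRICS_CHARS:
        raise ValueError(f"lyrics are {len(lyrics)} chars; expected 1..{MAX_LYRICS_CHARS}")
    return lyrics


lyrics = build_structured_lyrics(
    verses=["City lights are fading"],
    choruses=["We run until the morning"],
    outro="Home again",
)
print(lyrics)
```

Because verses are emitted before choruses rather than interleaved, callers who want a verse/chorus/verse layout would need to build the lyrics string themselves and pass it to `generate()` directly.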