MiniMax CLI — Agent Skill Guide
Use `mmx` to generate text, images, video, speech, and music, and to run web searches via the MiniMax AI platform.
Prerequisites
bash
# Install
npm install -g mmx-cli
# Auth (persisted to ~/.mmx/credentials.json)
mmx auth login --api-key sk-xxxxx
# Or pass per-call
mmx text chat --api-key sk-xxxxx --message "Hello"
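Before an agent issues its first call, it can help to confirm that the CLI is installed and that a key is available, either in the persisted credentials file or in the `MINIMAX_API_KEY` environment variable. The following is a minimal sketch, not part of mmx itself: `mmx_preflight` and the `MMX_BIN` override variable are hypothetical names introduced here for illustration.

```shell
# Pre-flight check for agent/CI runs.
# mmx_preflight and MMX_BIN are illustrative names, not part of the mmx CLI.
mmx_preflight() {
  local bin="${MMX_BIN:-mmx}"

  # Is the binary on PATH?
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "preflight: '$bin' not found; try: npm install -g mmx-cli" >&2
    return 1
  fi

  # Is a key available from the environment or the persisted credentials file?
  if [ -z "${MINIMAX_API_KEY:-}" ] && [ ! -f "$HOME/.mmx/credentials.json" ]; then
    echo "preflight: no API key; run 'mmx auth login' or set MINIMAX_API_KEY" >&2
    return 1
  fi
  return 0
}
```

Calling `mmx_preflight || exit 1` at the top of a script surfaces setup problems before any request is made.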
Region is auto-detected. To override it, set the `MINIMAX_REGION` environment variable or run `mmx config set --key region --value <region>`.
Agent Flags
Always use these flags in non-interactive (agent/CI) contexts:
| Flag | Purpose |
|---|---|
|  | Fail fast on missing args instead of prompting |
| `--quiet` | Suppress spinners/progress; stdout is pure data |
| `--output json` | Machine-readable JSON output |
| `--async` | Return task ID immediately (video generation) |
|  | Preview the API request without executing |
|  | Skip confirmation prompts |
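One way to make sure these flags are never forgotten is to route every call through a small wrapper. The sketch below applies only the two flags that appear in this guide's examples (`--quiet` and `--output json`); `mmx_agent` and `MMX_BIN` are hypothetical names, and appending the flags after the subcommand arguments simply mirrors the examples in this guide.

```shell
# mmx_agent is a hypothetical wrapper, not part of the mmx CLI.
# MMX_BIN is an illustrative override so the wrapper can be exercised
# without the real binary (for example MMX_BIN=echo).
mmx_agent() {
  "${MMX_BIN:-mmx}" "$@" --quiet --output json
}
```

Usage: `mmx_agent text chat --message "user:Hello"` expands to the same command with `--quiet --output json` appended.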
Commands
text chat
Chat completion. Default model:
.
bash
mmx text chat --message <text> [flags]
| Flag | Type | Description |
|---|---|---|
| `--message <text>` | string, required, repeatable | Message text. Prefix with a role and a colon to set the role (e.g. `user:`) |
| `--messages-file` | string | JSON file with messages array. Use `-` for stdin |
| `--system` | string | System prompt |
|  | string | Model ID (default: ) |
|  | number | Max tokens (default: 4096) |
|  | number | Sampling temperature, in the range (0.0, 1.0] |
|  | number | Nucleus sampling threshold |
|  | boolean | Stream tokens (default: on in TTY) |
|  | string, repeatable | Tool definition JSON or file path |
bash
# Single message
mmx text chat --message "user:What is MiniMax?" --output json --quiet
# With a system prompt
mmx text chat \
--system "You are a coding assistant." \
--message "user:Write fizzbuzz in Python" \
--output json
# From file
cat conversation.json | mmx text chat --messages-file - --output json
stdout: response text (text mode) or full response object (json mode).
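Because `--message` is repeatable, a conversation held as a list of `role:text` strings can be expanded into one flag per turn. The helper below is a sketch under that assumption: `chat_with_turns` and `MMX_BIN` are hypothetical names, and only the `user:` prefix is confirmed by the examples in this guide.

```shell
# chat_with_turns is a hypothetical helper, not part of the mmx CLI.
# Each argument is one "role:text" turn and becomes its own --message flag.
chat_with_turns() {
  local remaining=$#
  # Rotate through the original arguments, re-appending each one
  # behind a --message flag, until only flag/value pairs remain.
  while [ "$remaining" -gt 0 ]; do
    set -- "$@" --message "$1"
    shift
    remaining=$((remaining - 1))
  done
  "${MMX_BIN:-mmx}" text chat "$@" --output json --quiet
}
```

Usage: `chat_with_turns "user:Write fizzbuzz in Python" "user:Now add type hints"`.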
image generate
Generate images. Model:
.
bash
mmx image generate --prompt <text> [flags]
| Flag | Type | Description |
|---|---|---|
| `--prompt <text>` | string, required | Image description |
|  | string | e.g. , |
| `--n` | number | Number of images (default: 1) |
|  | string | Subject reference: `type=character,image=path-or-url` |
| `--out-dir` | string | Download images to directory |
|  | string | Filename prefix (default: ) |
bash
mmx image generate --prompt "A cat in a spacesuit" --output json --quiet
# stdout: image URLs (one per line in quiet mode)
mmx image generate --prompt "Logo" --n 3 --out-dir ./gen/ --quiet
# stdout: saved file paths (one per line)
video generate
Generate video. Default model:
. This is an async task — by default it polls until completion.
bash
mmx video generate --prompt <text> [flags]
| Flag | Type | Description |
|---|---|---|
| `--prompt <text>` | string, required | Video description |
|  | string | (default) or |
| `--first-frame <path-or-url>` | string | First frame image |
|  | string | Webhook URL for completion |
| `--download` | string | Save video to specific file |
| `--async` | boolean | Return task ID immediately |
|  | boolean | Same as `--async` |
| `--poll-interval <seconds>` | number | Polling interval in seconds (default: 5) |
bash
# Non-blocking: get task ID
mmx video generate --prompt "A robot." --async --quiet
# stdout: {"taskId":"..."}
# Blocking: wait and get file path
mmx video generate --prompt "Ocean waves." --download ocean.mp4 --quiet
# stdout: ocean.mp4
video task get
Query status of a video generation task.
bash
mmx video task get --task-id <id> [--output json]
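When a task is started with `--async`, the caller has to do its own polling. A generic retry loop such as the sketch below can drive that; `poll_until` is a hypothetical helper, and the check command it runs is supplied by the caller because this guide does not document the shape of the task status JSON.

```shell
# poll_until ATTEMPTS INTERVAL COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0, waiting INTERVAL
# seconds between tries. Returns 1 if every attempt fails.
poll_until() {
  local attempts=$1
  local interval=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `poll_until 60 5 task_done` would try up to 60 times, 5 seconds apart, where `task_done` is a function you write that runs `mmx video task get --task-id "$TASK" --output json` and inspects the result. The status field name and its values are not stated in this guide, so check the actual JSON before relying on them.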
video download
Download a completed video by its file ID.
bash
mmx video download --file-id <id> [--out <path>]
speech synthesize
Text-to-speech. Default model:
. Max 10k chars.
bash
mmx speech synthesize --text <text> [flags]
| Flag | Type | Description |
|---|---|---|
| `--text <text>` | string | Text to synthesize |
| `--text-file` | string | Read text from file. Use `-` for stdin |
|  | string | (default), , |
|  | string | Voice ID (default: `English_expressive_narrator`) |
|  | number | Speed multiplier |
|  | number | Volume level |
|  | number | Pitch adjustment |
|  | string | Audio format (default: ) |
|  | number | Sample rate in Hz (default: 32000) |
|  | number | Bitrate in bits per second (default: 128000) |
|  | number | Audio channels (default: 1) |
|  | string | Language boost |
|  | boolean | Include subtitle timing data |
| `--pronunciation <from/to>` | string, repeatable | Custom pronunciation |
|  | string | Add sound effect |
| `--out` | string | Save audio to file |
|  | boolean | Stream raw audio to stdout |
bash
mmx speech synthesize --text "Hello world" --out hello.mp3 --quiet
# stdout: hello.mp3
echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3
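Since input is capped at 10k characters, an agent can check the length locally instead of spending a request on text that will be rejected. The sketch below assumes "10k" means exactly 10,000; `check_tts_length` and `TTS_MAX_CHARS` are hypothetical names.

```shell
# Hypothetical guard for the speech input limit.
# Assumes the documented "10k chars" means 10000.
TTS_MAX_CHARS=10000

check_tts_length() {
  local text=$1
  # Note: ${#text} counts characters or bytes depending on shell and locale,
  # so multibyte text may be measured differently from the server.
  if [ "${#text}" -gt "$TTS_MAX_CHARS" ]; then
    echo "text is ${#text} characters; limit is $TTS_MAX_CHARS" >&2
    return 1
  fi
  return 0
}
```

Usage: `check_tts_length "$TEXT" && mmx speech synthesize --text "$TEXT" --out out.mp3 --quiet`.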
music generate
Generate music. Model:
. Responds well to rich, structured descriptions.
bash
mmx music generate --prompt <text> [--lyrics <text>] [flags]
| Flag | Type | Description |
|---|---|---|
| `--prompt <text>` | string | Music style description (can be detailed) |
| `--lyrics <text>` | string | Song lyrics with structure tags. Use for instrumental. Cannot be used with `--instrumental` |
| `--lyrics-file` | string | Read lyrics from file. Use `-` for stdin |
| `--vocals` | string | Vocal style, e.g. "male and female duet, harmonies in chorus" |
|  | string | Music genre, e.g. folk, pop, jazz |
|  | string | Mood or emotion, e.g. warm, melancholic, uplifting |
| `--instruments` | string | Instruments to feature, e.g. "acoustic guitar, piano" |
|  | string | Tempo description, e.g. fast, slow, moderate |
| `--bpm` | number | Exact tempo in beats per minute |
|  | string | Musical key, e.g. C major, A minor, G sharp |
|  | string | Elements to avoid in the generated music |
|  | string | Use case context, e.g. "background music for video" |
|  | string | Song structure, e.g. "verse-chorus-verse-bridge-chorus" |
|  | string | Reference tracks or artists, e.g. |
|  | string | Additional fine-grained requirements |
| `--instrumental` | boolean | Generate instrumental music (no vocals). Cannot be used with `--lyrics` or `--lyrics-file` |
|  | boolean | Embed AI-generated content watermark |
|  | string | Audio format (default: ) |
|  | number | Sample rate in Hz (default: 44100) |
|  | number | Bitrate in bits per second (default: 256000) |
| `--out` | string | Save audio to file |
|  | boolean | Stream raw audio to stdout |
At least one of `--prompt` or `--lyrics` is required.
bash
# Simple usage
mmx music generate --prompt "Upbeat pop" --lyrics "La la la..." --out song.mp3 --quiet
# Detailed prompt with vocal characteristics
mmx music generate --prompt "Warm morning folk" \
--vocals "male and female duet, harmonies in chorus" \
--instruments "acoustic guitar, piano" \
--bpm 95 \
--lyrics-file song.txt \
--out duet.mp3
# Instrumental (use --instrumental flag)
mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3
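The flag constraints for `music generate` can be checked before the call is made. The sketch below assumes the two rules are: at least one of `--prompt` or lyrics (`--lyrics` or `--lyrics-file`) must be present, and `--instrumental` cannot be combined with lyrics. `validate_music_args` is a hypothetical helper, and treating `--lyrics-file` as satisfying the first rule is an assumption.

```shell
# Hypothetical pre-flight check for `mmx music generate` arguments.
# It scans flag names only; a flag value that happens to look like one of
# these flags would be miscounted.
validate_music_args() {
  local has_prompt=0 has_lyrics=0 has_instrumental=0
  local arg
  for arg in "$@"; do
    case "$arg" in
      --prompt) has_prompt=1 ;;
      --lyrics|--lyrics-file) has_lyrics=1 ;;
      --instrumental) has_instrumental=1 ;;
    esac
  done

  if [ "$has_prompt" -eq 0 ] && [ "$has_lyrics" -eq 0 ]; then
    echo "music: provide --prompt or lyrics" >&2
    return 1
  fi
  if [ "$has_instrumental" -eq 1 ] && [ "$has_lyrics" -eq 1 ]; then
    echo "music: --instrumental cannot be combined with lyrics" >&2
    return 1
  fi
  return 0
}
```

Usage: `validate_music_args "$@" && mmx music generate "$@"` inside a wrapper script.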
vision describe
Image understanding via VLM. Provide either `--image` or `--file-id`, not both.
bash
mmx vision describe (--image <path-or-url> | --file-id <id>) [flags]
| Flag | Type | Description |
|---|---|---|
| `--image <path-or-url>` | string | Local path or URL (auto base64-encoded) |
| `--file-id <id>` | string | Pre-uploaded file ID (skips base64) |
| `--prompt` | string | Question about the image (default: ) |
bash
mmx vision describe --image photo.jpg --prompt "What breed?" --output json
stdout: description text (text mode) or full response (json mode).
search query
Web search via MiniMax.
bash
mmx search query --q <query>
| Flag | Type | Description |
|---|---|---|
| `--q <query>` | string, required | Search query |
bash
mmx search query --q "MiniMax AI" --output json --quiet
quota show
Display Token Plan usage and remaining quotas.
bash
mmx quota show [--output json]
Tool Schema Export
Export all commands as Anthropic/OpenAI-compatible JSON tool schemas:
bash
# All tool-worthy commands (excludes auth/config/update)
mmx config export-schema
# Single command
mmx config export-schema --command "video generate"
Use this to dynamically register mmx commands as tools in your agent framework.
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Usage error (bad flags, missing args) |
| 3 | Authentication error |
| 4 | Quota exceeded |
| 5 | Timeout |
| 10 | Content filter triggered |
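An agent can branch on these codes rather than parsing error text. The sketch below maps each code from the table to a label and wraps a call so that failures are reported on stderr. `describe_exit`, `should_retry`, `run_mmx`, and `MMX_BIN` are hypothetical names, and treating only a timeout as retryable is an assumed policy, not something the CLI prescribes.

```shell
# Hypothetical helpers built on the exit-code table above.
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "general error" ;;
    2) echo "usage error" ;;
    3) echo "authentication error" ;;
    4) echo "quota exceeded" ;;
    5) echo "timeout" ;;
    10) echo "content filter triggered" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Assumed policy: only a timeout (5) is worth retrying automatically.
should_retry() {
  [ "$1" -eq 5 ]
}

# Run mmx, report any failure on stderr, and pass the exit code through.
run_mmx() {
  local code
  "${MMX_BIN:-mmx}" "$@"
  code=$?
  if [ "$code" -ne 0 ]; then
    echo "mmx failed: $(describe_exit "$code")" >&2
  fi
  return "$code"
}
```

Usage: `run_mmx quota show --output json || should_retry $? && echo "retrying"`.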
Piping Patterns
bash
# stdout is always clean data — safe to pipe
mmx text chat --message "Hi" --output json | jq '.content'
# stderr has progress/spinners — discard if needed
mmx video generate --prompt "Waves" 2>/dev/null
# Chain: generate image → describe it
URL=$(mmx image generate --prompt "A sunset" --quiet)
mmx vision describe --image "$URL" --quiet
# Async video workflow
TASK=$(mmx video generate --prompt "A robot" --async --quiet | jq -r '.taskId')
mmx video task get --task-id "$TASK" --output json
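# video download takes --file-id; reading it from .fileId is an assumption, so check the actual task JSON
FILE_ID=$(mmx video task get --task-id "$TASK" --output json | jq -r '.fileId')
mmx video download --file-id "$FILE_ID" --out robot.mp4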
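The image-to-description chain above handles a single URL. When a command prints one item per line, as `image generate` does in quiet mode, a small line-by-line driver extends the same pattern to several results. `for_each_line` is a hypothetical helper, sketched here under the assumption that each output line is a complete path or URL.

```shell
# for_each_line COMMAND [ARGS...]
# Hypothetical helper: reads stdin line by line and runs COMMAND with each
# non-empty line appended as the final argument.
for_each_line() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    # Detach stdin so COMMAND cannot consume the remaining lines.
    "$@" "$line" </dev/null
  done
}
```

Usage: `mmx image generate --prompt "Logo" --n 3 --quiet | for_each_line mmx vision describe --quiet --image`.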
Configuration Precedence
CLI flags → environment variables → persisted config (`mmx config set`) → defaults.
bash
# Persistent config
mmx config set --key region --value cn
mmx config show
# Environment
export MINIMAX_API_KEY=sk-xxxxx
export MINIMAX_REGION=cn
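The precedence chain amounts to "first non-empty source wins". The sketch below illustrates that ordering only; it is not how mmx resolves settings internally, and `resolve_setting` is a hypothetical helper.

```shell
# resolve_setting FLAG_VALUE ENV_VALUE CONFIG_VALUE DEFAULT_VALUE
# Hypothetical illustration of precedence: prints the first non-empty
# argument, in the order the sources are listed.
resolve_setting() {
  local candidate
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # Every source was empty.
  return 1
}
```

Usage: `resolve_setting "$flag_value" "${MINIMAX_REGION:-}" "$config_value" "$default_value"`, where the three variables other than `MINIMAX_REGION` stand in for values your own script has collected.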