speech

Generate speech audio from text using AI text-to-speech models. Supports OpenAI (tts-1, tts-1-hd, gpt-4o-mini-tts), ElevenLabs (eleven_v3, eleven_multilingual_v2, eleven_flash_v2_5), and Cartesia (sonic-3.5, sonic-3). The command uses your own provider API key directly; the egaki subscription does not cover speech yet.

Usage

egaki speech [text]

Arguments

Argument	Required	Description
`[text]`	No	Text to convert to speech. Can be omitted when `--stdin` is used

Options

Option	Default	Description
`-m, --model [model]`	-	Speech model ID. If omitted, shows an interactive picker (or uses default in non-TTY mode)
`-o, --output [path]`	`egaki-output`	Output file path (extension added from audio format if omitted)
`--voice [voice]`	-	Voice ID or name (provider-specific). OpenAI: alloy, echo, nova, etc. ElevenLabs/Cartesia: voice ID from their library
`--output-format [format]`	-	Audio output format: mp3, wav, pcm, opus, aac, flac (provider support varies)
`--speed [speed]`	-	Playback speed multiplier (OpenAI: 0.25-4.0, Cartesia: 0.6-1.5)
`--instructions [text]`	-	Style instructions for speech (only gpt-4o-mini-tts and ElevenLabs). E.g. "Speak in a calm, soothing tone"
`--language [lang]`	-	ISO 639-1 language code (e.g. en, es, fr). Applies to ElevenLabs and Cartesia
`--stdin`	-	Read text from stdin instead of the positional argument
`--json`	-	Output result metadata as JSON to stdout (model, file path, cost)
`--stdout`	-	Write raw audio bytes to stdout instead of saving to a file. Useful for piping
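
The `--speed` ranges in the table differ by provider, so a script that builds `egaki speech` commands may want to check the value before calling the CLI. The following is a minimal sketch, not part of egaki: `speed_in_range` is a hypothetical helper, and the provider names `openai` and `cartesia` are labels chosen for this example. Only the numeric ranges come from the table above.

```shell
# Hypothetical helper (not an egaki command): check a --speed value against
# the ranges documented in the options table.
# Usage: speed_in_range <provider> <speed>
# Returns 0 if in range, 1 if out of range, 2 if no range is documented.
speed_in_range() {
  provider=$1
  speed=$2
  case "$provider" in
    openai)   min=0.25; max=4.0 ;;
    cartesia) min=0.6;  max=1.5 ;;
    *) return 2 ;;  # no range listed for this provider
  esac
  # The shell has no floating-point comparison, so delegate to awk.
  # Adding 0 forces a numeric comparison even for non-numeric input.
  awk -v s="$speed" -v lo="$min" -v hi="$max" \
    'BEGIN { exit !(s + 0 >= lo + 0 && s + 0 <= hi + 0) }'
}
```

One possible use is to guard the call: `speed_in_range openai 1.25 && egaki speech "Hello" -m tts-1 --speed 1.25`.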

Global Options

Option	Default	Description
`-h, --help`	-	Display this message
`-v, --version`	-	Display version number

Examples

# Generate speech with the default model
egaki speech "Hello, welcome to egaki!"

# Use a specific model and voice
egaki speech "Good morning" -m tts-1-hd --voice nova

# Use GPT-4o Mini TTS with style instructions
egaki speech "Breaking news today" -m gpt-4o-mini-tts --instructions "Speak like a news anchor"

# ElevenLabs with a voice ID
egaki speech "Hello world" -m eleven_v3 --voice 21m00Tcm4TlvDq8ikWAM

# Read text from stdin
cat script.txt | egaki speech --stdin -o narration.mp3

# Cartesia Sonic 3.5 with a voice ID
egaki speech "Hello world" -m sonic-3.5 --voice f786b574-daa5-4673-aa0c-cbe3e8534c02

# Pipe audio to another tool
egaki speech "test" --stdout | ffplay -nodisp -autoexit -
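
The stdin example can be extended to narrate several text files in one pass. This is a rough sketch under stated assumptions: `narration_path` and `narrate_all` are hypothetical helper names, and the choice of `tts-1` with `mp3` output is only an illustration. The flags themselves (`--stdin`, `-m`, `--output-format`, `-o`) are the ones documented above.

```shell
# Hypothetical helper: derive an output name from a text file,
# e.g. chapters/intro.txt -> intro.mp3
narration_path() {
  base=$(basename "$1")
  printf '%s.mp3\n' "${base%.*}"
}

# Hypothetical helper: narrate every file passed as an argument.
# The model is set explicitly so the interactive picker is not needed.
# Stops at the first failing egaki call.
narrate_all() {
  for f in "$@"; do
    egaki speech --stdin -m tts-1 --output-format mp3 \
      -o "$(narration_path "$f")" < "$f" || return 1
  done
}
```

For example, `narrate_all chapters/*.txt` would write one `.mp3` per text file into the current directory, assuming each call succeeds.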
