voice clone

Clone a voice from an audio clip. Returns a voice ID for use with 'egaki speech --voice <id>'. Supports Cartesia (default) and ElevenLabs.

Best practices for high-quality clones:

Isolate vocals first: 'egaki demucs recording.mp3 --stems vocals' removes background music, noise, and other speakers.
Find a clean snippet: run 'egaki transcribe recording-vocals.mp3' to get word timestamps, then pick a 5-10s segment with a complete phrase, clear speech, and no hesitations or crosstalk.
Trim to speech boundaries: 'ffmpeg -i recording-vocals.mp3 -ss 12.5 -to 22.0 -c copy clip.mp3'. Leave no silence padding at the start or end.
Match energy to intent: the clone mimics the tone and pacing of the source clip. Use an energetic clip for energetic output.
Speak in the target language. Use --language for Cartesia clones.
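
The 5-10s guideline can be checked before trimming. The following is a minimal sketch, not part of egaki: window_ok is a hypothetical helper that takes the start and end timestamps you would pass to ffmpeg's -ss and -to, and succeeds only when the window is between 5 and 10 seconds long.

```shell
#!/bin/sh
# Hypothetical helper (not part of egaki): succeed only if the window
# between START ($1) and END ($2), both in seconds, is 5-10s long.
window_ok() {
    awk -v s="$1" -v e="$2" 'BEGIN { d = e - s; exit !(d >= 5 && d <= 10) }'
}

# Example: the 12.5 -> 22.0 window is 9.5s long, so the check passes.
if window_ok 12.5 22.0; then
    echo "window ok"
    # ffmpeg -i recording-vocals.mp3 -ss 12.5 -to 22.0 -c copy clip.mp3
else
    echo "window should be 5-10s long" >&2
fi
```

The ffmpeg call is left commented out so the check can be tried without any audio file; uncomment it once the timestamps come from a real transcript.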

Cartesia: up to 10s of audio, instant, free. Good for short, clean clips.
ElevenLabs: 1-3 min recommended. Has a --remove-background-noise option.
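
One way to act on these limits is to choose the provider from the clip's duration. This is a sketch under assumptions: pick_provider is a hypothetical helper, not an egaki feature. The 10s cutoff mirrors the Cartesia limit stated above, while sending every longer clip to ElevenLabs is a simplification made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of egaki): suggest a provider for a clip
# whose duration in seconds is passed as $1.
# Assumption: anything longer than Cartesia's 10s limit goes to ElevenLabs.
pick_provider() {
    if awk -v d="$1" 'BEGIN { exit !(d <= 10) }'; then
        echo cartesia
    else
        echo elevenlabs
    fi
}

# Usage with a real file (requires ffprobe, which ships with ffmpeg):
#   dur=$(ffprobe -v error -show_entries format=duration \
#           -of default=noprint_wrappers=1:nokey=1 clip.mp3)
#   egaki voice clone clip.mp3 --name "Speaker" --provider "$(pick_provider "$dur")"

pick_provider 9.5   # a trimmed 9.5s clip
pick_provider 95    # a 95s recording
```

Keeping the decision in a small function lets the cutoff be adjusted in one place if the provider limits change.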

Usage

egaki voice clone [audio]

Argument	Required	Description
`[audio]`	No	Path to the audio clip to clone from (omit when using `--stdin`)

Option	Default	Description
`--name [name]`	-	Name for the cloned voice (required)
`-p, --provider [provider]`	`cartesia`	Voice cloning provider: cartesia or elevenlabs
`--language [lang]`	`en`	Cartesia only: ISO 639-1 language code, e.g. en, es, fr, de, ja
`--description [text]`	-	Optional description for the voice
`--base-voice-id [id]`	-	Cartesia: optional base voice ID to derive from
`--remove-background-noise`	-	ElevenLabs: apply AI noise removal to the clip before cloning
`--stdin`	-	Read audio from stdin instead of a file path
`--json`	-	Output result as JSON to stdout

# Clone a voice from a recording
egaki voice clone recording.wav --name "My Voice"

# Clone with ElevenLabs and noise removal
egaki voice clone vocals.mp3 --name "Narrator" --provider elevenlabs --remove-background-noise

# Full pipeline: separate → trim → clone
egaki demucs interview.mp3 --stems vocals
ffmpeg -i interview-vocals.mp3 -ss 5.0 -to 15.0 -c copy clip.mp3
egaki voice clone clip.mp3 --name "Speaker"

# Use the cloned voice
egaki speech "Hello world" --voice <voice-id>
