Can I use ElevenLabs in Hermes Agent?

Yes. Configure RunAPI as a custom:runapi provider in Hermes Agent with base_url https://runapi.ai/v1 and key_env RUNAPI_API_KEY, then call any of the ElevenLabs endpoints: text_to_speech, speech_to_text, text_to_dialogue, text_to_sound, or isolate_audio.

What stability and similarity settings produce the most natural voice?

Start with stability at 0.5 and similarity_boost at 0.75. Higher stability makes the voice more consistent but less expressive. A higher similarity_boost keeps the voice closer to the original voice profile. For audiobooks, try a stability of 0.6-0.8. For conversational content, a lower stability (0.3-0.5) adds natural variation.
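The starting points above can be kept as named presets so every request uses the same values. This is a minimal sketch: the preset names and the helper functions are illustrative and not part of the RunAPI or ElevenLabs API, and the audiobook and conversational stability values are simply picked from inside the ranges given above.

```python
# Starting points taken from the answer above. The preset names are
# hypothetical labels, not identifiers the API knows about.
PRESETS = {
    "default": {"stability": 0.5, "similarity_boost": 0.75},
    "audiobook": {"stability": 0.7, "similarity_boost": 0.75},       # inside 0.6-0.8
    "conversational": {"stability": 0.4, "similarity_boost": 0.75},  # inside 0.3-0.5
}


def voice_settings(kind: str = "default") -> dict:
    """Return a copy of the stability/similarity_boost pair for a content type."""
    if kind not in PRESETS:
        raise ValueError(f"unknown preset: {kind!r}")
    return dict(PRESETS[kind])


def with_voice_settings(payload: dict, kind: str = "default") -> dict:
    """Merge a preset into a text_to_speech request body without mutating it."""
    return {**payload, **voice_settings(kind)}
```

Treat the numbers as a first guess and adjust them by ear for each voice.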

How do I reduce ElevenLabs costs for long-form content like audiobooks?

Use turbo-v2.5 for English content: it costs roughly half as much per character as multilingual-v2. Break long texts into chunks of at most 5000 characters, the limit for a single request. Use the RunAPI batch approach to process chapters in parallel rather than sequentially, which shortens total processing time.
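Chunking and parallel submission can be sketched as follows. The helper names are hypothetical, and the `synthesize` callable is assumed to be your own function that submits one chunk and returns its result. The splitter prefers sentence boundaries and hard-splits only when a single sentence exceeds the limit.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_CHARS = 5000  # per-request text limit stated for text_to_speech


def chunk_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends.

    Whitespace after sentence-ending punctuation is normalized to one space.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_all(chunks: List[str], synthesize: Callable[[str], str],
                   workers: int = 4) -> List[str]:
    """Run `synthesize` on every chunk in parallel; results keep chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(synthesize, chunks))
```

Because results come back in chunk order, the audio segments can be concatenated directly afterwards.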

Can I transcribe audio with ElevenLabs in Hermes Agent?

Yes. Call the speech_to_text endpoint at /api/v1/elevenlabs/speech_to_text with a source_audio_url. The endpoint supports optional speaker diarization via the diarize parameter and audio event tagging via tag_audio_events. Results are returned asynchronously.
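A request body for transcription might be assembled like this. It is a sketch under assumptions: the host is taken to be runapi.ai (as in the text_to_speech URL), diarize and tag_audio_events are assumed to be booleans, and no other fields are assumed to be required. `build_stt_body` is a hypothetical helper name.

```python
import json

# Assumed full URL: the documented path on the same host as text_to_speech.
STT_URL = "https://runapi.ai/api/v1/elevenlabs/speech_to_text"


def build_stt_body(source_audio_url: str, diarize: bool = False,
                   tag_audio_events: bool = False) -> str:
    """Return the JSON body for a speech_to_text request."""
    if not source_audio_url.startswith(("http://", "https://")):
        raise ValueError("source_audio_url must be an http(s) URL")
    body = {"source_audio_url": source_audio_url}
    # Send the optional flags only when enabled, so server defaults apply otherwise.
    if diarize:
        body["diarize"] = True
    if tag_audio_events:
        body["tag_audio_events"] = True
    return json.dumps(body)
```

POST the returned string to `STT_URL` with the same Authorization header used for text_to_speech, then wait for the async result.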

How does audio isolation work through RunAPI?

Call the isolate_audio endpoint at /api/v1/elevenlabs/isolate_audio with a source_audio_url pointing to your mixed audio file. The endpoint separates the vocals from background noise and returns a URL to the cleaned audio. The task is async: poll for the result or pass a callback_url.
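The choice between polling and a callback can be made when the request body is built. The sketch below uses hypothetical helper names and assumes callback_url sits at the top level of the body, next to source_audio_url.

```python
from typing import Optional


def build_isolate_body(source_audio_url: str,
                       callback_url: Optional[str] = None) -> dict:
    """Return the request body for isolate_audio as a dict."""
    body = {"source_audio_url": source_audio_url}
    if callback_url:
        # With a callback, the result is delivered to your webhook instead.
        body["callback_url"] = callback_url
    return body


def needs_polling(body: dict) -> bool:
    """True when no callback_url was supplied and the caller must poll."""
    return "callback_url" not in body
```

A callback suits long recordings, since no process has to stay alive polling for the result.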

Can Hermes Agent chain ElevenLabs TTS with video generation in one workflow?

Yes. Hermes Agent can generate speech with ElevenLabs, then pass the audio URL to InfiniteTalk for avatar video or to Wan for speech-to-video, creating a complete text-to-spoken-video pipeline in one run.
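The chain reduces to passing one step's output URL into the next step. The sketch below shows only that hand-off: `synthesize` and `animate` are hypothetical callables you would implement against the ElevenLabs and video endpoints, whose request shapes are not covered here.

```python
from typing import Callable


def text_to_spoken_video(text: str,
                         synthesize: Callable[[str], str],
                         animate: Callable[[str], str]) -> dict:
    """Chain two steps: text -> audio URL -> video URL."""
    audio_url = synthesize(text)
    if not audio_url:
        # Stop early rather than send an empty URL to the video step.
        raise RuntimeError("TTS step returned no audio URL")
    video_url = animate(audio_url)
    return {"audio_url": audio_url, "video_url": video_url}
```

Keeping each step behind a plain callable makes it easy to swap the video model without touching the TTS step.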

HERMES + ELEVENLABS

Use ElevenLabs in Hermes Agent.

ElevenLabs provides six audio endpoints through RunAPI — turbo-v2.5 TTS with sub-second latency, multilingual-v2 covering 29 languages, dialogue-v3 for multi-speaker conversations, sound effects, speech-to-text transcription, and vocal isolation. Hermes Agent calls them through the custom:runapi provider with one API key.


one API key · text to speech endpoint · per-character billing

Use RunAPI to generate speech audio with ElevenLabs text-to-speech.

Requirements:
- Read the API key from RUNAPI_API_KEY.
- Use the custom:runapi provider with base_url https://runapi.ai/v1.
- Call POST https://runapi.ai/api/v1/elevenlabs/text_to_speech.
- Set model to "text-to-speech-turbo-v2.5".
- Set text to the content you want spoken.
- Optionally set voice to a specific ElevenLabs voice ID.
- Optionally set speed between 0.7 and 1.2.
- The task is async. Poll the returned task_id until status is "completed".
- When done, read the audio URL from the response output.

curl -X POST https://runapi.ai/api/v1/elevenlabs/text_to_speech \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-to-speech-turbo-v2.5",
    "text": "Welcome to RunAPI. This audio was generated by ElevenLabs turbo v2.5.",
    "speed": 1.0,
    "stability": 0.5,
    "similarity_boost": 0.75
  }'

{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "text-to-speech-turbo-v2.5"
}


HOW IT WORKS

Use ElevenLabs in Hermes Agent in three steps

Configure RunAPI

Set RUNAPI_API_KEY in the environment where Hermes Agent runs. If you already added RunAPI as a custom:runapi provider, the same key and base_url handle all ElevenLabs endpoints — TTS, STT, dialogue, sound effects, and audio isolation.

export RUNAPI_API_KEY=runapi_xxx
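If you call the endpoints from your own script rather than through the agent, the key can be read from the same environment variable. A minimal sketch, with `auth_headers` as a hypothetical helper name:

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build request headers from RUNAPI_API_KEY, failing early if it is unset."""
    key = env.get("RUNAPI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RUNAPI_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing at startup gives a clearer error than an authentication failure on the first request.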

Call text_to_speech

Send a POST to the text_to_speech endpoint with model set to text-to-speech-turbo-v2.5, the text you want spoken, and optional voice, speed, and stability parameters. Hermes Agent routes the request through the custom:runapi provider. For multilingual output, use text-to-speech-multilingual-v2 with a voice and language_code.

POST /api/v1/elevenlabs/text_to_speech
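The same call can be made from Python with only the standard library. This is a sketch: `build_tts_request` and `submit` are hypothetical helper names, and the response is assumed to be JSON containing a task_id, as in the example response on this page.

```python
import json
import urllib.request

TTS_URL = "https://runapi.ai/api/v1/elevenlabs/text_to_speech"


def build_tts_request(api_key: str, text: str,
                      model: str = "text-to-speech-turbo-v2.5",
                      **options) -> urllib.request.Request:
    """Build (but do not send) the POST request for text_to_speech."""
    body = {"model": model, "text": text}
    # Drop unset options so the server applies its own defaults.
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For multilingual output, pass `model="text-to-speech-multilingual-v2"` together with `voice` and `language_code` as keyword options.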

Poll for the result

The endpoint returns a task_id immediately. Poll the task status endpoint until the status is completed, then read the output audio URL from the response.

GET /api/v1/elevenlabs/text_to_speech/tsk_abc123
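A polling loop for this step could look like the sketch below. `poll_task` is a hypothetical helper, and `fetch_status` is a callable you supply that performs the GET above and returns the parsed JSON. The "completed" status comes from this page; the "failed" status is an assumption.

```python
import time
from typing import Callable


def poll_task(fetch_status: Callable[[str], dict], task_id: str,
              interval: float = 2.0, timeout: float = 120.0,
              sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status(task_id) until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":  # assumed failure status, not confirmed here
            raise RuntimeError(f"task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The returned dict is the final task response, from which the output audio URL can be read.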

PARAMETERS

ElevenLabs text_to_speech API parameters

Parameter	Type	Description
`model`	`string`	Required. text-to-speech-turbo-v2.5 (low latency) or text-to-speech-multilingual-v2 (29 languages).
`text`	`string`	Required. The text to convert to speech. Max 5000 characters.
`voice`	`string`	ElevenLabs voice ID. Required for multilingual-v2. Turbo-v2.5 uses a default voice if omitted.
`speed`	`float`	Optional. Playback speed multiplier. Range 0.7 to 1.2.
`stability`	`float`	Optional. Voice consistency. Range 0.0 to 1.0. Lower values add expressiveness.
`similarity_boost`	`float`	Optional. Voice similarity enforcement. Range 0.0 to 1.0.
`style`	`float`	Optional. Style exaggeration. Range 0.0 to 1.0.
`language_code`	`string`	Optional. Target language for multilingual-v2, e.g. en, es, ja.
`callback_url`	`string`	Optional. Webhook URL that receives a POST when the task completes.
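The constraints in the table can be checked locally before a request is sent. The sketch below uses a hypothetical `validate_tts` helper and assumes the listed ranges are inclusive at both ends.

```python
MODELS = {"text-to-speech-turbo-v2.5", "text-to-speech-multilingual-v2"}

# Ranges from the parameter table, assumed inclusive.
RANGES = {
    "speed": (0.7, 1.2),
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
}


def validate_tts(params: dict) -> list:
    """Return a list of problems; an empty list means the body matches the table."""
    problems = []
    model = params.get("model")
    if model not in MODELS:
        problems.append("model must be one of: " + ", ".join(sorted(MODELS)))
    text = params.get("text")
    if not isinstance(text, str) or not text:
        problems.append("text is required")
    elif len(text) > 5000:
        problems.append("text exceeds 5000 characters")
    if model == "text-to-speech-multilingual-v2" and not params.get("voice"):
        problems.append("voice is required for multilingual-v2")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

Catching an out-of-range value locally avoids submitting a request that is likely to be rejected.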

What is ElevenLabs on Hermes Agent?

ElevenLabs is a text-to-speech and audio API, and Hermes Agent calls it through the custom:runapi provider for voice generation, transcription, and audio processing. The key advantage in Hermes is chaining: generate speech, then pass the audio URL to InfiniteTalk for a talking avatar or to a video model for complete audiovisual content, all in one agent run. Six endpoints are available, including turbo TTS, multilingual voices, multi-speaker dialogue, and sound effects.

ElevenLabs use cases

Conversational AI voice agents

Build voice agents that speak naturally by generating speech through turbo-v2.5 with sub-second latency, suitable for customer service bots, interactive assistants, or phone-based interfaces.

YouTube content narration

Produce voiceover for YouTube videos in consistent character voices, adjusting stability for narrator consistency and style exaggeration for emotional range across an entire series.

Text-to-spoken-video pipelines

Chain ElevenLabs TTS with InfiniteTalk or other video models in a Hermes Agent workflow to go from text to narrated video with a talking avatar in a single automated run.

FAQ

ElevenLabs + Hermes Agent questions

Hermes Agent general setup

Not configured yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

ElevenLabs model catalog

See all ElevenLabs variants, pricing, and API docs.

ElevenLabs on RunAPI →

Try ElevenLabs in Hermes Agent today.

Get a free RunAPI key, configure the custom:runapi provider, and generate speech audio with ElevenLabs — six endpoints, one API key, per-character billing.

Browse models →