HERMES + ELEVENLABS

Use ElevenLabs in Hermes Agent.

ElevenLabs provides six audio endpoints through RunAPI — turbo-v2.5 TTS with sub-second latency, multilingual-v2 covering 29 languages, dialogue-v3 for multi-speaker conversations, sound effects, speech-to-text transcription, and vocal isolation. Hermes Agent calls them through the custom:runapi provider with one API key.

one API key · text to speech endpoint · per-character billing
Use RunAPI to generate speech audio with ElevenLabs text-to-speech.

Requirements:
- Read the API key from RUNAPI_API_KEY.
- Use the custom:runapi provider with base_url https://runapi.ai/v1.
- Call POST https://runapi.ai/api/v1/elevenlabs/text_to_speech
- Set model to "text-to-speech-turbo-v2.5".
- Set text to the content you want spoken.
- Optionally set voice to a specific ElevenLabs voice ID.
- Optionally set speed between 0.7 and 1.2.
- The task is async. Poll the returned task_id until status is "completed".
- When done, read the audio URL from the response output.

Example request:

curl -X POST https://runapi.ai/api/v1/elevenlabs/text_to_speech \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-to-speech-turbo-v2.5",
    "text": "Welcome to RunAPI. This audio was generated by ElevenLabs turbo v2.5.",
    "speed": 1.0,
    "stability": 0.5,
    "similarity_boost": 0.75
  }'

Example response:

{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "text-to-speech-turbo-v2.5"
}
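
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_tts_request and submit_tts are hypothetical, and only the URL, headers, and body fields from the curl command above are taken from this page.

```python
import json
import os
import urllib.request

TTS_URL = "https://runapi.ai/api/v1/elevenlabs/text_to_speech"


def build_tts_request(text, api_key, model="text-to-speech-turbo-v2.5", **options):
    """Build (but do not send) the POST request shown in the curl example.

    `options` carries optional body fields such as voice, speed, or stability.
    """
    payload = {"model": model, "text": text, **options}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_tts(text, **options):
    """Send the request and return the parsed JSON body, which should hold task_id."""
    request = build_tts_request(text, os.environ["RUNAPI_API_KEY"], **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

build_tts_request makes no network call, so the request can be inspected or logged before submit_tts sends it.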
HOW IT WORKS

Use ElevenLabs in Hermes Agent in three steps

1. Configure RunAPI

Set RUNAPI_API_KEY in the environment where Hermes Agent runs. If you already added RunAPI as a custom:runapi provider, the same key and base_url handle all ElevenLabs endpoints — TTS, STT, dialogue, sound effects, and audio isolation.

export RUNAPI_API_KEY=runapi_xxx
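
A script that runs alongside Hermes Agent can check for the key before making any request. This is a small sketch; load_runapi_key is a hypothetical helper name, and only the RUNAPI_API_KEY variable name comes from this page.

```python
import os


def load_runapi_key(env=os.environ):
    """Return the RunAPI key, failing early with a clear message if it is unset."""
    key = env.get("RUNAPI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RUNAPI_API_KEY is not set in this environment")
    return key
```

Failing at startup makes a missing key easier to diagnose than an authorization error returned later by the API.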

2. Call text_to_speech

Send a POST to the text_to_speech endpoint with model set to text-to-speech-turbo-v2.5, the text you want spoken, and optional voice, speed, and stability parameters. Hermes Agent routes the request through the custom:runapi provider. For multilingual output, use text-to-speech-multilingual-v2 with a voice and language_code.

POST /api/v1/elevenlabs/text_to_speech
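
For the multilingual case, the request body differs only in the model name and the added voice and language_code fields. The sketch below builds that body. The function name multilingual_payload and the "YOUR_VOICE_ID" placeholder are illustrative assumptions, not values from RunAPI.

```python
import json


def multilingual_payload(text, voice, language_code=None):
    """Build a request body for text-to-speech-multilingual-v2.

    This page lists voice as required for this model, so an empty
    voice is rejected before anything is sent.
    """
    if not voice:
        raise ValueError("voice is required for text-to-speech-multilingual-v2")
    payload = {
        "model": "text-to-speech-multilingual-v2",
        "text": text,
        "voice": voice,
    }
    if language_code is not None:
        payload["language_code"] = language_code
    return payload


# "YOUR_VOICE_ID" is a placeholder, not a real ElevenLabs voice ID.
print(json.dumps(multilingual_payload("Hola desde RunAPI.", "YOUR_VOICE_ID", "es"), indent=2))
```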

3. Poll for the result

The endpoint returns a task_id immediately. Poll the task status endpoint until the status is completed, then read the output audio URL from the response.

GET /api/v1/elevenlabs/text_to_speech/tsk_abc123
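
The polling step can be sketched as a loop with a timeout. This is a sketch under stated assumptions: the function names, the 2-second interval, and the 120-second timeout are arbitrary choices, and treating a "failed" status as terminal is a guess, since this page only shows "pending" and "completed".

```python
import json
import os
import time
import urllib.request

STATUS_URL = "https://runapi.ai/api/v1/elevenlabs/text_to_speech/{task_id}"


def fetch_task(task_id):
    """GET the task status endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        STATUS_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {os.environ['RUNAPI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_task, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until status is "completed" and return the final response.

    Assumption: a "failed" status means the task will never complete.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The exact shape of the output field is not shown on this page, so the sketch returns the whole response rather than guessing a key for the audio URL. The fetch and sleep arguments are injectable so the loop can be exercised without network access.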
PARAMETERS

ElevenLabs text_to_speech API parameters

Parameter         Type    Description
model             string  Required. text-to-speech-turbo-v2.5 (low latency) or text-to-speech-multilingual-v2 (29 languages).
text              string  Required. The text to convert to speech. Max 5000 characters.
voice             string  ElevenLabs voice ID. Required for multilingual-v2. Turbo-v2.5 uses a default voice if omitted.
speed             float   Optional. Playback speed multiplier. Range 0.7 to 1.2.
stability         float   Optional. Voice consistency. Range 0.0 to 1.0. Lower values add expressiveness.
similarity_boost  float   Optional. Voice similarity enforcement. Range 0.0 to 1.0.
style             float   Optional. Style exaggeration. Range 0.0 to 1.0.
language_code     string  Optional. Target language for multilingual-v2, e.g. en, es, ja.
callback_url      string  Optional. Webhook URL that receives a POST when the task completes.
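
The limits in the table above can be checked locally before a request is sent. The sketch below is a client-side pre-check only. The name validate_tts_params is hypothetical, and this page does not say how the API itself responds to out-of-range values.

```python
# Limits copied from the parameter table; this is a local pre-check only.
RANGES = {
    "speed": (0.7, 1.2),
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
}
MAX_TEXT_CHARS = 5000
MODELS = {"text-to-speech-turbo-v2.5", "text-to-speech-multilingual-v2"}


def validate_tts_params(params):
    """Return a list of problems found in a request body; empty means none found."""
    problems = []
    if params.get("model") not in MODELS:
        problems.append("model must be one of: " + ", ".join(sorted(MODELS)))
    text = params.get("text")
    if not isinstance(text, str) or not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    if params.get("model") == "text-to-speech-multilingual-v2" and not params.get("voice"):
        problems.append("voice is required for multilingual-v2")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in a request body at once.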

What is ElevenLabs on Hermes Agent?

ElevenLabs is a widely used text-to-speech API, and Hermes Agent calls it through the custom:runapi provider for voice generation, transcription, and audio processing. The key advantage in Hermes is chaining: generate speech, then pass the audio URL to InfiniteTalk for a talking avatar or to a video model for complete audiovisual content, all in one agent run. Six endpoints are available, including turbo TTS, multilingual voices, multi-speaker dialogue, and sound effects.

ElevenLabs use cases

Conversational AI voice agents

Build voice agents that speak naturally by generating speech through turbo-v2.5 with sub-second latency, suitable for customer service bots, interactive assistants, or phone-based interfaces.

YouTube content narration

Produce voiceover for YouTube videos in consistent character voices, adjusting stability for narrator consistency and style exaggeration for emotional range across an entire series.

Text-to-spoken-video pipelines

Chain ElevenLabs TTS with InfiniteTalk or other video models in a Hermes Agent workflow to go from text to narrated video with a talking avatar in a single automated run.

Hermes Agent general setup

Not configured yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

ElevenLabs model catalog

See all ElevenLabs variants, pricing, and API docs.

ElevenLabs on RunAPI →

Try ElevenLabs in Hermes Agent today.

Get a free RunAPI key, configure the custom:runapi provider, and generate speech audio with ElevenLabs — six endpoints, one API key, per-character billing.