Can I use ElevenLabs in Hermes Agent?

Yes. Configure RunAPI as a custom:runapi provider in Hermes Agent with base_url https://runapi.ai/v1 and key_env RUNAPI_API_KEY, then call any ElevenLabs endpoint -- text_to_speech, speech_to_text, text_to_dialogue, text_to_sound, or isolate_audio.

What stability and similarity settings produce the most natural voice?

Start with stability at 0.5 and similarity_boost at 0.75. Higher stability makes the voice more consistent but less expressive. Higher similarity keeps the voice closer to the original profile. For audiobooks, try stability 0.6-0.8. For conversational content, lower stability (0.3-0.5) adds natural variation.
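The starting points above could be captured as presets, sketched here in Python. The preset names and the specific midpoints (0.7 for audiobooks, 0.4 for conversational content) are illustrative assumptions picked from the ranges in the answer, not values published by RunAPI or ElevenLabs.

```python
# Illustrative presets based on the ranges quoted in the answer above.
# Preset names and exact midpoints are assumptions, not official values.
PRESETS = {
    "default": {"stability": 0.5, "similarity_boost": 0.75},
    "audiobook": {"stability": 0.7, "similarity_boost": 0.75},       # within 0.6-0.8
    "conversational": {"stability": 0.4, "similarity_boost": 0.75},  # within 0.3-0.5
}


def voice_settings(content_type: str = "default") -> dict:
    """Return a copy of the preset so callers can adjust it safely."""
    if content_type not in PRESETS:
        raise ValueError(f"unknown content type: {content_type!r}")
    return dict(PRESETS[content_type])


def with_voice_settings(payload: dict, content_type: str = "default") -> dict:
    """Merge preset settings into a text_to_speech payload without mutating it."""
    return {**payload, **voice_settings(content_type)}
```

Treat these as a first guess and tune by ear: the right values depend on the voice and the material.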

How do I reduce ElevenLabs costs for long-form content like audiobooks?

Use turbo-v2.5 for English content -- it costs roughly half as much per character as multilingual-v2. Break long texts into chunks under 5000 characters per request. Use the RunAPI batch approach to process chapters in parallel rather than sequentially.
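A minimal Python sketch of the chunking and parallel steps follows. The source does not describe the RunAPI batch approach, so a plain thread pool stands in for it here. `synthesize` is a hypothetical caller-supplied function that submits one chunk and returns its result.

```python
import re
from concurrent.futures import ThreadPoolExecutor

MAX_CHARS = 5000  # per-request text limit quoted on this page


def chunk_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_all(chunks, synthesize, max_workers: int = 4) -> list:
    """Run `synthesize` on each chunk in parallel, preserving chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, chunks))
```

Splitting on sentence boundaries keeps each request from starting or ending mid-sentence, which tends to matter for prosody. Keep `max_workers` modest unless you know the rate limits that apply to your key.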

Can I transcribe audio with ElevenLabs in Hermes Agent?

Yes. Call the speech_to_text endpoint at /api/v1/elevenlabs/speech_to_text with a source_audio_url. The endpoint supports optional speaker diarization via the diarize parameter and audio event tagging via tag_audio_events. Results are returned asynchronously.
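As a rough sketch, the request could be built in Python with only the standard library. The path and the source_audio_url, diarize, and tag_audio_events fields come from the answer above. The Bearer header mirrors the text_to_speech curl example and is assumed to apply here too. The endpoint may require further fields, such as a model name, that this page does not list.

```python
import json
import os
import urllib.request

STT_URL = "https://runapi.ai/api/v1/elevenlabs/speech_to_text"


def build_stt_request(source_audio_url: str, diarize: bool = False,
                      tag_audio_events: bool = False) -> urllib.request.Request:
    """Build (but do not send) a speech_to_text request."""
    payload = {"source_audio_url": source_audio_url}
    if diarize:
        payload["diarize"] = True
    if tag_audio_events:
        payload["tag_audio_events"] = True
    return urllib.request.Request(
        STT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: same Bearer auth as the text_to_speech example.
            "Authorization": f"Bearer {os.environ.get('RUNAPI_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the request to `urllib.request.urlopen` to submit it. Since results are asynchronous, the response should carry a task identifier to poll, not the transcript itself.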

How does audio isolation work through RunAPI?

Call the isolate_audio endpoint at /api/v1/elevenlabs/isolate_audio with a source_audio_url pointing to your mixed audio file. The endpoint extracts vocals from background noise and returns a cleaned audio URL. The task is async -- poll or use a callback_url.
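The request body for this call might be assembled as below. The two field names come from the answer above. The http(s) URL check is an added assumption, a cheap local guard against typos and not a documented RunAPI rule.

```python
from typing import Optional
from urllib.parse import urlparse

ISOLATE_PATH = "/api/v1/elevenlabs/isolate_audio"


def _require_http_url(name: str, url: str) -> None:
    """Reject values that are not absolute http(s) URLs (local sanity check only)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{name} must be an absolute http(s) URL, got {url!r}")


def isolate_audio_payload(source_audio_url: str,
                          callback_url: Optional[str] = None) -> dict:
    """Build the JSON body for isolate_audio; omit callback_url to poll instead."""
    _require_http_url("source_audio_url", source_audio_url)
    payload = {"source_audio_url": source_audio_url}
    if callback_url is not None:
        _require_http_url("callback_url", callback_url)
        payload["callback_url"] = callback_url
    return payload
```

A callback avoids repeated status requests but needs a publicly reachable endpoint. Polling is simpler for scripts and local runs.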

Can Hermes Agent chain ElevenLabs TTS with video generation in one workflow?

Yes. Hermes Agent can generate speech with ElevenLabs, then pass the audio URL to InfiniteTalk for avatar video or to Wan for speech-to-video, creating a complete text-to-spoken-video pipeline in one run.

HERMES + ELEVENLABS

Use ElevenLabs in Hermes Agent.

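ElevenLabs provides six audio endpoints through RunAPI: turbo-v2.5 TTS with sub-second latency, multilingual-v2 covering 29 languages, dialogue-v3 for multi-speaker conversations, sound effects, speech-to-text transcription, and vocal isolation. Hermes Agent calls all of them with a single API key through the custom:runapi provider.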


One API key · Text-to-speech endpoint · Per-character billing

Generate audio with ElevenLabs text-to-speech through RunAPI.

Requirements:
- Read the API key from RUNAPI_API_KEY.
- Use the custom:runapi provider with base_url set to https://runapi.ai/v1.
- Call POST https://runapi.ai/api/v1/elevenlabs/text_to_speech.
- Set model to "text-to-speech-turbo-v2.5".
- Set text to the content you want spoken.
- Optionally set voice to a specific ElevenLabs voice ID.
- Optionally set speed to a value between 0.7 and 1.2.
- The task is asynchronous. Poll the returned task_id until status is "completed".
- When it completes, read the audio URL from output in the response.

curl -X POST https://runapi.ai/api/v1/elevenlabs/text_to_speech \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-to-speech-turbo-v2.5",
    "text": "Welcome to RunAPI. This audio was generated by ElevenLabs turbo v2.5.",
    "speed": 1.0,
    "stability": 0.5,
    "similarity_boost": 0.75
  }'

{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "text-to-speech-turbo-v2.5"
}
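For callers who prefer a script to curl, the same request can be sketched in Python using only the standard library. The URL, headers, and fields mirror the curl example. The function names are illustrative, and the `task_id` field is read as shown in the sample response.

```python
import json
import os
import urllib.request

TTS_URL = "https://runapi.ai/api/v1/elevenlabs/text_to_speech"


def build_tts_request(text: str, api_key: str,
                      model: str = "text-to-speech-turbo-v2.5",
                      **options) -> urllib.request.Request:
    """Build the POST request; `options` carries fields such as speed or stability."""
    payload = {"model": model, "text": text, **options}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_tts(text: str, **options) -> str:
    """Send the request and return the task_id from the JSON response."""
    api_key = os.environ["RUNAPI_API_KEY"]  # raises KeyError if the key is not set
    request = build_tts_request(text, api_key, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["task_id"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without network access.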


How it works

Three steps to use ElevenLabs in Hermes Agent

Configure RunAPI

Set RUNAPI_API_KEY in the environment where Hermes Agent runs. If you already added RunAPI as a custom:runapi provider, the same key and base_url handle all ElevenLabs endpoints -- TTS, STT, dialogue, sound effects, and audio isolation.

export RUNAPI_API_KEY=runapi_xxx

Call text_to_speech

Send a POST to the text_to_speech endpoint with model set to text-to-speech-turbo-v2.5, the text you want spoken, and optional voice, speed, and stability parameters. Hermes Agent routes the request through the custom:runapi provider. For multilingual output, use text-to-speech-multilingual-v2 with a voice and language_code.

POST /api/v1/elevenlabs/text_to_speech

Poll for the result

The endpoint returns a task_id immediately. Poll the task status endpoint until the status is completed, then read the output audio URL from the response.

GET /api/v1/elevenlabs/text_to_speech/tsk_abc123
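The polling step could look like the Python sketch below. The status path and the "completed" state come from this page. The "failed" and "error" states, the interval, and the timeout are assumptions to adjust against the real API. The fetch function is injected so the loop can be exercised without network access.

```python
import json
import os
import time
import urllib.request

STATUS_URL = "https://runapi.ai/api/v1/elevenlabs/text_to_speech/{task_id}"


def fetch_tts_status(task_id: str) -> dict:
    """GET the task status and return the decoded JSON body."""
    request = urllib.request.Request(
        STATUS_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {os.environ['RUNAPI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def poll_task(fetch_status, task_id: str, interval: float = 2.0,
              timeout: float = 120.0, sleep=time.sleep) -> dict:
    """Call fetch_status(task_id) until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "completed":
            return task  # the audio URL is expected under task["output"]
        if status in ("failed", "error"):  # assumed failure states
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not completed after {timeout}s")
        sleep(interval)
```

A typical call is `poll_task(fetch_tts_status, "tsk_abc123")`. The exact shape of `output` is not shown on this page, so inspect the completed response before hard-coding a key.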

Parameters

ElevenLabs text_to_speech API parameters

Parameter	Type	Description
`model`	`string`	Required. text-to-speech-turbo-v2.5 (low latency) or text-to-speech-multilingual-v2 (29 languages).
`text`	`string`	Required. The text to convert to speech. Max 5000 characters.
`voice`	`string`	ElevenLabs voice ID. Required for multilingual-v2. Turbo-v2.5 uses a default voice if omitted.
`speed`	`float`	Optional. Playback speed multiplier. Range 0.7 to 1.2.
`stability`	`float`	Optional. Voice consistency. Range 0.0 to 1.0. Lower values add expressiveness.
`similarity_boost`	`float`	Optional. Voice similarity enforcement. Range 0.0 to 1.0.
`style`	`float`	Optional. Style exaggeration. Range 0.0 to 1.0.
`language_code`	`string`	Optional. Target language for multilingual-v2, e.g. en, es, ja.
`callback_url`	`string`	Optional. Webhook URL that receives a POST when the task completes.
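The constraints in the table can be checked client-side before a request is sent, as in this Python sketch. The limits are copied from the table. The function name and error messages are illustrative, and the server remains the authority on what is valid.

```python
MODELS = {"text-to-speech-turbo-v2.5", "text-to-speech-multilingual-v2"}
MAX_TEXT_CHARS = 5000
RANGES = {
    "speed": (0.7, 1.2),
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
}


def validate_tts_params(params: dict) -> list[str]:
    """Return a list of problems found; an empty list means the payload looks valid."""
    errors = []
    model = params.get("model")
    if model not in MODELS:
        errors.append(f"model must be one of {sorted(MODELS)}")
    text = params.get("text")
    if not isinstance(text, str) or not text:
        errors.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        errors.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    if model == "text-to-speech-multilingual-v2" and not params.get("voice"):
        errors.append("voice is required for multilingual-v2")
    for name, (low, high) in RANGES.items():
        if name in params:
            value = params[name]
            if not isinstance(value, (int, float)) or not low <= value <= high:
                errors.append(f"{name} must be a number between {low} and {high}")
    return errors
```

Returning every problem at once, instead of raising on the first, lets an agent correct a payload in a single pass.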

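What is ElevenLabs in Hermes Agent?

ElevenLabs is a leading text-to-speech API, and Hermes Agent calls it through the custom:runapi provider for speech generation, transcription, and audio processing. The main advantage in Hermes is chaining: generate speech, then pass the audio URL to InfiniteTalk to create a talking avatar, or to a video model to produce complete audiovisual content, all within a single agent run.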

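ElevenLabs use cases

Conversational AI voice agents

Build voice agents that speak with natural-sounding audio generated at turbo-v2.5's sub-second latency, for customer service bots, interactive assistants, and phone-based interfaces.

YouTube content narration

Produce narration for YouTube videos with a consistent character voice: tune Stability to keep the narrator consistent across a series, and adjust Style Exaggeration for emotional range.

Text-to-talking-video pipeline

Chain ElevenLabs TTS with InfiniteTalk or another video model in a Hermes Agent workflow to complete the whole flow from text to a narrated talking-avatar video in a single automated run.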

FAQ

Questions about ElevenLabs + Hermes Agent

Hermes Agent basic setup

Not set up yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

ElevenLabs model catalog

Browse all ElevenLabs variants, pricing, and API docs.

ElevenLabs models →

Try ElevenLabs in Hermes Agent now.

Get a free RunAPI key, configure the custom:runapi provider, and generate audio with ElevenLabs: six endpoints, one API key, per-character billing.

View models →