DeepSeek
Reasoning-first LLMs via RunAPI — flash for fast, low-cost work; pro for complex agentic tasks.
# Works with Claude Code, Codex, Gemini CLI, Cursor, and 50+ agents
npx skills add runapi-ai/deepseek -g
Install the DeepSeek skill for me:
1. Clone https://github.com/runapi-ai/deepseek
2. Copy the skills/deepseek/ directory into your user-level skills directory (e.g. ~/.claude/skills/ for Claude Code, ~/.codex/skills/ for Codex).
3. Verify that SKILL.md is present.
4. Confirm the install path when done.
DeepSeek is a family of reasoning-first language models. deepseek-v4-flash is the fast, low-cost tier with an optional thinking mode; deepseek-v4-pro targets the hardest reasoning and agentic workloads. Both are available through RunAPI with one key and per-token billing.
- Multiple variants for different speed / quality tiers
- Single endpoint per action — text-to-X, image-to-X, etc.
- Streaming and async patterns supported per variant
- Failed generations are not charged
From prompt to tool call
# User prompt to the agent
"Refactor this Python module for readability, explain each change, then add unit tests for the edge cases."
// Code generated by the agent via @runapi.ai/deepseek
import { DeepseekClient } from '@runapi.ai/deepseek';

const client = new DeepseekClient();

// Same request shape as the direct-call examples on this page.
const result = await client.message.run({
  model: 'deepseek-v4-flash',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: 'Refactor this Python module for readability, explain each change, then add unit tests for the edge cases.',
    },
  ],
});
From install to first result in four steps
Install
One CLI command adds the skill bundle to your agent.
Configure
Set the variant ID and your RunAPI key. That's the whole config.
Call
The agent emits a typed tool call. Schema-validated, no glue code.
Receive
Stream the result back into your loop. Cached, signed, ready.
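The Configure step comes down to two values: a variant ID and a key. Here is a minimal sketch of collecting them, assuming the key lives in an environment variable named RUNAPI_KEY (the name used in the curl example on this page). `loadConfig`, `RunApiConfig`, and `KNOWN_MODELS` are hypothetical names for illustration, not part of the SDK.

```typescript
// Hypothetical helper, not part of @runapi.ai/deepseek.
interface RunApiConfig {
  apiKey: string;
  model: string;
}

// The two variant IDs named on this page.
const KNOWN_MODELS: string[] = ["deepseek-v4-flash", "deepseek-v4-pro"];

// Takes the environment as a plain object so it can be exercised without Node.js.
function loadConfig(
  env: Record<string, string | undefined>,
  model: string = "deepseek-v4-flash",
): RunApiConfig {
  const apiKey = env["RUNAPI_KEY"];
  if (!apiKey) {
    throw new Error("RUNAPI_KEY is not set");
  }
  if (KNOWN_MODELS.indexOf(model) === -1) {
    throw new Error(`Unknown variant ID: ${model}`);
  }
  return { apiKey, model };
}

// Usage in Node.js (assumption): const config = loadConfig(process.env);
```

Failing fast on a missing key or a mistyped variant ID keeps configuration errors out of the request path.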
Call directly from your code
# curl
curl -X POST https://runapi.ai/v1/messages \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Refactor this Python module for readability, explain each change, then add unit tests for the edge cases."
      }
    ]
  }'
// TypeScript
import { DeepseekClient } from "@runapi.ai/deepseek";

const client = new DeepseekClient();
const result = await client.message.run({
  model: "deepseek-v4-flash",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Refactor this Python module for readability, explain each change, then add unit tests for the edge cases.",
    },
  ],
});
# Ruby
require "runapi/deepseek"

client = RunApi::Deepseek::Client.new
result = client.message.run(
  model: "deepseek-v4-flash",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Refactor this Python module for readability, explain each change, then add unit tests for the edge cases."
    }
  ]
)
What is DeepSeek?
DeepSeek models are large language models tuned for reasoning, code, and agentic tool use. Through RunAPI they share a single API key with pay-as-you-go token billing, and are callable from both the OpenAI and Anthropic SDKs.
Why route DeepSeek through RunAPI
One auth, every provider
A single RunAPI key unlocks the whole catalog. No separate accounts, no key rotation per integration.
Unified pricing & billing
Per-token pricing in USD, billed monthly. Failed generations are not charged.
Schema-first SDK
Typed JSON schema across every variant. Tool calls validate before the wire.
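To illustrate what validating before the wire can look like, here is a minimal sketch of a client-side check. The request shape mirrors the fields in the curl example on this page (model, max_tokens, messages). `validateRequest` and the specific rules are assumptions for illustration, not the SDK's actual schema.

```typescript
// Hypothetical request shape, mirroring the fields in the curl example.
interface ChatMessage {
  role: string;
  content: string;
}

interface MessageRequest {
  model: string;
  max_tokens: number;
  messages: ChatMessage[];
}

// Returns a list of problems; an empty list means the request looks sendable.
// The rules below are illustrative assumptions, not RunAPI's published schema.
function validateRequest(req: MessageRequest): string[] {
  const errors: string[] = [];
  if (req.model.trim() === "") {
    errors.push("model must be a non-empty string");
  }
  // x % 1 is 0 only for finite integers, so this also rejects NaN and Infinity.
  if (req.max_tokens % 1 !== 0 || req.max_tokens <= 0) {
    errors.push("max_tokens must be a positive integer");
  }
  if (req.messages.length === 0) {
    errors.push("messages must contain at least one entry");
  }
  req.messages.forEach((message, index) => {
    if (message.role.trim() === "" || message.content.trim() === "") {
      errors.push(`messages[${index}] needs a role and non-empty content`);
    }
  });
  return errors;
}
```

Collecting every problem in one pass, rather than throwing on the first, gives an agent enough detail to repair the call in a single retry.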
Common questions
Which variant should I start with?
Pick the cheapest variant that meets your quality bar. Most teams start on deepseek-v4-flash and move to deepseek-v4-pro when a workload needs harder reasoning or agentic behavior.
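The selection rule above can be sketched as a small function: walk the variants from cheapest to most capable and return the first one whose evaluation score clears your bar. The `score` callback, the threshold, and the name `cheapestPassing` are hypothetical; how you score a variant (for example, pass rate on your own test prompts) is up to you.

```typescript
// Variants in ascending cost order, as described on this page:
// flash is the low-cost tier, pro targets the hardest workloads.
const VARIANTS_BY_COST = ["deepseek-v4-flash", "deepseek-v4-pro"] as const;
type Variant = (typeof VARIANTS_BY_COST)[number];

// Hypothetical helper: `score` is your own offline evaluation of a variant,
// returning a number where higher is better.
function cheapestPassing(
  score: (variant: Variant) => number,
  bar: number,
): Variant | undefined {
  for (const variant of VARIANTS_BY_COST) {
    if (score(variant) >= bar) {
      return variant;
    }
  }
  // No variant met the bar.
  return undefined;
}
```

Returning undefined when nothing passes makes the "no variant is good enough" case explicit instead of silently falling back to the most expensive tier.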
Is there a free tier?
New accounts get free first calls on every model. After that, pay per token.
Do you stream results?
Where streaming is available, RunAPI streams end-to-end.
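This page does not document the streaming wire format. As a sketch only, assuming the stream uses Server-Sent Events framing ("data:" lines, with events separated by a blank line), a client could split incoming text into complete payloads like this. `parseSse` is a hypothetical helper, and the framing is an assumption to verify against the API reference.

```typescript
// Assumption: Server-Sent Events framing with "\n" line endings.
// Streams that use "\r\n" would need normalizing first.
interface SseParseResult {
  payloads: string[]; // complete data payloads found in the buffer
  rest: string; // trailing partial event, to prepend to the next chunk
}

function parseSse(buffer: string): SseParseResult {
  const events = buffer.split("\n\n");
  // The last piece is either empty or an event that has not finished arriving.
  const rest = events.pop() ?? "";
  const payloads: string[] = [];
  for (const event of events) {
    const data = event
      .split("\n")
      .filter((line) => line.indexOf("data:") === 0)
      // Drop the field name and the single optional space that follows it.
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data !== "") {
      payloads.push(data);
    }
  }
  return { payloads, rest };
}
```

In a read loop, you would append each decoded chunk to `rest`, call `parseSse` on the result, and handle the returned payloads as they arrive.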
How are failures billed?
Failed generations are not charged.
Are outputs cached?
Generated outputs are stored and retrievable by task ID. Inputs are not cached.
Can I use it commercially?
Yes — commercial use is included for every variant unless a model license explicitly restricts it, which is called out on the variant page.
What about rate limits?
Per-key rate limits scale with usage tier. See the pricing page for current limits.
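When a key hits its limit, a common client-side pattern is to retry with exponential backoff. The sketch below assumes a rate-limited request is signaled with HTTP status 429, which this page does not confirm. `backoffDelayMs` and `withRetry` are hypothetical helpers, and the delay values are placeholders.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Calls `send` up to `maxAttempts` times, waiting between tries while the
// response status is 429 (assumed rate-limit signal). Returns the last response,
// which may still be a 429 if every attempt was rate limited.
async function withRetry<T extends { status: number }>(
  send: () => Promise<T>,
  maxAttempts = 4,
): Promise<T> {
  let response = await send();
  for (
    let attempt = 0;
    response.status === 429 && attempt < maxAttempts - 1;
    attempt++
  ) {
    const delay = backoffDelayMs(attempt);
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    response = await send();
  }
  return response;
}

// Usage (assumption): const res = await withRetry(() => fetch(url, init));
```

Capping the delay keeps a long outage from producing multi-minute waits, and handing back the final response lets the caller decide how to surface a persistent limit.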
Where can I report issues?
Open an issue on the public GitHub repo or email support.