DeepSeek: DeepSeek V4 Flash
deepseek/deepseek-v4-flash
About
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It shares the V4 architecture with DeepSeek V4 Pro, including a hybrid attention mechanism (Compressed Sparse Attention and Heavily Compressed Attention) for efficient long-context processing and configurable reasoning modes. The design targets fast inference and high-throughput workloads while maintaining reasoning and coding performance, making it suitable for coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency matter.
Capabilities
- Context Length: 1.0M
- Max Output: 384K
- Reasoning: Yes
- Input: text
- Output: text
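The context and output limits above interact when choosing an output cap for a long prompt. The following is a minimal sketch, not provider code: it assumes "1.0M" and "384K" are decimal token counts and that prompt and output tokens share one context window. The helper name `max_output_budget` is hypothetical.

```python
# Assumption: "1.0M" and "384K" are read as decimal token counts; the
# provider's exact limits may differ (for example 1,048,576 and 393,216).
CONTEXT_WINDOW = 1_000_000
MAX_OUTPUT = 384_000


def max_output_budget(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return the largest output size that fits alongside the prompt.

    Assumption: prompt and output tokens count against the same window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    # Never exceed the caller's request, the model's cap, or the space left.
    return min(requested, MAX_OUTPUT, remaining)
```

Under these assumptions, a 900,000-token prompt leaves room for at most 100,000 output tokens, while a short prompt is limited only by the 384K output cap.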
Benchmarks
[Charts: Reasoning & Knowledge; Coding & Agentic]
Source: Artificial Analysis
Pricing
| Type | Price / 1M tokens |
|---|---|
| Input | $0.0983 |
| Output | $0.1966 |
| Cache Read | $0.0197 |
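To make the per-million prices concrete, here is a small cost estimator. It is a sketch under one stated assumption: tokens served from the prompt cache are billed at the Cache Read rate instead of the Input rate. The function name `estimate_cost_usd` is made up for this example, and actual billing rules may differ.

```python
# USD per 1M tokens, copied from the pricing table above.
INPUT_PER_M = 0.0983
OUTPUT_PER_M = 0.1966
CACHE_READ_PER_M = 0.0197


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: `cached_tokens` is the part of `input_tokens` that was read
    from the prompt cache and is billed at the cache-read rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total_per_million = (
        uncached * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total_per_million / 1_000_000
```

For example, `estimate_cost_usd(100_000, 5_000, cached_tokens=20_000)` charges 80,000 tokens at the input rate, 20,000 at the cache-read rate, and 5,000 at the output rate, which comes to a little under one cent.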
OpenAI-compatible · Model ID deepseek/deepseek-v4-flash
curl https://api.elliotgate.com/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
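The same request can be issued from Python using only the standard library. This is a sketch rather than an official client: the URL, headers, and payload mirror the curl call above, while the response parsing assumes the OpenAI chat-completions shape (`choices[0].message.content`), which the page implies by calling the endpoint OpenAI-compatible but does not show. The function names `build_request` and `chat` are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.elliotgate.com/v1/chat/completions"
MODEL_ID = "deepseek/deepseek-v4-flash"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(prompt: str, api_key: str, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first reply.

    Assumption: the response body follows the OpenAI chat-completions
    layout, i.e. choices[0].message.content holds the reply text.
    """
    request = build_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("Hello!", api_key)` with a valid key performs the network request. Keeping `build_request` separate lets the payload and headers be inspected without contacting the API.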
DeepSeek: DeepSeek V4 Flash comparisons
Decide which model wins on the dimensions that matter for your workload: context, benchmarks, pricing, or serving latency.
- DeepSeek: DeepSeek V4 Flash vs Claude Opus 4.6
- DeepSeek: DeepSeek V4 Flash vs Claude Opus 4.6 (Fast)
- DeepSeek: DeepSeek V4 Flash vs GLM 5 Turbo
  DeepSeek V4 Flash and GLM-5 Turbo compete directly as reasoning-capable text models aimed at agent workloads, but their positioning diverges on context, price, and benchmark profile.
- DeepSeek: DeepSeek V4 Flash vs Kimi K2.5
- DeepSeek: DeepSeek V4 Flash vs Qwen3.6 27B
- DeepSeek: DeepSeek V4 Flash vs GPT-5.1 Chat
- DeepSeek: DeepSeek V4 Flash vs GPT-5.1
- DeepSeek: DeepSeek V4 Flash vs Qwen3.5 397B A17B