Overview
A lightweight Agent framework with native-level DeepSeek adaptation — Precise tool calling in thinking mode, reliable structured output, maximum cache hit rate.
What is deepseek-kit?
deepseek-kit is a TypeScript Agent framework purpose-built for the DeepSeek API. Unlike general-purpose frameworks, deepseek-kit is optimized from the ground up for DeepSeek's core features — thinking mode, context caching, and structured output — ensuring that every DeepSeek-specific capability is used correctly and efficiently.
Why deepseek-kit?
LangChain.js and AI SDK are both excellent general-purpose frameworks, but the DeepSeek API has unique thinking mode and caching mechanisms that general-purpose frameworks struggle to accommodate. deepseek-kit solves these problems at their root.
Tool Calling in Thinking Mode — Precision Adaptation
DeepSeek's thinking mode outputs a chain of thought (`reasoning_content`) before the final response. When the model initiates a tool call during the thinking process, all subsequent requests must include the full `reasoning_content`; otherwise, the API returns a 400 error.
General-purpose frameworks manage all messages the same way, so they cannot distinguish between the "with tool call" and "without tool call" cases, which require different handling of `reasoning_content`. As a result, multi-turn tool calling fails frequently.
deepseek-kit's solution:
- Automatic Chain-of-Thought Tracking: Automatically preserves and re-sends `reasoning_content` in the agent loop, ensuring the thinking chain is never broken
- Differentiated Handling Strategy: Omits `reasoning_content` when there are no tool calls to reduce token consumption, and fully re-sends it when tool calls are present to ensure correctness
- Thinking Mode Enabled by Default: `createModel` enables thinking mode by default with `reasoningEffort: 'high'`, providing optimal reasoning quality without additional configuration
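The differentiated handling strategy above can be sketched as follows. This is a minimal illustration under assumptions, not deepseek-kit's actual implementation: the `ChatMessage` shape and the `prepareHistory` helper are hypothetical names, with field names modelled on the OpenAI-compatible chat format.

```typescript
// Hypothetical message shapes, modelled on the OpenAI-compatible chat format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  reasoning_content?: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}

// Keep reasoning_content only on assistant turns that issued tool calls;
// strip it from every other assistant turn to save tokens.
// The input array and its messages are not mutated.
function prepareHistory(history: ChatMessage[]): ChatMessage[] {
  return history.map((msg) => {
    if (msg.role !== "assistant") return msg;
    const hasToolCalls = (msg.tool_calls?.length ?? 0) > 0;
    if (hasToolCalls) return msg;
    const stripped: ChatMessage = { ...msg };
    delete stripped.reasoning_content;
    return stripped;
  });
}
```

A helper like this would run just before each request in the agent loop, so the stored history can keep every chain of thought while the outgoing payload carries only the ones assumed to be required.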
DeepSeek Thinking Mode Documentation
Higher Cache Hit Rates — Lower Costs
The DeepSeek API enables context hard disk caching by default. When subsequent requests have a prefix that exactly matches a previous request, the repeated portion only needs to be fetched from cache, significantly reducing latency and cost.
However, cache hits depend on strict consistency of the request prefix. General-purpose frameworks often inject dynamic metadata like timestamps and request IDs, or arrange messages in non-deterministic order — these behaviors break prefix consistency and cause cache hit rates to plummet.
deepseek-kit's solution:
- Zero-Redundancy Request Body: Only sends the fields required by the API, never injecting extra metadata, maximizing prefix matching probability
- Deterministic Message Construction: Messages are assembled in a fixed order, ensuring the same input always produces the same request prefix
- Observable Cache Hit Rates: The `Usage` type fully exposes `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens`, giving you real-time visibility into cache efficiency
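To make the prefix-consistency and observability points concrete, here is a minimal sketch. The `buildRequestBody` and `cacheHitRate` helpers are hypothetical names used for illustration and are not deepseek-kit's API; only the `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` field names come from the text above. The serialized JSON string is used here as a simple stand-in for the request prefix.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Subset of the usage fields named in the text above.
interface Usage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

// Build the body with a fixed key order and no per-request metadata
// (timestamps, request IDs), so identical inputs serialize identically
// and a longer conversation extends the earlier one as a prefix.
function buildRequestBody(model: string, system: string, turns: Message[]): string {
  const messages: Message[] = [{ role: "system", content: system }, ...turns];
  return JSON.stringify({ model, messages });
}

// Fraction of prompt tokens served from cache, in the range [0, 1].
function cacheHitRate(usage: Usage): number {
  const total = usage.prompt_cache_hit_tokens + usage.prompt_cache_miss_tokens;
  return total === 0 ? 0 : usage.prompt_cache_hit_tokens / total;
}
```

Logging the hit rate per request is one way to notice when a change to prompt construction has started breaking prefix matches.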
DeepSeek Context Hard Disk Cache Documentation
Better Structured Output — Thinking Mode Compatibility
Structured output is a common requirement for agent applications, but under DeepSeek's thinking mode, the structured output solutions of general-purpose frameworks often conflict with `reasoning_content` management, resulting in unreliable output formats.
deepseek-kit's solution:
- Zod Schema-Driven: Define output structures using Zod, gaining full TypeScript type inference
- Smart Retry with Error Feedback: When output doesn't conform to the schema, formatted validation errors are automatically fed back to the model for correction, with up to 3 retries
- Thinking Mode Compatible: The structured output flow fully preserves `reasoning_content`, never losing the chain-of-thought context during formatting steps
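The retry-with-feedback loop described above might look roughly like this. It is a sketch under assumptions rather than deepseek-kit's implementation: `generateStructured`, `ModelCall`, and `Validator` are hypothetical names, and a hand-written validator stands in for a Zod schema's `safeParse` so that the example has no dependencies.

```typescript
// Result shape loosely mirroring Zod's safeParse: either data or an error message.
type ValidationResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

type Validator<T> = (value: unknown) => ValidationResult<T>;

// Hypothetical model call: receives the prompt plus optional feedback from
// the previous failed attempt, and returns raw text that should be JSON.
type ModelCall = (prompt: string, feedback?: string) => Promise<string>;

async function generateStructured<T>(
  callModel: ModelCall,
  validate: Validator<T>,
  prompt: string,
  maxRetries = 3,
): Promise<T> {
  let feedback: string | undefined;
  // One initial attempt plus up to maxRetries corrective attempts.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const raw = await callModel(prompt, feedback);
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      feedback = "Output was not valid JSON.";
      continue;
    }
    const result = validate(parsed);
    if (result.success) return result.data;
    feedback = result.error; // fed back to the model on the next attempt
  }
  throw new Error(`No valid output after ${maxRetries} retries: ${feedback}`);
}
```

Passing the validation error back, rather than simply re-asking, gives the model a concrete description of what to correct on the next attempt.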
Core Features
- 🧠 Thinking Mode Adaptation — Automatic `reasoning_content` management, zero-configuration tool calling chains
- 💾 Cache Hit Rate Optimization — Zero-redundancy request body + deterministic message construction, maximizing DeepSeek cache benefits
- 📋 Structured Output — Zod Schema-driven, smart retry, fully compatible with thinking mode
- 🤖 Agent System — Build intelligent agents with tool calling capabilities
- 💬 Streaming — Streaming events for text, chain-of-thought, and tool calls
- 🔧 Tool Calling — Built-in tool definition, parameter validation, timeout, and retry
- ✍️ FIM Completion — Fill-in-the-Middle code completion support
- 🪝 Hook System — Insert custom logic before and after generation steps
- 🔄 Auto Retry — Smart retry strategy with exponential backoff and jitter
- 🌲 Tree-shakable — Pure ESM, `sideEffects: false`
- 🔒 Type Safe — Complete TypeScript type definitions
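As an illustration of the exponential backoff with jitter mentioned under Auto Retry, here is a minimal sketch. The function name, the default values, and the choice of the "full jitter" variant are assumptions made for illustration; the text above does not state which formula or defaults deepseek-kit uses.

```typescript
// Delay before retry number `attempt` (0-based), using "full jitter":
// a random value in [0, min(maxMs, baseMs * 2^attempt)).
// `random` is injectable so the behaviour can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  maxMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

The cap keeps late retries from waiting unboundedly long, and the randomization spreads out clients that failed at the same moment so they do not all retry together.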
Requirements
- Node.js >= 18.0.0
- DeepSeek API key
Quick Experience
Try it online — no local setup needed:
Or scaffold a project locally:
Quick Example
Community
If you have any questions about deepseek-kit, feel free to ask on GitHub Issues.

