Overview

A lightweight agent framework built natively for DeepSeek: precise tool calling in thinking mode, reliable structured output, and maximum cache hit rates.

What is deepseek-kit?

deepseek-kit is a TypeScript Agent framework purpose-built for the DeepSeek API. Unlike general-purpose frameworks, deepseek-kit is optimized from the ground up for DeepSeek's core features — thinking mode, context caching, and structured output — ensuring that every DeepSeek-specific capability is used correctly and efficiently.

Why deepseek-kit?

LangChain.js and AI SDK are both excellent general-purpose frameworks, but the DeepSeek API's thinking mode and caching mechanisms impose requirements that general-purpose frameworks struggle to accommodate. deepseek-kit solves these problems at their root.

Tool Calling in Thinking Mode — Precision Adaptation

DeepSeek's thinking mode outputs a chain of thought (reasoning_content) before the final response. When the model initiates a tool call during the thinking process, all subsequent requests must include the full reasoning_content; otherwise, the API returns a 400 error.

Message management in general-purpose frameworks does not distinguish between how reasoning_content must be handled in the "with tool call" and "without tool call" scenarios, which leads to frequent failures in multi-turn tool calling.

deepseek-kit's solution:

  • Automatic Chain-of-Thought Tracking: Preserves and re-sends reasoning_content in the agent loop, so the thinking chain is never broken
  • Differentiated Handling Strategy: Omits reasoning_content when there are no tool calls to reduce token consumption, and fully re-sends it when tool calls are present to ensure correctness
  • Thinking Mode Enabled by Default: createModel enables thinking mode by default with reasoningEffort: 'high', providing optimal reasoning quality without additional configuration
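The differentiated handling described above can be sketched as a small history filter. This is an illustration under stated assumptions, not deepseek-kit's internal code: the ChatMessage shape is loosely modeled on the chat-completions message format, prepareHistory is a hypothetical helper name, and applying the rule per assistant message is an assumption made for the example.

```typescript
// Hypothetical message shapes, loosely modeled on the chat-completions format.
interface ToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string }
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string
  reasoning_content?: string
  tool_calls?: ToolCall[]
  tool_call_id?: string
}

// Keep reasoning_content only on assistant messages that issued tool calls;
// drop it from every other message to save tokens. The input is not mutated.
function prepareHistory(messages: readonly ChatMessage[]): ChatMessage[] {
  return messages.map((message) => {
    const hasToolCalls =
      message.role === 'assistant' && (message.tool_calls?.length ?? 0) > 0
    const copy: ChatMessage = { ...message }
    if (!hasToolCalls) delete copy.reasoning_content
    return copy
  })
}
```

An assistant turn that called get_weather would keep its reasoning_content when the history is re-sent, while a plain-text assistant reply would be sent without it.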

Higher Cache Hit Rates — Lower Costs

The DeepSeek API enables on-disk context caching by default. When a request shares an exact prefix with a previous request, the repeated portion is served from the cache, which significantly reduces latency and cost.

However, cache hits depend on strict consistency of the request prefix. General-purpose frameworks often inject dynamic metadata like timestamps and request IDs, or arrange messages in non-deterministic order — these behaviors break prefix consistency and cause cache hit rates to plummet.
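To make the prefix argument concrete, the sketch below compares how much leading text two requests share when the system prompt is stable versus when a timestamp is injected at its start. It is only an analogy: the real cache matches on the prompt as the API processes it, not on raw JSON characters, and the serialize helper and prompt strings are invented for the example.

```typescript
type Message = { role: 'system' | 'user'; content: string }

// Length of the leading substring shared by two strings.
function commonPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length)
  let i = 0
  while (i < limit && a[i] === b[i]) i++
  return i
}

const SYSTEM_PROMPT = 'You are a helpful weather assistant.'

// Serialize a two-message conversation in a fixed order.
function serialize(systemText: string, userText: string): string {
  const messages: Message[] = [
    { role: 'system', content: systemText },
    { role: 'user', content: userText },
  ]
  return JSON.stringify(messages)
}

// Stable: the system prompt is identical across requests, so the shared
// prefix extends past it and into the user message.
const stableShared = commonPrefixLength(
  serialize(SYSTEM_PROMPT, 'Weather in Chongqing?'),
  serialize(SYSTEM_PROMPT, 'Weather in Chengdu?'),
)

// Volatile: a timestamp at the start of the system prompt makes the two
// requests diverge before the reusable instructions even begin.
const volatileShared = commonPrefixLength(
  serialize(`[2025-01-01T00:00:00Z] ${SYSTEM_PROMPT}`, 'Weather in Chongqing?'),
  serialize(`[2025-01-01T00:00:05Z] ${SYSTEM_PROMPT}`, 'Weather in Chengdu?'),
)

console.log({ stableShared, volatileShared })
```

The stable pair shares everything up to the point where the user questions differ; the volatile pair diverges inside the timestamp, so nothing after it can be reused.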

deepseek-kit's solution:

  • Zero-Redundancy Request Body: Only sends the fields required by the API, never injecting extra metadata, maximizing prefix matching probability
  • Deterministic Message Construction: Messages are assembled in a fixed order, ensuring the same input always produces the same request prefix
  • Observable Cache Hit Rates: The Usage type fully exposes prompt_cache_hit_tokens and prompt_cache_miss_tokens, giving you real-time visibility into cache efficiency
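As a minimal sketch of how the two usage fields above can be turned into a hit rate: the CacheUsage interface below is a hypothetical subset containing only those two fields, and how a deepseek-kit result exposes its Usage object is not shown here.

```typescript
// Hypothetical subset of a usage object; only the two cache fields are modeled.
interface CacheUsage {
  prompt_cache_hit_tokens: number
  prompt_cache_miss_tokens: number
}

// Fraction of prompt tokens served from cache, in the range [0, 1].
// Returns 0 when no prompt tokens were counted, to avoid dividing by zero.
function cacheHitRate(usage: CacheUsage): number {
  const total = usage.prompt_cache_hit_tokens + usage.prompt_cache_miss_tokens
  return total === 0 ? 0 : usage.prompt_cache_hit_tokens / total
}

const rate = cacheHitRate({
  prompt_cache_hit_tokens: 750,
  prompt_cache_miss_tokens: 250,
})
console.log(`${(rate * 100).toFixed(1)}% of prompt tokens came from cache`)
```

Logging this value per request is one way to notice when a change to prompt construction has broken prefix consistency.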

Better Structured Output — Thinking Mode Compatibility

Structured output is a high-frequency requirement for agent applications, but under DeepSeek's thinking mode, general-purpose frameworks' structured output solutions often conflict with reasoning_content management, resulting in unreliable output formats.

deepseek-kit's solution:

  • Zod Schema-Driven: Define output structures using Zod, gaining full TypeScript type inference
  • Smart Retry with Error Feedback: When output doesn't conform to the schema, formatted validation errors are automatically fed back to the model for correction, with up to 3 retries
  • Thinking Mode Compatible: The structured output flow fully preserves reasoning_content, never losing the chain-of-thought context during formatting steps
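The retry-with-error-feedback idea above can be sketched as a generic loop. Everything named here is an assumption for illustration: ask stands in for a model call, validate stands in for schema validation (with Zod it could wrap a schema's safeParse), and generateStructured is a hypothetical helper, not deepseek-kit's API. The sketch also leaves out reasoning_content handling.

```typescript
// Outcome of checking a candidate value against a schema.
type Validation<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] }

// Hypothetical stand-ins for a model call and a schema validator.
type Ask = (prompt: string) => Promise<string>
type Validate<T> = (candidate: unknown) => Validation<T>

// Parse JSON without throwing, reporting failure in the same shape as validation.
function parseJson(text: string): Validation<unknown> {
  try {
    return { ok: true, value: JSON.parse(text) }
  } catch {
    return { ok: false, errors: ['Reply was not valid JSON'] }
  }
}

// One initial attempt plus up to maxRetries corrections. Each failed attempt
// appends the validation errors to the prompt so the model can fix them.
async function generateStructured<T>(
  ask: Ask,
  validate: Validate<T>,
  prompt: string,
  maxRetries = 3,
): Promise<T> {
  let currentPrompt = prompt
  let lastErrors: string[] = []
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const reply = await ask(currentPrompt)
    const parsed = parseJson(reply)
    const result = parsed.ok ? validate(parsed.value) : parsed
    if (result.ok) return result.value
    lastErrors = result.errors
    currentPrompt =
      `${prompt}\n\nYour previous reply was rejected:\n- ` +
      `${result.errors.join('\n- ')}\nReturn corrected JSON only.`
  }
  throw new Error(
    `Structured output failed after ${maxRetries} retries: ${lastErrors.join('; ')}`,
  )
}
```

Feeding the formatted errors back, rather than repeating the same prompt, gives the model something specific to correct on each retry.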

Core Features

  • 🧠 Thinking Mode Adaptation — Automatic reasoning_content management, zero-configuration tool calling chains
  • 💾 Cache Hit Rate Optimization — Zero-redundancy request body + deterministic message construction, maximizing DeepSeek cache benefits
  • 📋 Structured Output — Zod Schema-driven, smart retry, fully compatible with thinking mode
  • 🤖 Agent System — Build intelligent agents with tool calling capabilities
  • 💬 Streaming — Streaming events for text, chain-of-thought, and tool calls
  • 🔧 Tool Calling — Built-in tool definition, parameter validation, timeout, and retry
  • ✍️ FIM Completion — Fill-in-the-Middle code completion support
  • 🪝 Hook System — Insert custom logic before and after generation steps
  • 🔄 Auto Retry — Smart retry strategy with exponential backoff and jitter
  • 🌲 Tree-shakable — Pure ESM, sideEffects: false
  • 🔒 Type Safe — Complete TypeScript type definitions
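As one illustration of the auto-retry item in the list above, exponential backoff with jitter is commonly computed along the following lines. The option names and the "full jitter" variant are assumptions chosen for the sketch; deepseek-kit's actual retry settings may differ.

```typescript
// Hypothetical option names for the sketch.
interface BackoffOptions {
  baseDelayMs: number
  maxDelayMs: number
}

// "Full jitter": the delay ceiling doubles with each attempt, is capped at
// maxDelayMs, and the actual delay is drawn uniformly below that ceiling.
// The random source is injectable so the function can be tested deterministically.
function backoffDelay(
  attempt: number,
  options: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(options.maxDelayMs, options.baseDelayMs * 2 ** attempt)
  return Math.floor(random() * ceiling)
}

const retryOptions: BackoffOptions = { baseDelayMs: 500, maxDelayMs: 8000 }
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt}: wait ${backoffDelay(attempt, retryOptions)} ms`)
}
```

Randomizing the delay spreads retries from many clients over time instead of letting them hit the API in synchronized bursts.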

Requirements

  • Node.js >= 18.0.0
  • DeepSeek API key

Try It Out

Try it online — no local setup needed:

Or scaffold a project locally:

npx create-deepseek-kit my-agent

Quick Example

import { createAgent, createModel, tool } from 'deepseek-kit'
import { z } from 'zod'

const model = createModel({ model: 'deepseek-v4-flash' })

const weatherTool = tool({
  name: 'get_weather',
  description: 'Get weather information for a specified city',
  schema: z.object({
    city: z.string().describe('City name'),
  }),
  execute: async ({ city }) => `${city}: Sunny, 25°C`,
})

const agent = createAgent({
  model,
  tools: [weatherTool],
})

const result = await agent.generate({
  prompt: 'How\'s the weather in Chongqing today?',
})

console.log(result.text)

Community

If you have any questions about deepseek-kit, feel free to ask on GitHub Issues.

MIT
