AI SDK Integration

Use deepseek-kit alongside Vercel AI SDK: DeepSeek handles text reasoning while AI SDK handles multimodal input and UI integration.

Vercel AI SDK is a widely used framework for building AI applications. It supports multiple model providers and ships UI integration hooks. Pairing deepseek-kit with AI SDK lets you use AI SDK's multimodal capabilities (image understanding, file processing, and so on) to cover what DeepSeek V4 does not handle, while keeping DeepSeek's low-cost, high-performance text reasoning.

Installation

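pnpm add deepseek-kit ai @ai-sdk/openai zod

The Multi-Model Collaboration example further down also imports @ai-sdk/anthropic, so add that package if you plan to run it.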
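Both sides of the integration need credentials at runtime. The Considerations section names DEEPSEEK_API_KEY and, for the OpenAI provider, OPENAI_API_KEY. A minimal startup check might look like the sketch below. The names `missingKeys` and `REQUIRED_KEYS` are hypothetical and are not part of deepseek-kit or AI SDK.

```typescript
// Hypothetical helper for illustration; not part of deepseek-kit or AI SDK.
// Returns the names of required variables that are unset or blank.
function missingKeys(
  env: Record<string, string | undefined>,
  required: readonly string[],
): string[] {
  return required.filter((keyName) => {
    const value = env[keyName]
    return value === undefined || value.trim() === ''
  })
}

// Key names taken from the Considerations section of this page.
const REQUIRED_KEYS = ['DEEPSEEK_API_KEY', 'OPENAI_API_KEY'] as const
```

At startup, calling `missingKeys(process.env, REQUIRED_KEYS)` and failing fast on a non-empty result gives a clearer message than an authentication error raised in the middle of a tool call.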

Pattern 1: deepseek-kit as an AI SDK Subagent

In this pattern, AI SDK's ToolLoopAgent serves as the main agent handling multimodal input, while the deepseek-kit agent is wrapped as a tool handling text tasks that require deep reasoning and tool calling.

Scenario: Image Analysis + Deep Research

A user sends an image. AI SDK's multimodal model first interprets the image content, then passes the analysis to the DeepSeek agent for in-depth research:

import { openai } from '@ai-sdk/openai'
import { tool as aiTool, ToolLoopAgent } from 'ai'
import { createAgent, createModel, tool } from 'deepseek-kit'
import { z } from 'zod'

const deepseekModel = createModel({ model: 'deepseek-v4-flash' })

// searchTool is assumed to be a deepseek-kit tool defined elsewhere with tool().
const researchAgent = createAgent({
  model: deepseekModel,
  system: 'You are a research assistant. Conduct in-depth analysis based on provided information and clearly summarize your findings in your final response.',
  tools: [searchTool],
})

const researchTool = aiTool({
  description: 'Use DeepSeek to conduct in-depth research and analysis on a given topic',
  // AI SDK 5 and later name this field inputSchema (it was parameters in v4).
  inputSchema: z.object({
    topic: z.string().describe('The topic to research'),
    context: z.string().describe('Related context information, such as image analysis results'),
  }),
  execute: async ({ topic, context }) => {
    const result = await researchAgent.generate({
      prompt: `Based on the following context, research this topic in depth: ${topic}\n\nContext: ${context}`,
    })
    return result.text
  },
})

const mainAgent = new ToolLoopAgent({
  model: openai('gpt-4o'),
  tools: { research: researchTool },
})

When a user sends an image:

  1. GPT-4o understands the image content and extracts key information
  2. GPT-4o determines that in-depth research is needed and calls the research tool
  3. The DeepSeek agent executes the research task in an isolated context
  4. The research result is returned to GPT-4o, which generates the final response
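Because the DeepSeek agent in step 3 does not see the main agent's conversation, anything it needs has to travel in the tool arguments. The sketch below shows one way to assemble that context from prior turns under a size cap. `Turn`, `buildContext`, and the character budget are hypothetical names chosen for illustration; neither library provides them.

```typescript
// Hypothetical types and helper for illustration only.
interface Turn {
  role: 'user' | 'assistant'
  text: string
}

// Keep the most recent turns that fit within maxChars.
// The output lists the kept turns oldest first, one per line.
function buildContext(turns: readonly Turn[], maxChars: number): string {
  const kept: string[] = []
  let used = 0
  for (const turn of [...turns].reverse()) {
    const line = `${turn.role}: ${turn.text}`
    // Every line after the first costs one extra character for the joining newline.
    const cost = line.length + (kept.length > 0 ? 1 : 0)
    if (used + cost > maxChars) break
    kept.unshift(line)
    used += cost
  }
  return kept.join('\n')
}
```

The returned string can be passed as the `context` argument of the research tool shown above. A character budget is only a stand-in for a token budget; swap in a tokenizer if precise limits matter.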

Scenario: Cost-Optimized Routing

Text-only work goes to DeepSeek; multimodal understanding stays with GPT-4o:

import { openai } from '@ai-sdk/openai'
import { tool as aiTool, generateText, stepCountIs } from 'ai'
import { createAgent, createModel } from 'deepseek-kit'
import { z } from 'zod'

const deepseekModel = createModel({ model: 'deepseek-v4-flash' })

const deepseekAgent = createAgent({
  model: deepseekModel,
  system: 'You are an efficient text processing assistant. Complete tasks concisely and accurately.',
})

const textProcessingTool = aiTool({
  description: 'Process text-only tasks such as writing, translation, summarization, code generation, etc.',
  // AI SDK 5 and later name this field inputSchema (it was parameters in v4).
  inputSchema: z.object({
    task: z.string().describe('The text task to process'),
  }),
  execute: async ({ task }) => {
    const result = await deepseekAgent.generate({ prompt: task })
    return result.text
  },
})

// imageUrl is assumed to be defined by the caller.
const result = await generateText({
  model: openai('gpt-4o'),
  tools: { processText: textProcessingTool },
  // Allow follow-up steps so GPT-4o can use the tool result in its final answer
  // (AI SDK 5 and later; v4 used maxSteps instead).
  stopWhen: stepCountIs(5),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Analyze the architecture in this image, then write a technical document.' },
        { type: 'image', image: imageUrl },
      ],
    },
  ],
})

GPT-4o first understands the image, then delegates the document writing task to DeepSeek via the processText tool, saving costs.
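The example above lets GPT-4o decide when to delegate. When the input type is known up front, the same split can be made before any model is called, which skips the GPT-4o hop for text-only requests. This is a minimal sketch, assuming a simplified message-part shape modeled on the `text` and `image` parts used above; `Part`, `Route`, and `chooseRoute` are hypothetical names.

```typescript
// Simplified part shape for illustration; AI SDK's real types accept more variants.
type Part =
  | { type: 'text'; text: string }
  | { type: 'image'; image: string }

type Route = 'deepseek' | 'multimodal'

// Any image part forces the multimodal model; text-only input goes to DeepSeek.
function chooseRoute(content: string | readonly Part[]): Route {
  if (typeof content === 'string') return 'deepseek'
  return content.some((part) => part.type === 'image') ? 'multimodal' : 'deepseek'
}
```

A caller could branch on the result: run `deepseekAgent.generate` directly for `'deepseek'`, and fall back to the `generateText` call above for `'multimodal'`.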

Pattern 2: AI SDK as a deepseek-kit Subagent

In this pattern, the deepseek-kit agent serves as the primary orchestrator, calling AI SDK models when multimodal capabilities are needed.

Scenario: Image Understanding Tool

Wrap AI SDK's multimodal model as a deepseek-kit tool for the DeepSeek agent to call on demand:

import { openai } from '@ai-sdk/openai'
import { generateText } from 'ai'
import { createAgent, createModel, tool } from 'deepseek-kit'
import { z } from 'zod'

const deepseekModel = createModel({ model: 'deepseek-v4-flash' })

const analyzeImageTool = tool({
  name: 'analyzeImage',
  description: 'Analyze image content and return a detailed text description',
  schema: z.object({
    imageUrl: z.string().describe('URL of the image'),
    question: z.string().describe('Question about the image'),
  }),
  execute: async (input) => {
    const result = await generateText({
      model: openai('gpt-4o'),
      messages: [
        {
          role: 'user',
          content: [
            { type: 'text', text: input.question },
            { type: 'image', image: input.imageUrl },
          ],
        },
      ],
    })
    return result.text
  },
})

// searchTool is assumed to be a deepseek-kit tool defined elsewhere with tool().
const agent = createAgent({
  model: deepseekModel,
  tools: [analyzeImageTool, searchTool],
  system: 'You are an assistant. When users send images, use the analyzeImage tool to understand the image content, then answer questions based on the analysis.',
})

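// imageUrl is assumed to be defined by the caller. The agent only receives text,
// so the URL must appear in the prompt for it to pass along to analyzeImage.
const result = await agent.generate({
  prompt: `What architectural style is the building in this image? Please provide a detailed analysis with historical context.\n\nImage URL: ${imageUrl}`,
})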

Scenario: Multi-Model Collaboration

DeepSeek serves as the primary orchestrator, dynamically selecting models based on task type:

import { anthropic } from '@ai-sdk/anthropic'
import { openai } from '@ai-sdk/openai'
import { generateText } from 'ai'
import { createAgent, createModel, tool } from 'deepseek-kit'
import { z } from 'zod'

const deepseekModel = createModel({ model: 'deepseek-v4-flash' })

const visionTool = tool({
  name: 'analyzeImage',
  description: 'Analyze images using a vision model',
  schema: z.object({
    imageUrl: z.string(),
    question: z.string(),
  }),
  execute: async (input) => {
    const result = await generateText({
      model: openai('gpt-4o'),
      messages: [{
        role: 'user',
        content: [
          { type: 'text', text: input.question },
          { type: 'image', image: input.imageUrl },
        ],
      }],
    })
    return result.text
  },
})

const longContextTool = tool({
  name: 'analyzeLongDocument',
  description: 'Analyze ultra-long documents with 200K token context support',
  schema: z.object({
    document: z.string(),
    question: z.string(),
  }),
  execute: async (input) => {
    const result = await generateText({
      model: anthropic('claude-sonnet-4-20250514'),
      prompt: `Analyze the following document and answer the question:\n\n${input.document}\n\nQuestion: ${input.question}`,
    })
    return result.text
  },
})

// searchTool is assumed to be a deepseek-kit tool defined elsewhere with tool().
const agent = createAgent({
  model: deepseekModel,
  tools: [visionTool, longContextTool, searchTool],
  system: 'You are an intelligent orchestration assistant. Select the appropriate tool based on task type: use analyzeImage for image analysis, analyzeLongDocument for long document analysis, and handle other tasks directly.',
})
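The long-document tool above advertises a 200K-token window, so it can be useful to reject oversized input before paying for a call that will fail. The sketch below uses a rough characters-per-token heuristic, which is an assumption and not a real tokenizer; `estimateTokens`, `fitsLongContext`, and the reserved output budget are hypothetical.

```typescript
// Rough heuristic, stated as an assumption: about 4 characters per token for English text.
// Real tokenizers differ by model and language.
const CHARS_PER_TOKEN = 4

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// 200K comes from the tool description above; part of it is reserved for the answer.
function fitsLongContext(
  document: string,
  question: string,
  reservedForOutput = 8_000,
): boolean {
  const budget = 200_000 - reservedForOutput
  return estimateTokens(document) + estimateTokens(question) <= budget
}
```

Inside `execute`, a failed check could return a short message asking the orchestrator to split the document, which keeps the failure readable for the DeepSeek agent.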

Streaming Integration

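When a deepseek-kit agent serves as a subagent, its internal stream events don't propagate to AI SDK's stream. To display subagent progress in the UI, consume the subagent's stream inside the tool and report progress yourself. The example below logs tool calls to the console; replace console.log with whatever channel feeds your UI: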

const researchTool = aiTool({
  description: 'Use DeepSeek for in-depth research',
  // AI SDK 5 and later name this field inputSchema (it was parameters in v4).
  inputSchema: z.object({ topic: z.string() }),
  execute: async ({ topic }) => {
    const agent = createAgent({ model: deepseekModel, tools: [searchTool] })

    let fullText = ''
    const stream = agent.stream({ prompt: topic })

    for await (const event of stream) {
      if (event.type === 'text-delta') {
        fullText += event.textDelta
      }
      if (event.type === 'tool-call') {
        console.log(`[DeepSeek] Calling tool: ${event.toolCalls.map(t => t.function.name).join(', ')}`)
      }
    }

    return fullText
  },
})
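To hand progress to UI code instead of the console, the loop can take a callback. The sketch below mirrors the event shapes used above (`text-delta` with `textDelta`, `tool-call` with `toolCalls`). The `StreamEvent` type is an assumption inferred from that example, not a published deepseek-kit type, and `collectWithProgress` and `fakeStream` are hypothetical.

```typescript
// Event shapes inferred from the example above; treat them as assumptions.
type StreamEvent =
  | { type: 'text-delta'; textDelta: string }
  | { type: 'tool-call'; toolCalls: { function: { name: string } }[] }

type Progress =
  | { kind: 'text'; chars: number }
  | { kind: 'tools'; names: string[] }

// Drain the stream, report progress through the callback, and return the full text.
async function collectWithProgress(
  stream: AsyncIterable<StreamEvent>,
  onProgress: (update: Progress) => void,
): Promise<string> {
  let fullText = ''
  for await (const event of stream) {
    if (event.type === 'text-delta') {
      fullText += event.textDelta
      onProgress({ kind: 'text', chars: fullText.length })
    } else {
      onProgress({ kind: 'tools', names: event.toolCalls.map((t) => t.function.name) })
    }
  }
  return fullText
}

// Stand-in stream for trying the helper without a network call.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: 'tool-call', toolCalls: [{ function: { name: 'search' } }] }
  yield { type: 'text-delta', textDelta: 'Hello' }
  yield { type: 'text-delta', textDelta: ', world' }
}
```

In the tool's `execute`, `collectWithProgress(agent.stream({ prompt: topic }), onProgress)` would replace the hand-written loop, with `onProgress` forwarding updates to a websocket, a server-sent event, or any other channel the UI listens on.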

Considerations

  • Context Isolation: deepseek-kit subagents have their own isolated context window and don't inherit the AI SDK main agent's conversation history. To pass context, construct it explicitly in the tool's execute function.
  • Latency Stacking: Subagent execution time adds to the main agent's total latency. For simple text tasks, consider calling DeepSeek directly rather than routing through AI SDK.
  • Error Propagation: Errors in subagents are returned to the main agent as tool execution failures and don't directly interrupt the main flow.
  • API Keys: Make sure both DEEPSEEK_API_KEY and the API key for each AI SDK provider you use (e.g., OPENAI_API_KEY, or ANTHROPIC_API_KEY for the Anthropic provider) are configured.
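The latency and error-propagation points above can be handled together by wrapping the subagent call. This is a minimal sketch, assuming the wrapped function resolves to text in the way `agent.generate(...)` followed by `.text` does in the earlier examples; `runGuarded` and its result shape are hypothetical.

```typescript
type Guarded =
  | { ok: true; text: string }
  | { ok: false; error: string }

// Race the subagent call against a timer and convert any failure into a value,
// so the tool can return a readable message to the main agent.
async function runGuarded(
  call: () => Promise<string>,
  timeoutMs: number,
): Promise<Guarded> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`subagent timed out after ${timeoutMs} ms`)),
      timeoutMs,
    )
  })
  try {
    return { ok: true, text: await Promise.race([call(), timeout]) }
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer)
  }
}
```

The timer only stops waiting; it does not cancel the underlying request, which keeps running until it settles. Returning the error as a value rather than throwing lets the tool decide what the main agent sees.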