Streaming
Streaming delivers agent responses incrementally instead of waiting for the full generation, which makes long texts and tool calling scenarios feel far more responsive.
Large language models can take several seconds, or even tens of seconds, to generate long text. With a traditional blocking call, users stare at a loading animation until the complete response arrives. With streaming, they see the model's output almost in real time. deepseek-kit exposes streaming through agent.stream(), which emits several event types, including text deltas, reasoning deltas, and tool calls.
Basic Usage
Use agent.stream() instead of agent.generate() to get streaming output. stream() returns an async iterator that you can consume with for await...of.
The text-delta event carries a small text increment that you can directly append to the UI for a typewriter-like display effect.
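A minimal sketch of that loop. The agent is replaced here by a hypothetical `fakeStream()` generator, and the event shape (a `type` discriminator plus the `textDelta` field) is an assumption based on the event and field names used on this page. With a real agent you would iterate over the value returned by agent.stream() instead.

```typescript
// Assumed event shape: a `type` discriminator plus the `textDelta` field named on this page.
type TextDeltaEvent = { type: "text-delta"; textDelta: string };

// Hypothetical stand-in for agent.stream(): yields a few text deltas.
async function* fakeStream(): AsyncGenerator<TextDeltaEvent> {
  for (const chunk of ["Hello", ", ", "world"]) {
    yield { type: "text-delta", textDelta: chunk };
  }
}

// Append each delta as it arrives; a UI would re-render `text` after every append.
async function collectText(stream: AsyncIterable<TextDeltaEvent>): Promise<string> {
  let text = "";
  for await (const event of stream) {
    if (event.type === "text-delta") {
      text += event.textDelta;
    }
  }
  return text;
}

collectText(fakeStream()).then((text) => console.log(text)); // prints "Hello, world"
```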
Stream Event Types
deepseek-kit defines five stream event types, covering all key points in the agent's execution process:
text-delta — Text Delta
Each time the model generates a small piece of text, a text-delta event is triggered. This is the most commonly used event type, and it is what you use to display the model's text output in real time.
reasoning-delta — Reasoning Delta
When the model has thinking mode enabled, the reasoning process is pushed as reasoning-delta events. You can use them to display the model's "thinking process".
reasoning-delta is only available when the model returns the reasoning_content field, which requires model and configuration support for thinking mode.
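One way to use these events is to keep reasoning and answer text in separate buffers so the UI can style them differently. The sketch below assumes a `type` discriminator on each event and uses a hypothetical `fakeThinkingStream()` in place of agent.stream(); only the `reasoningDelta` and `textDelta` field names come from this page.

```typescript
// Assumed event shapes, based on the field names used on this page.
type DeltaEvent =
  | { type: "reasoning-delta"; reasoningDelta: string }
  | { type: "text-delta"; textDelta: string };

interface Transcript {
  reasoning: string; // the model's thinking process
  answer: string; // the user-facing reply
}

// Route each delta into its own buffer; the order of arrival does not matter.
async function splitReasoning(stream: AsyncIterable<DeltaEvent>): Promise<Transcript> {
  const out: Transcript = { reasoning: "", answer: "" };
  for await (const event of stream) {
    if (event.type === "reasoning-delta") {
      out.reasoning += event.reasoningDelta;
    } else {
      out.answer += event.textDelta;
    }
  }
  return out;
}

// Hypothetical stand-in for agent.stream() with thinking mode enabled.
async function* fakeThinkingStream(): AsyncGenerator<DeltaEvent> {
  yield { type: "reasoning-delta", reasoningDelta: "2 + 2 " };
  yield { type: "reasoning-delta", reasoningDelta: "is 4." };
  yield { type: "text-delta", textDelta: "The answer is 4." };
}
```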
tool-call — Tool Call
When the agent calls a tool, the tool-call event is triggered after the tool execution completes. It carries information about all tool calls made in that step.
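As a sketch of what a handler might do with this event, the function below turns each tool call into a log line. The field names (`step`, `toolCalls`, `id`, `function.name`, `function.arguments`) follow the API reference on this page; the `type` discriminator and the idea that `arguments` is a JSON string (as in OpenAI-style payloads) are assumptions, and `getWeather` is a hypothetical tool.

```typescript
// Assumed shapes: field names come from this page; `arguments` being a
// JSON-encoded string is an assumption borrowed from OpenAI-style tool calls.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}
interface ToolCallEvent {
  type: "tool-call";
  step: number;
  toolCalls: ToolCall[];
}

// Turn one tool-call event into human-readable log lines, one per call.
function describeToolCalls(event: ToolCallEvent): string[] {
  return event.toolCalls.map((call) => {
    let args: unknown;
    try {
      args = JSON.parse(call.function.arguments);
    } catch {
      args = call.function.arguments; // keep the raw string if it is not valid JSON
    }
    return `step ${event.step}: ${call.function.name}(${JSON.stringify(args)}) [${call.id}]`;
  });
}

// Hypothetical event, as it might arrive from the stream.
const exampleEvent: ToolCallEvent = {
  type: "tool-call",
  step: 1,
  toolCalls: [{ id: "call_1", function: { name: "getWeather", arguments: '{"city":"Paris"}' } }],
};
console.log(describeToolCalls(exampleEvent));
```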
step — Step Start
Triggered at the beginning of each new step, carrying the step number. You can use it to track the agent's execution progress.
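Progress tracking can be as small as a callback that fires on every step event. This sketch assumes a `type` discriminator and uses a hypothetical `fakeTwoStepStream()` in place of agent.stream(); only the `step` field name comes from this page.

```typescript
// Assumed event shapes; only the `step` and `textDelta` field names come from this page.
type AgentEvent =
  | { type: "step"; step: number }
  | { type: "text-delta"; textDelta: string };

// Call `onStep` whenever a new step starts and return how many steps ran.
async function trackSteps(
  stream: AsyncIterable<AgentEvent>,
  onStep: (step: number) => void,
): Promise<number> {
  let count = 0;
  for await (const event of stream) {
    if (event.type === "step") {
      count += 1;
      onStep(event.step);
    }
  }
  return count;
}

// Hypothetical stand-in for agent.stream() covering two steps.
async function* fakeTwoStepStream(): AsyncGenerator<AgentEvent> {
  yield { type: "step", step: 1 };
  yield { type: "step", step: 2 };
  yield { type: "text-delta", textDelta: "done" };
}
```

A UI could pass a callback such as `(step) => setProgressLabel(step)`, where `setProgressLabel` is whatever state setter the application already has.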
finish — Generation Complete
Triggered after the agent completes all steps, carrying the final complete text and token usage.
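The sketch below waits for the finish event and returns its `text` and `usage`. The usage field names (`prompt_tokens`, `completion_tokens`, `total_tokens`) follow the API reference on this page; the `type` discriminator and the `fakeFinishedStream()` helper are assumptions made for illustration.

```typescript
// Usage field names follow this page's API reference; the rest is assumed.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}
type AgentEvent =
  | { type: "text-delta"; textDelta: string }
  | { type: "finish"; text: string; usage: Usage };

// Resolve with the final text and usage, or undefined if no finish event arrives.
async function waitForFinish(
  stream: AsyncIterable<AgentEvent>,
): Promise<{ text: string; usage: Usage } | undefined> {
  for await (const event of stream) {
    if (event.type === "finish") {
      return { text: event.text, usage: event.usage };
    }
  }
  return undefined; // the stream ended without a finish event
}

// Hypothetical stand-in for agent.stream().
async function* fakeFinishedStream(): AsyncGenerator<AgentEvent> {
  yield { type: "text-delta", textDelta: "Hi" };
  yield {
    type: "finish",
    text: "Hi",
    usage: { prompt_tokens: 5, completion_tokens: 1, total_tokens: 6 },
  };
}
```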
Complete Event Handling
A complete streaming experience handles all five event types in a single loop.
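A sketch of such a loop, written as a `switch` over an assumed `type` discriminator. The field names follow this page; the discriminator, the log format, and the `fakeRun()` generator that stands in for agent.stream() are illustrative assumptions, and `getWeather` is a hypothetical tool.

```typescript
// Assumed event shapes: field names come from this page, `type` is an assumption.
interface ToolCall { id: string; function: { name: string; arguments: string } }
interface Usage { prompt_tokens: number; completion_tokens: number; total_tokens: number }
type StreamEvent =
  | { type: "text-delta"; textDelta: string }
  | { type: "reasoning-delta"; reasoningDelta: string }
  | { type: "tool-call"; step: number; toolCalls: ToolCall[] }
  | { type: "step"; step: number }
  | { type: "finish"; text: string; usage: Usage };

// Map one event to a log line; the `never` check fails to compile if a variant is missed.
function renderEvent(event: StreamEvent): string {
  switch (event.type) {
    case "step":
      return `[step ${event.step}]`;
    case "reasoning-delta":
      return `[thinking] ${event.reasoningDelta}`;
    case "tool-call":
      return `[tools] ${event.toolCalls.map((c) => c.function.name).join(", ")}`;
    case "text-delta":
      return `[text] ${event.textDelta}`;
    case "finish":
      return `[done] ${event.usage.total_tokens} tokens`;
    default: {
      const unreachable: never = event;
      throw new Error(`unknown event: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Consume the whole stream and collect one log line per event.
async function logStream(stream: AsyncIterable<StreamEvent>): Promise<string[]> {
  const lines: string[] = [];
  for await (const event of stream) {
    lines.push(renderEvent(event));
  }
  return lines;
}

// Hypothetical two-step run standing in for agent.stream().
async function* fakeRun(): AsyncGenerator<StreamEvent> {
  yield { type: "step", step: 1 };
  yield {
    type: "tool-call",
    step: 1,
    toolCalls: [{ id: "call_1", function: { name: "getWeather", arguments: "{}" } }],
  };
  yield { type: "step", step: 2 };
  yield { type: "text-delta", textDelta: "Sunny." };
  yield {
    type: "finish",
    text: "Sunny.",
    usage: { prompt_tokens: 12, completion_tokens: 3, total_tokens: 15 },
  };
}
```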
Streaming with Tool Calls
When the agent is equipped with tools, streaming output automatically includes tool call events. The flow works as follows:
- Step 1: The model reasons and decides to call a tool, triggering step → tool-call events
- Step 2: The model generates a final response based on the tool result, triggering step → text-delta → finish events
You can switch display states in the UI based on event types, showing "Querying..." during tool calls and displaying results incrementally during text deltas.
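That state switching can be modelled as a small pure reducer, which keeps the UI logic testable without a live stream. The `UiState` shape and status names below are hypothetical; the event field names follow this page, and the `type` discriminator is an assumption.

```typescript
// Assumed event shapes (a subset of the fields named on this page).
type StreamEvent =
  | { type: "step"; step: number }
  | { type: "tool-call"; step: number; toolCalls: { function: { name: string } }[] }
  | { type: "text-delta"; textDelta: string }
  | { type: "finish"; text: string };

// Hypothetical UI state: a status, a status label, and the text shown so far.
interface UiState {
  status: "idle" | "querying" | "answering" | "done";
  label: string;
  text: string;
}

const initialState: UiState = { status: "idle", label: "", text: "" };

// Compute the next UI state from the current state and one stream event.
function reduce(state: UiState, event: StreamEvent): UiState {
  switch (event.type) {
    case "tool-call":
      return { ...state, status: "querying", label: "Querying..." };
    case "text-delta":
      return { status: "answering", label: "", text: state.text + event.textDelta };
    case "finish":
      return { status: "done", label: "", text: event.text };
    case "step":
      return state; // step boundaries do not change what is displayed here
  }
}
```

Inside a `for await` loop, each incoming event would be applied with `state = reduce(state, event)` followed by a re-render.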
Streaming with Structured Output
When the agent is configured with the output parameter, the structured output generation step is also pushed via stream events. During that step, text-delta events carry JSON text deltas. The final parsed result must be obtained via agent.generate().
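To illustrate why the parsed result is not available mid-stream, the sketch below accumulates JSON text deltas and records when the buffer first becomes parseable. It is a preview technique only and is not a replacement for agent.generate(). The `type` discriminator, the `fakeJsonStream()` generator, and the sample payload are assumptions.

```typescript
// Assumed event shape: `type` discriminator plus the `textDelta` field named on this page.
type TextDeltaEvent = { type: "text-delta"; textDelta: string };

// Parse the buffer if it is complete JSON; return undefined while it is still partial.
function tryParseJson(buffer: string): unknown {
  try {
    return JSON.parse(buffer);
  } catch {
    return undefined;
  }
}

// Accumulate JSON text deltas and report after how many chunks the buffer first parsed.
async function previewStructured(
  stream: AsyncIterable<TextDeltaEvent>,
): Promise<{ raw: string; firstParsedAt: number | null }> {
  let raw = "";
  let seen = 0;
  let firstParsedAt: number | null = null;
  for await (const event of stream) {
    raw += event.textDelta;
    seen += 1;
    if (firstParsedAt === null && tryParseJson(raw) !== undefined) {
      firstParsedAt = seen;
    }
  }
  return { raw, firstParsedAt };
}

// Hypothetical stand-in for the structured output step of agent.stream().
async function* fakeJsonStream(): AsyncGenerator<TextDeltaEvent> {
  for (const chunk of ['{"city":', '"Paris",', '"tempC":21}']) {
    yield { type: "text-delta", textDelta: chunk };
  }
}
```

With the sample chunks above, the buffer only parses once the last chunk has arrived, which is why a UI would typically show the raw text (or nothing) until the step completes.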
Aborting Streaming
You can abort an in-progress stream with an AbortSignal. With a 5-second timeout, for example, the stream is aborted after 5 seconds, and any content already received remains available.
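The sketch below shows the general pattern with the standard `AbortController`. How agent.stream() accepts the signal, and whether it throws or ends quietly on abort, is not shown on this page, so the `fakeStream(signal)` generator here is a hypothetical stand-in that simply stops yielding. To keep the example deterministic it aborts after a chunk count rather than after a timer.

```typescript
// Assumed event shape: `type` discriminator plus the `textDelta` field named on this page.
type TextDeltaEvent = { type: "text-delta"; textDelta: string };

// Hypothetical stand-in for agent.stream(): stops quietly once the signal is aborted.
async function* fakeStream(signal: AbortSignal): AsyncGenerator<TextDeltaEvent> {
  for (const chunk of ["one ", "two ", "three ", "four"]) {
    if (signal.aborted) return;
    yield { type: "text-delta", textDelta: chunk };
  }
}

// Read from the stream and abort it once `maxChunks` chunks have been received.
async function readUntilAborted(maxChunks: number): Promise<string> {
  const controller = new AbortController();
  // A time limit would instead use: setTimeout(() => controller.abort(), 5000)
  let received = "";
  let count = 0;
  try {
    for await (const event of fakeStream(controller.signal)) {
      received += event.textDelta;
      count += 1;
      if (count >= maxChunks) controller.abort();
    }
  } catch (err) {
    // A real stream may reject on abort; rethrow anything that is not our own abort.
    if (!controller.signal.aborted) throw err;
  }
  return received; // content received before the abort is still available
}
```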
Streaming vs Non-Streaming
| Feature | agent.generate() | agent.stream() |
|---|---|---|
| Return method | Waits for complete result and returns at once | Pushes events incrementally in real time |
| Return type | Promise<GenerateTextResult> | AsyncGenerator<StreamEvent> |
| Use case | Background tasks, batch processing | Chat interfaces, real-time interaction |
| Tool calls | Handled automatically, returns final result | Real-time notification via tool-call events |
| Token statistics | Returned in the result | Returned in the finish event |
Recommendations:
- Need real-time feedback: use stream(), e.g., chat applications and interactive tools
- Only need the final result: use generate(), e.g., batch processing and API backends
API Reference
agent.stream() Parameters
prompt.
StreamEvent Types
| Event type | Description |
|---|---|
| text-delta | textDelta field, carrying a small piece of generated text. |
| reasoning-delta | reasoningDelta field, carrying a fragment of the model's thinking process (available when thinking mode is enabled). |
| tool-call | step (step number) and toolCalls (tool call array) fields. |
| step | step (step number) field. |
| finish | text (complete text) and usage (token usage) fields. |
TextDeltaStreamEvent
ReasoningDeltaStreamEvent
ToolCallStreamEvent
Each entry in toolCalls includes id, function.name, and function.arguments.
StepStreamEvent
FinishStreamEvent
usage includes prompt_tokens, completion_tokens, and total_tokens.
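Read together, the reference above suggests type declarations along the following lines. This is a reconstruction for orientation, not the library's actual source: the type and field names come from this page, while the `type` discriminator, the primitive types of each field, and the `isFinishEvent` helper are assumptions.

```typescript
// Reconstructed from the field names on this page; exact types are assumptions.
interface TextDeltaStreamEvent {
  type: "text-delta";
  textDelta: string;
}
interface ReasoningDeltaStreamEvent {
  type: "reasoning-delta";
  reasoningDelta: string;
}
interface ToolCallStreamEvent {
  type: "tool-call";
  step: number;
  toolCalls: { id: string; function: { name: string; arguments: string } }[];
}
interface StepStreamEvent {
  type: "step";
  step: number;
}
interface FinishStreamEvent {
  type: "finish";
  text: string;
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

type StreamEvent =
  | TextDeltaStreamEvent
  | ReasoningDeltaStreamEvent
  | ToolCallStreamEvent
  | StepStreamEvent
  | FinishStreamEvent;

// Hypothetical helper: a type guard that narrows a StreamEvent to the finish variant.
function isFinishEvent(event: StreamEvent): event is FinishStreamEvent {
  return event.type === "finish";
}
```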
