Agent applications today rarely rely on a single language model or tool. Providers, models, and utilities change quickly, sometimes every few weeks, which makes it hard to keep an integration stack both stable and adaptable. To address this, Moonshot AI has released Kosong, an abstraction layer designed for large language model (LLM) agent applications. Kosong unifies message formats, manages asynchronous tool execution, and supports interchangeable chat providers, so development teams can build agents without binding their business logic to any one API. The same library underpins Moonshot’s Kimi command-line interface (CLI).
Introducing Kosong: A Unified LLM Middleware
Kosong is a Python library that sits between your agent’s core logic and the LLM providers it talks to. It offers example implementations that use the Kimi chat provider together with the high-level helper functions generate and step.
The library’s public API is deliberately small. Developers mainly interact with kosong.generate, kosong.step, and the result types GenerateResult and StepResult. The supporting modules (chat_provider, message, tooling, and tooling.simple) hide provider-specific streaming protocols, token usage tracking, and tool invocation behind a consistent interface, which simplifies integration across different backends.
Core Components: ChatProvider and Message Abstractions
The fundamental integration point within Kosong is the ChatProvider abstraction. For instance, Moonshot’s implementation for the Kimi provider resides in kosong.chat_provider.kimi. To initialize a Kimi provider, you supply parameters such as base_url, api_key, and the model identifier (e.g., kimi-k2-turbo-preview). This provider instance is then passed to kosong.generate or kosong.step along with a system prompt, a set of tools, and the conversation history.
Messages are encapsulated by the Message class from kosong.message. Each message includes a role designation (like "user") and a content field, which can be either a simple string or a list of content segments. This design supports richer, multimodal message payloads while maintaining simplicity for straightforward chat interactions.
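The "string or list of segments" content model can be illustrated with a small standalone sketch. It does not import Kosong: ChatMessage, TextPart, ImagePart, and the text() helper are hypothetical stand-ins chosen for illustration, not the library's actual classes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPart:
    """Hypothetical text segment of a multi-part message."""
    text: str


@dataclass
class ImagePart:
    """Hypothetical image segment, referenced by URL."""
    url: str


ContentPart = Union[TextPart, ImagePart]


@dataclass
class ChatMessage:
    """Stand-in for a message: a role plus string or multi-part content."""
    role: str
    content: Union[str, list[ContentPart]]

    def text(self) -> str:
        """Return the textual content, skipping non-text parts."""
        if isinstance(self.content, str):
            return self.content
        return "".join(p.text for p in self.content if isinstance(p, TextPart))


# Plain chat: content is just a string.
simple = ChatMessage(role="user", content="Who are you?")

# Multimodal chat: content is a list of typed segments.
rich = ChatMessage(
    role="user",
    content=[TextPart("Describe this image: "), ImagePart("https://example.com/cat.png")],
)

print(simple.text())  # Who are you?
print(rich.text())    # Describe this image: (followed by a trailing space)
```

Accepting both shapes keeps the common case short while leaving room for richer payloads.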
Additionally, Kosong exposes a streaming component called StreamedMessagePart through kosong.chat_provider. During message generation, providers emit these parts incrementally, which Kosong then aggregates into a complete Message. To facilitate monitoring and cost management, an optional TokenUsage structure tracks token consumption in a provider-agnostic manner and attaches this data to result objects.
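As a rough illustration of the aggregation described above, the sketch below merges streamed fragments into one message body and carries token counts in a provider-neutral structure. TextChunk, Usage, and merge_chunks are hypothetical names for this example. Kosong's own StreamedMessagePart and TokenUsage types may be shaped differently.

```python
from dataclasses import dataclass


@dataclass
class TextChunk:
    """Hypothetical streamed fragment of assistant text."""
    text: str


@dataclass
class Usage:
    """Hypothetical provider-agnostic token counts."""
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total(self) -> int:
        """Total tokens consumed by the request and the response."""
        return self.input_tokens + self.output_tokens


def merge_chunks(chunks: list[TextChunk]) -> str:
    """Concatenate streamed fragments, in arrival order, into one message body."""
    return "".join(chunk.text for chunk in chunks)


# Fragments as a provider might emit them during generation.
chunks = [TextChunk("Hel"), TextChunk("lo, "), TextChunk("world")]
message_text = merge_chunks(chunks)

# Illustrative numbers only, not real measurements.
usage = Usage(input_tokens=12, output_tokens=3)

print(message_text)  # Hello, world
print(usage.total)   # 15
```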
Extensible Tooling Framework: Toolset and SimpleToolset
Agent applications frequently require integration with external tools such as web search, code execution environments, or database queries. Kosong addresses this need through its tooling module. Tools are defined by subclassing CallableTool2 and specifying parameters using Pydantic models. For example, an AddTool might define attributes like name, description, and params, and implement a __call__ method that returns a ToolOk result, conforming to the ToolReturnType interface.
These tools are aggregated within a SimpleToolset from kosong.tooling.simple. Developers can instantiate a SimpleToolset and add tools using the += operator. This toolset is then supplied to kosong.step (not generate), where it manages the resolution of tool invocations from the model and routes them asynchronously to the appropriate functions. The step function orchestrates the entire process for a single conversational turn.
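The registration-and-dispatch pattern can be sketched without Kosong as follows. This is an assumption-laden stand-in: MiniToolset, Tool, and AddParams are hypothetical, and standard-library dataclasses take the place of the Pydantic models that Kosong uses for parameters. The point is only to show how += registration and asynchronous routing by tool name might fit together.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class AddParams:
    """Typed parameters for the add tool (Kosong uses Pydantic models here)."""
    a: int
    b: int


@dataclass
class Tool:
    """Hypothetical tool record: a name, a params type, and an async handler."""
    name: str
    params_type: type
    handler: Callable[[Any], Awaitable[str]]


class MiniToolset:
    """Hypothetical toolset that registers tools with += and dispatches by name."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def __iadd__(self, tool: Tool) -> "MiniToolset":
        # Supports `toolset += tool`; returning self keeps the same object bound.
        self._tools[tool.name] = tool
        return self

    async def call(self, name: str, raw_arguments: str) -> str:
        """Parse JSON arguments into the tool's params type and await the handler."""
        tool = self._tools.get(name)
        if tool is None:
            return f"error: unknown tool {name!r}"
        params = tool.params_type(**json.loads(raw_arguments))
        return await tool.handler(params)


async def add(params: AddParams) -> str:
    """Add two integers and return the sum as text."""
    return str(params.a + params.b)


toolset = MiniToolset()
toolset += Tool(name="add", params_type=AddParams, handler=add)

# The model would normally supply the tool name and a JSON argument string.
print(asyncio.run(toolset.call("add", '{"a": 2, "b": 3}')))  # 5
```

Returning an error string for an unknown tool, rather than raising, is a design choice made for this sketch so the failure can be fed back to the model as an ordinary tool result.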
Single-Turn Chat Completion with generate
The generate function is the primary interface for straightforward chat completions. It takes a chat_provider, a system_prompt, a list of tools (which may be empty), and a history of Message objects. For example, the Kimi provider can be used to send a single user message as the conversation history with no tools enabled.
generate supports streaming output via an on_message_part callback. A typical callback prints each StreamedMessagePart as it arrives, giving the user real-time feedback. Once streaming concludes, generate returns a GenerateResult containing the fully merged assistant message and, optionally, token usage statistics. Applications can therefore display incremental responses while still receiving a clean, final message object for further processing.
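The callback-plus-final-result flow can be approximated with a standalone sketch. It uses a fake provider instead of a real Kimi backend, so it runs without network access or an API key. Provider, EchoProvider, generate_text, and on_part are hypothetical names, and plain strings stand in for Message and StreamedMessagePart objects.

```python
import asyncio
from typing import AsyncIterator, Callable, Optional, Protocol


class Provider(Protocol):
    """Hypothetical provider interface: stream text fragments for a prompt."""

    def stream(self, system_prompt: str, history: list[str]) -> AsyncIterator[str]: ...


class EchoProvider:
    """Fake backend that streams the last user message back word by word."""

    async def stream(self, system_prompt: str, history: list[str]) -> AsyncIterator[str]:
        for word in history[-1].split():
            yield word + " "


async def generate_text(
    provider: Provider,
    system_prompt: str,
    history: list[str],
    on_part: Optional[Callable[[str], None]] = None,
) -> str:
    """Forward each streamed part to the callback, then return the merged text."""
    parts: list[str] = []
    async for part in provider.stream(system_prompt, history):
        if on_part is not None:
            on_part(part)  # incremental display hook
        parts.append(part)
    return "".join(parts).strip()


seen: list[str] = []
final = asyncio.run(
    generate_text(EchoProvider(), "You are helpful.", ["Who are you?"], on_part=seen.append)
)

print(seen)   # ['Who ', 'are ', 'you? ']
print(final)  # Who are you?
```

Because generate_text depends only on the Provider protocol, a different backend could be swapped in without touching the calling code, which is the property the ChatProvider abstraction is meant to provide.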
Advanced Agent Interactions Using step
For agents that leverage external tools, Kosong offers the step function. In practice, kosong.step is called with a Kimi provider, a SimpleToolset containing tools like AddTool, a system prompt, and a user message history that instructs the model to invoke a specific tool (e.g., the add tool).
The step function returns a StepResult object. Developers can access result.message to retrieve the assistant’s response and await result.tool_results() to gather outputs from all tool invocations during that step. Kosong internally manages the orchestration of tool calls, including parsing arguments into Pydantic models and converting outputs into ToolReturnType results, eliminating the need for developers to build custom dispatch loops for each provider.
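One way to picture a step result whose tool outputs are awaited separately is the sketch below. It is not Kosong's implementation: TurnResult, ToolCall, run_step, and the TOOLS registry are hypothetical, and the assistant message and tool call are hardcoded where a real model would produce them.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Callable, Coroutine


@dataclass
class ToolCall:
    """Hypothetical tool request emitted by the model during one step."""
    name: str
    args: dict


@dataclass
class TurnResult:
    """Stand-in for a step result: the reply plus pending tool tasks."""
    message: str
    tasks: list = field(default_factory=list)

    async def tool_results(self) -> list[str]:
        """Wait for every tool call started during this step, in call order."""
        return list(await asyncio.gather(*self.tasks))


async def add(a: int, b: int) -> str:
    """Add two integers and return the sum as text."""
    return str(a + b)


# Hypothetical registry mapping tool names to async functions.
TOOLS: dict[str, Callable[..., Coroutine[Any, Any, str]]] = {"add": add}


async def run_step(message: str, calls: list[ToolCall]) -> TurnResult:
    """Start each requested tool concurrently and return without waiting on them."""
    tasks = [asyncio.create_task(TOOLS[c.name](**c.args)) for c in calls]
    return TurnResult(message=message, tasks=tasks)


async def main() -> tuple[str, list[str]]:
    # A real model would produce the message and tool calls; they are hardcoded here.
    result = await run_step("Calling add.", [ToolCall("add", {"a": 2, "b": 3})])
    return result.message, await result.tool_results()


print(asyncio.run(main()))  # ('Calling add.', ['5'])
```

Scheduling the tools as tasks lets the caller read the assistant message immediately and collect tool outputs only when they are needed.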
Demo Agent and Integration with Kimi CLI
Kosong includes a built-in demo agent that can be run locally. After setting environment variables such as KIMI_BASE_URL and KIMI_API_KEY, users can launch the demo with a command like uv run python -m kosong kimi --with-bash. The demo uses the Kimi chat provider and runs a terminal-based agent that can invoke tools, including shell commands when the bash option is enabled.
Summary of Key Features
- Kosong acts as a comprehensive LLM abstraction layer from Moonshot AI, standardizing message formats, asynchronous tool management, and pluggable chat providers for agent development.
- The library offers a streamlined API with two primary functions: generate for simple chat completions and step for agents that utilize external tools, supported by abstractions like ChatProvider, Message, Tool, and Toolset.
- Currently, Kosong includes a Kimi chat provider tailored to the Moonshot AI API, with an extensible ChatProvider interface that allows teams to integrate additional backends without modifying agent logic.
- Tool definitions employ Pydantic models for parameter validation and ToolReturnType for standardized results, enabling Kosong to handle argument parsing and tool orchestration internally within step.
- Kosong serves as the foundational LLM abstraction layer for Moonshot’s Kimi CLI, which focuses on delivering a command-line agent experience compatible with Kimi and other backends.
Final Thoughts
Kosong decouples agent logic from the underlying LLM and tool infrastructure while keeping a small, approachable API surface. Because its design centers on a few core abstractions (ChatProvider, Message, and Toolset), agent systems built on it can adopt new models and tools without extensive rewrites. For development teams aiming to build scalable, maintainable agent frameworks, Kosong offers a practical and adaptable infrastructure layer.

