calute.calute#
Core Calute module implementing the main agent orchestration framework.
This module contains the primary Calute class that provides sophisticated agent management capabilities including:
- Multi-agent orchestration and switching
- Function/tool execution with retry logic
- Memory system integration
- Streaming response handling
- Prompt template management
- Asynchronous and synchronous execution modes
The module also includes prompt templating utilities and helper functions for formatting and parsing agent responses.
Key components:
- Calute: Main orchestration class for managing AI agents
- PromptSection: Enumeration for structured prompt sections
- PromptTemplate: Configurable template for structuring agent prompts
Typical usage example:

    from calute import Calute, Agent
    from calute.llms import OpenAILLM

    # Initialize Calute with an LLM
    llm = OpenAILLM(api_key="your-api-key")
    calute = Calute(llm=llm, enable_memory=True)

    # Create and register an agent
    agent = Agent(
        id="assistant",
        instructions="You are a helpful assistant.",
        model="gpt-4"
    )
    calute.register_agent(agent)

    # Generate a response (streaming)
    for chunk in calute.run(prompt="Hello!"):
        if chunk.content:
            print(chunk.content, end="")

    # Generate a response (non-streaming)
    result = calute.run(prompt="Hello!", stream=False)
    print(result.content)
- class calute.calute.Calute(llm: calute.llms.base.BaseLLM | None = None, template: calute.core.prompt_template.PromptTemplate | None = None, enable_memory: bool = False, memory_config: dict[str, Any] | None = None, auto_add_memory_tools: bool = True, runtime_features: calute.runtime.features.RuntimeFeaturesConfig | None = None)[source]#
Bases: object
Main Calute orchestration class for managing AI agents.
This is the primary interface for interacting with the Calute framework. It manages agent registration, prompt formatting, function execution, memory integration, and response generation with support for both streaming and non-streaming modes.
The Calute class provides:
- Agent registration and orchestration across multiple agents
- Automatic function/tool calling with retry logic
- Memory system integration for context persistence
- Both synchronous (run) and asynchronous (create_response) interfaces
- Streaming and non-streaming response modes
- Prompt template customization
- Agent switching based on capabilities or error recovery
- SEP#
Class variable defining the separator used for indentation.
- Type
ClassVar[str]
- llm_client#
The BaseLLM instance for generating completions.
- template#
PromptTemplate for structuring agent prompts.
- orchestrator#
AgentOrchestrator for managing multi-agent workflows.
- executor#
FunctionExecutor for handling tool/function calls.
- enable_memory#
Whether the memory system is enabled.
- auto_add_memory_tools#
Whether to auto-add memory tools to agents.
- memory_store#
MemoryStore instance for persistent context (if enabled).
Example
Basic usage with streaming:
>>> from calute import Calute, Agent
>>> from calute.llms import OpenAILLM
>>> llm = OpenAILLM(api_key="your-key")
>>> calute = Calute(llm=llm, enable_memory=True)
>>> agent = Agent(id="helper", instructions="You are helpful.")
>>> calute.register_agent(agent)
>>> for chunk in calute.run(prompt="Hello!"):
...     print(chunk.content, end="")
Non-streaming usage:
>>> result = calute.run(prompt="Hello!", stream=False)
>>> print(result.content)
Async usage:
>>> async for chunk in await calute.create_response(prompt="Hi"):
...     print(chunk.content, end="")
- REINVOKE_FOLLOWUP_INSTRUCTION: ClassVar[str] = "Use the function results above to continue the task. If the results already answer the user's request, respond to the user directly. Only call another function if the returned data is missing something necessary or the user explicitly asked for a fresh lookup. If a web/search tool already ran, do not claim you cannot browse or access current information. Treat search-result snippets as leads rather than verified facts; say that the search results indicate or suggest something unless you opened a source page and confirmed it."#
- SEP: ClassVar[str] = ' '#
- async athread_run(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None) tuple[calute.core.streamer_buffer.StreamerBuffer, _asyncio.Task][source]#
Async version of thread_run that creates a task instead of thread.
Returns immediately with a StreamerBuffer and the task handle.
- Parameters
Same as create_response, except that there is no stream parameter.
- Returns
Tuple of (StreamerBuffer, Task), where:
StreamerBuffer: Buffer that will receive all streaming chunks
Task: The asyncio task handle for monitoring/awaiting
- Return type
tuple[StreamerBuffer, asyncio.Task]
Example
>>> buffer, task = await calute.athread_run(prompt="Hello")
>>> async for chunk in buffer.astream():
...     print(chunk.content, end="")
>>> await task
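The task-plus-buffer pattern described above can be illustrated without Calute itself. The following is a minimal standalone sketch, not the library's implementation: AsyncBufferSketch, athread_run_sketch, and fake_chunks are hypothetical names, and an asyncio.Queue stands in for StreamerBuffer.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the stream


class AsyncBufferSketch:
    """Hypothetical stand-in for StreamerBuffer, backed by an asyncio.Queue."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def put(self, chunk) -> None:
        await self._queue.put(chunk)

    async def close(self) -> None:
        await self._queue.put(_DONE)

    async def astream(self):
        # Yield chunks until the producer signals completion.
        while True:
            item = await self._queue.get()
            if item is _DONE:
                return
            yield item


async def athread_run_sketch(produce):
    """Start `produce` as a task and return (buffer, task) immediately."""
    buffer = AsyncBufferSketch()

    async def worker() -> None:
        try:
            async for chunk in produce():
                await buffer.put(chunk)
        finally:
            await buffer.close()  # always unblock the consumer

    task = asyncio.create_task(worker())
    return buffer, task


async def fake_chunks():
    # Hypothetical producer standing in for an LLM stream.
    for piece in ("Hel", "lo"):
        await asyncio.sleep(0)
        yield piece


async def demo():
    buffer, task = await athread_run_sketch(fake_chunks)
    received = [chunk async for chunk in buffer.astream()]
    await task
    return received
```

Closing the buffer in a finally block means the consumer loop ends even if the producer raises; the exception then surfaces when the task is awaited.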
- bootstrap(extra_context: str = '')[source]#
Run the bootstrap sequence for this Calute instance.
Performs environment detection, git info loading, CALUTE.md loading, tool registration, and system prompt building.
- Parameters
extra_context – Additional context for the system prompt.
- Returns
A BootstrapResult.
- create_query_engine(model: str = '', system_prompt: str = '', **config_kwargs: Any)[source]#
Create a fully-wired QueryEngine from this Calute instance.
The QueryEngine provides a multi-turn conversation interface with budget control, auto-compaction, cost tracking, and history logging.
- Parameters
model – Model name override. If empty, uses the current agent’s model.
system_prompt – System prompt override.
**config_kwargs – Additional QueryEngineConfig kwargs.
- Returns
A QueryEngine instance.
Example
>>> engine = calute.create_query_engine(model="gpt-4o")
>>> result = engine.submit("Hello!")
>>> print(result.output)
- async create_response(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, stream: bool = True, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None, _runtime_loop_detector: calute.runtime.loop_detection.LoopDetector | None = None, _runtime_turn_state: calute.calute._RuntimeTurnState | None = None) calute.types.function_execution_types.ResponseResult | collections.abc.AsyncIterator[calute.types.function_execution_types.StreamChunk | calute.types.function_execution_types.FunctionDetection | calute.types.function_execution_types.FunctionCallsExtracted | calute.types.function_execution_types.FunctionExecutionStart | calute.types.function_execution_types.FunctionExecutionComplete | calute.types.function_execution_types.AgentSwitch | calute.types.function_execution_types.Completion | calute.types.function_execution_types.ReinvokeSignal][source]#
Create response with enhanced function calling and agent switching.
Main async method for generating agent responses with support for streaming, function execution, and multi-agent orchestration.
- Parameters
prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
stream – Whether to stream the response.
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional buffer for streaming chunks.
- Returns
ResponseResult if stream=False, AsyncIterator[StreamingResponseType] if stream=True.
Example
>>> response = await calute.create_response(
...     prompt="Calculate 5 + 3",
...     stream=False
... )
>>> print(response.content)
- create_runtime_session(prompt: str = '') RuntimeSession[source]#
Create a new RuntimeSession capturing current context.
- Parameters
prompt – Initial prompt or session description.
- Returns
A RuntimeSession instance.
- create_subagent_manager(max_concurrent: int = 5, max_depth: int = 5)[source]#
Create a SubAgentManager with the streaming agent loop wired up.
The manager uses a thread pool for concurrent sub-agent execution, supports git worktree isolation, named agents, and inbox queues.
- Parameters
max_concurrent – Maximum number of concurrent sub-agents.
max_depth – Maximum nesting depth.
- Returns
A SubAgentManager instance.
Example
>>> mgr = calute.create_subagent_manager()
>>> from calute.agents import get_agent_definition
>>> task = mgr.spawn(
...     prompt="Review this code",
...     config={"model": "gpt-4o"},
...     system_prompt="You are helpful.",
...     agent_def=get_agent_definition("reviewer"),
...     name="code-review",
... )
>>> mgr.wait(task.id)
>>> print(task.result)
- create_ui(target_agent: Agent = None)[source]#
Create and launch a user interface for interactive agent chat.
Launches a graphical user interface for interacting with the Calute agent system. The UI provides a chat-like interface for sending prompts and viewing streaming responses.
- Parameters
target_agent – Optional specific Agent to use in the UI. If None, uses the current active agent from the orchestrator.
- Returns
The launched application instance from the UI module.
Example
>>> calute = Calute(llm=my_llm)
>>> calute.register_agent(my_agent)
>>> calute.create_ui()  # Launches interactive UI
Note
Requires the UI module dependencies to be installed.
- static extract_from_markdown(format: str, string: str) str | None | dict[source]#
Extract content from a markdown code block with specific format.
Searches for a markdown code block with the specified format identifier and extracts its content. If the content is valid JSON, it is parsed and returned as a dictionary.
- Parameters
format – The markdown format identifier to search for (e.g., ‘json’, ‘python’, ‘xml’). This is matched after the opening triple backticks.
string – The string containing the markdown block to search.
- Returns
dict: If the block content is valid JSON
str: If the block content is not valid JSON
None: If no matching format block is found
- Return type
str | None | dict
Example
>>> content = '```json\n{"key": "value"}\n```'
>>> Calute.extract_from_markdown("json", content)
{'key': 'value'}
>>> content = '```python\nprint("hello")\n```'
>>> Calute.extract_from_markdown("python", content)
'print("hello")'
Note
Only the first matching block is extracted if multiple exist.
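The behaviour described above (first matching block, JSON parsed when possible) can be approximated in a few lines. This is a minimal standalone sketch, not Calute's actual implementation; the name extract_block_sketch and the regular expression are assumptions made for illustration.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example nests cleanly in docs


def extract_block_sketch(fmt: str, text: str):
    """Return the first fenced block tagged `fmt`.

    dict if the body is a JSON object, str otherwise, None if no block matches.
    """
    pattern = re.escape(FENCE + fmt) + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)  # re.search stops at the first match
    if match is None:
        return None
    body = match.group(1).strip()
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return body
    # Only JSON objects are returned parsed; other JSON values stay as text.
    return parsed if isinstance(parsed, dict) else body


json_text = FENCE + 'json\n{"key": "value"}\n' + FENCE
python_text = FENCE + 'python\nprint("hello")\n' + FENCE
```

Treating non-object JSON (for example a bare number) as plain text is a design choice of this sketch, made so the return type stays within str | None | dict.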
- static extract_md_block(input_string: str) list[tuple[str, str]][source]#
Extract Markdown code blocks from a string.
This function finds all Markdown code blocks (delimited by triple backticks) in the input string and returns their content along with the optional language specifier (if present).
- Parameters
input_string – The input string containing one or more Markdown code blocks.
- Returns
List of tuples, where each tuple contains:
The language specifier (e.g., 'xml', 'python', or '' if not specified).
The content of the code block.
Example
>>> text = '''```xml
... <web_research>
... <arguments>
... {"query": "quantum computing breakthroughs 2024"}
... </arguments>
... </web_research>
... ```'''
>>> Calute.extract_md_block(text)
[('xml', '<web_research>\n<arguments>\n{"query": "quantum computing breakthroughs 2024"}\n</arguments>\n</web_research>')]
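Collecting every block together with its language specifier is a small variation on single-block extraction. The sketch below is a standalone approximation under stated assumptions: extract_md_blocks_sketch is a hypothetical name, and stripping surrounding whitespace from each block body is a choice of this sketch rather than documented library behaviour.

```python
import re

FENCE = "`" * 3  # built programmatically so the example nests cleanly in docs


def extract_md_blocks_sketch(text: str) -> list[tuple[str, str]]:
    """Return (language, content) for every fenced block in `text`."""
    # Group 1: optional language tag; group 2: block body (non-greedy).
    pattern = re.escape(FENCE) + r"(\w*)\n(.*?)" + re.escape(FENCE)
    return [
        (lang, body.strip())
        for lang, body in re.findall(pattern, text, re.DOTALL)
    ]


sample = (
    FENCE + "xml\n<a>1</a>\n" + FENCE
    + "\nsome prose\n"
    + FENCE + "\nplain\n" + FENCE
)
blocks = extract_md_blocks_sketch(sample)
```

A block without a language tag yields an empty string as its first tuple element, matching the return description above.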
- static filter_thoughts(response: str, tag: str = 'think') str[source]#
Remove all thinking tags from the response.
- Parameters
response – The response containing tagged thoughts.
tag – The XML tag name to remove (default: ‘think’).
- Returns
The response with all tagged sections removed.
Example
>>> response = "Answer <think>reasoning</think> continues"
>>> Calute.filter_thoughts(response)
'Answer continues'
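Removing tagged sections is typically a single regular-expression substitution. The following standalone sketch shows one way to do it; filter_thoughts_sketch is a hypothetical name, and the whitespace handling here is an assumption that may differ from the library's.

```python
import re


def filter_thoughts_sketch(response: str, tag: str = "think") -> str:
    """Remove every <tag>...</tag> section, including multi-line ones."""
    safe_tag = re.escape(tag)
    pattern = rf"<{safe_tag}>.*?</{safe_tag}>"
    # re.DOTALL lets "." span newlines inside a tagged section.
    return re.sub(pattern, "", response, flags=re.DOTALL)


single = filter_thoughts_sketch("<think>plan</think>Final answer")
multi = filter_thoughts_sketch("A<think>x\ny</think>B<think>z</think>C")
custom = filter_thoughts_sketch("<note>hidden</note>shown", tag="note")
```

This sketch leaves the whitespace around a removed section untouched, so text on either side of a tag is joined exactly as it appeared in the input.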
- format_chat_history(messages: MessagesHistory) str[source]#
Format chat messages with improved readability.
- Parameters
messages – MessagesHistory object containing chat messages.
- Returns
Formatted string representation of the chat history.
- format_context_variables(variables: dict[str, Any]) str[source]#
Format context variables with type information.
- Parameters
variables – Dictionary of context variables to format.
- Returns
Formatted string representation of variables with types and values.
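As an illustration of formatting variables with type information, the sketch below renders one line per variable. The output layout and the name format_vars_sketch are assumptions for illustration only; the format Calute actually produces is not shown in this reference.

```python
from typing import Any


def format_vars_sketch(variables: dict[str, Any]) -> str:
    """Render each variable as '- name (type): repr(value)'."""
    if not variables:
        return ""
    return "\n".join(
        f"- {name} ({type(value).__name__}): {value!r}"
        for name, value in variables.items()
    )


formatted = format_vars_sketch({"user": "Ada", "retries": 3})
```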
- format_function_parameters(parameters: dict) str[source]#
Format function parameters in a clear, structured way.
- Parameters
parameters – Dictionary of parameter definitions from function schema.
- Returns
Formatted string representation of parameters with types, requirements, and descriptions.
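A parameter formatter of this kind might look like the sketch below. It assumes the parameters dictionary follows the common JSON Schema layout ("properties" plus "required"), which this reference does not confirm; format_params_sketch and the output layout are likewise hypothetical.

```python
def format_params_sketch(parameters: dict) -> str:
    """Render one line per parameter: name, type, requirement, description."""
    properties = parameters.get("properties", {})
    required = set(parameters.get("required", []))
    lines = []
    for name, spec in properties.items():
        requirement = "required" if name in required else "optional"
        line = f"- {name} ({spec.get('type', 'any')}, {requirement})"
        if spec.get("description"):
            line += f": {spec['description']}"
        lines.append(line)
    return "\n".join(lines)


schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Search text"},
        "limit": {"type": "integer"},
    },
    "required": ["query"],
}
rendered = format_params_sketch(schema)
```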
- format_prompt(prompt: str | None) str[source]#
Format a prompt string.
- Parameters
prompt – The prompt to format.
- Returns
The formatted prompt or empty string if None.
- generate_function_section(functions: list[Union[Callable[[], Union[str, calute.types.agent_types.Agent, dict]], calute.types.agent_types.AgentBaseFn]]) str[source]#
Generate detailed function documentation for agent prompts.
Creates comprehensive documentation for available functions, organized by category if applicable, with full parameter schemas and examples.
- Parameters
functions – List of AgentFunction objects to document.
- Returns
Formatted string containing complete function documentation.
- get_execution_registry()[source]#
Get a populated ExecutionRegistry with all Calute tools.
- Returns
An ExecutionRegistry with all tools registered.
- static get_thoughts(response: str, tag: str = 'think') str | None[source]#
Extract thinking/reasoning content from tagged sections.
- Parameters
response – The response containing tagged thoughts.
tag – The XML tag name to extract (default: ‘think’).
- Returns
The content within the tags, or None if not found.
Example
>>> response = "Some text <think>Internal reasoning</think> more text"
>>> Calute.get_thoughts(response)
'Internal reasoning'
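Extracting the tagged content is the counterpart of removing it. A minimal standalone sketch, assuming only the first tagged section is returned (the reference does not say how multiple sections are handled); get_thoughts_sketch is a hypothetical name:

```python
import re
from typing import Optional


def get_thoughts_sketch(response: str, tag: str = "think") -> Optional[str]:
    """Return the text inside the first <tag>...</tag> pair, or None."""
    safe_tag = re.escape(tag)
    match = re.search(rf"<{safe_tag}>(.*?)</{safe_tag}>", response, re.DOTALL)
    return match.group(1).strip() if match else None


found = get_thoughts_sketch("Some text <think>Internal reasoning</think> more text")
missing = get_thoughts_sketch("No tags here")
```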
- get_tool_executor()[source]#
Get a tool executor callable for the streaming agent loop.
- Returns
A callable (tool_name: str, tool_input: dict) -> str.
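A callable with this shape can be built as a closure over a tool table. The sketch below is illustrative only: make_tool_executor, the dictionary-based registry, and the error strings are assumptions, not Calute's implementation.

```python
import json
from collections.abc import Callable


def make_tool_executor(
    tools: dict[str, Callable[..., object]],
) -> Callable[[str, dict], str]:
    """Build a (tool_name, tool_input) -> str callable over `tools`."""

    def execute(tool_name: str, tool_input: dict) -> str:
        tool = tools.get(tool_name)
        if tool is None:
            return f"Error: unknown tool '{tool_name}'"
        try:
            result = tool(**tool_input)
        except Exception as exc:  # report failures as text for the model
            return f"Error: {exc}"
        # Always hand back a string, serializing structured results as JSON.
        return result if isinstance(result, str) else json.dumps(result)

    return execute


executor = make_tool_executor({"add": lambda a, b: a + b})
```

Returning errors as strings rather than raising keeps the agent loop running and lets the model see what went wrong.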
- manage_messages(agent: calute.types.agent_types.Agent | None, prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, include_memory: bool = True, use_instructed_prompt: bool = False, use_chain_of_thought: bool = False, require_reflection: bool = False) MessagesHistory[source]#
Generate a structured list of ChatMessage objects for the LLM.
Constructs a properly formatted message history including system prompts, rules, functions, examples, context, and user messages based on the agent’s configuration and provided parameters.
- Parameters
agent – The agent to generate messages for.
prompt – Optional user prompt to include.
context_variables – Optional context variables to include.
messages – Optional existing message history.
include_memory – Whether to include memory context.
use_instructed_prompt – Whether to use instructed prompt format.
use_chain_of_thought – Whether to add chain-of-thought instructions.
require_reflection – Whether to request reflection in response.
- Returns
MessagesHistory containing the formatted messages.
Example
>>> messages = calute.manage_messages(
...     agent=my_agent,
...     prompt="Hello",
...     use_chain_of_thought=True
... )
- register_agent(agent: Agent) None[source]#
Register an agent with the orchestrator.
Registers the agent for multi-agent orchestration, optionally adding memory tools if memory is enabled and auto_add_memory_tools is True. The first registered agent becomes the default active agent.
- Parameters
agent – The Agent instance to register for orchestration.
- Returns
None
- Side Effects
Adds memory tools to the agent if memory is enabled and auto_add_memory_tools is True
Updates the orchestrator's agent registry
Sets the agent as current if it is the first registered
Example
>>> agent = Agent(id="helper", instructions="Be helpful")
>>> calute.register_agent(agent)
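The three side effects listed above can be modelled with a small registry. This is a hedged sketch with hypothetical names (AgentSketch, OrchestratorSketch, memory_tools); it mirrors the documented behaviour but is not the library's code.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSketch:
    """Hypothetical stand-in for Agent."""
    id: str
    functions: list = field(default_factory=list)


class OrchestratorSketch:
    """Hypothetical registry: first registered agent becomes the default."""

    def __init__(self, memory_tools=()) -> None:
        self.agents: dict[str, AgentSketch] = {}
        self.current_agent_id = None
        self.memory_tools = list(memory_tools)

    def register(self, agent: AgentSketch) -> None:
        # 1. Add memory tools the agent does not already have.
        for tool in self.memory_tools:
            if tool not in agent.functions:
                agent.functions.append(tool)
        # 2. Update the registry.
        self.agents[agent.id] = agent
        # 3. Make the first registered agent the active one.
        if self.current_agent_id is None:
            self.current_agent_id = agent.id


orchestrator = OrchestratorSketch(memory_tools=["save_memory"])
orchestrator.register(AgentSketch(id="helper"))
orchestrator.register(AgentSketch(id="reviewer"))
```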
- run(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, stream: bool = True, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None) calute.types.function_execution_types.ResponseResult | collections.abc.Generator[calute.types.function_execution_types.StreamChunk | calute.types.function_execution_types.FunctionDetection | calute.types.function_execution_types.FunctionCallsExtracted | calute.types.function_execution_types.FunctionExecutionStart | calute.types.function_execution_types.FunctionExecutionComplete | calute.types.function_execution_types.AgentSwitch | calute.types.function_execution_types.Completion | calute.types.function_execution_types.ReinvokeSignal, None, None][source]#
Synchronous wrapper for create_response.
Main synchronous interface for generating agent responses. Handles both streaming and non-streaming modes, with full support for function calling and agent orchestration.
- Parameters
prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
stream – Whether to stream the response (True) or return complete (False).
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional buffer for streaming chunks.
- Returns
Generator[StreamingResponseType] if stream=True, ResponseResult if stream=False.
Example
>>> for chunk in calute.run(prompt="Hello", stream=True):
...     if chunk.content:
...         print(chunk.content, end="")

>>> result = calute.run(prompt="Hello", stream=False)
>>> print(result.content)
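One common way to expose an async stream through a synchronous generator is to drive the async iterator on a private event loop. The sketch below shows that general technique under stated assumptions; iterate_sync and fake_stream are hypothetical names, and Calute's own wrapper may work differently.

```python
import asyncio
from collections.abc import AsyncIterator, Iterator

_END = object()  # sentinel returned when the async iterator is exhausted


def iterate_sync(agen: AsyncIterator[str]) -> Iterator[str]:
    """Yield items from an async iterator inside synchronous code."""
    loop = asyncio.new_event_loop()

    async def next_or_end():
        try:
            return await agen.__anext__()
        except StopAsyncIteration:
            return _END

    try:
        while True:
            # Run the loop just long enough to produce the next item.
            item = loop.run_until_complete(next_or_end())
            if item is _END:
                break
            yield item
    finally:
        loop.close()


async def fake_stream():
    # Hypothetical async producer standing in for a streamed response.
    for piece in ("Hel", "lo"):
        await asyncio.sleep(0)
        yield piece


pieces = list(iterate_sync(fake_stream()))
```

This approach must not be called from code that is already running inside an event loop; in that situation the async interface should be awaited directly.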
- thread_run(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None) tuple[calute.core.streamer_buffer.StreamerBuffer, threading.Thread][source]#
Run Calute in a background thread with automatic buffer creation.
Returns immediately with a StreamerBuffer and the thread handle. You can start consuming from the buffer while generation is happening. This is useful for non-blocking execution in synchronous contexts.
- Parameters
prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional pre-created buffer (creates new if None).
- Returns
Tuple of (StreamerBuffer, Thread), where:
StreamerBuffer: Buffer that will receive all streaming chunks
Thread: The background thread handle for monitoring/joining
- Return type
tuple[StreamerBuffer, threading.Thread]
Example
>>> buffer, thread = calute.thread_run(prompt="Hello")
>>> for chunk in buffer.stream():
...     print(chunk.content, end="")
>>> thread.join()

>>> result = buffer.get_result(timeout=30)
>>> print(result.content)
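The non-blocking behaviour comes from a producer thread feeding a thread-safe buffer that the caller drains. The following standalone sketch shows the pattern with a queue.Queue in place of StreamerBuffer; BufferSketch and thread_run_sketch are hypothetical names, not Calute's implementation.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


class BufferSketch:
    """Hypothetical stand-in for StreamerBuffer, backed by queue.Queue."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()

    def put(self, chunk) -> None:
        self._queue.put(chunk)

    def close(self) -> None:
        self._queue.put(_DONE)

    def stream(self):
        # Block until each chunk arrives; stop at the sentinel.
        while True:
            item = self._queue.get()
            if item is _DONE:
                return
            yield item


def thread_run_sketch(produce, buffer=None):
    """Run `produce` in a background thread and return (buffer, thread)."""
    if buffer is None:
        buffer = BufferSketch()

    def worker() -> None:
        try:
            for chunk in produce():
                buffer.put(chunk)
        finally:
            buffer.close()  # always unblock the consumer

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return buffer, thread


buffer, thread = thread_run_sketch(lambda: iter(["Hel", "lo"]))
chunks = list(buffer.stream())
thread.join()
```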
- class calute.calute.PromptSection(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
Enumeration of different sections in a structured prompt.
This enum defines the standard sections that can be included in a structured prompt template, allowing for consistent prompt organization across different agents and use cases.
- SYSTEM#
System-level instructions and configuration.
- PERSONA#
Agent personality and role definition.
- RULES#
Behavioral rules and constraints for the agent.
- FUNCTIONS#
Available function/tool definitions.
- TOOLS#
Tool usage instructions and format specifications.
- EXAMPLES#
Example interactions for few-shot learning.
- CONTEXT#
Contextual information and variables.
- HISTORY#
Conversation history from previous turns.
- PROMPT#
The actual user prompt/query.
Example
>>> template = PromptTemplate(
...     sections={PromptSection.SYSTEM: "INSTRUCTIONS:"},
...     section_order=[PromptSection.SYSTEM, PromptSection.PROMPT]
... )
- CONTEXT = 'context'#
- EXAMPLES = 'examples'#
- FUNCTIONS = 'functions'#
- HISTORY = 'history'#
- PERSONA = 'persona'#
- PROMPT = 'prompt'#
- RULES = 'rules'#
- SYSTEM = 'system'#
- TOOLS = 'tools'#
- class calute.calute.PromptTemplate(sections: dict[calute.core.prompt_template.PromptSection, str] | None = None, section_order: list[calute.core.prompt_template.PromptSection] | None = None)[source]#
Bases: object
Configurable template for structuring agent prompts.
This class provides a flexible way to structure prompts with different sections that can be customized or reordered based on requirements.
- sections#
Dictionary mapping PromptSection enums to their header strings.
- Type
dict[calute.core.prompt_template.PromptSection, str] | None
- section_order#
List defining the order in which sections appear in the prompt.
- Type
list[calute.core.prompt_template.PromptSection] | None
Example
>>> template = PromptTemplate(
...     sections={PromptSection.SYSTEM: "INSTRUCTIONS:"},
...     section_order=[PromptSection.SYSTEM, PromptSection.PROMPT]
... )
- section_order: list[calute.core.prompt_template.PromptSection] | None = None#
- sections: dict[calute.core.prompt_template.PromptSection, str] | None = None#
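To show how section headers and section order might combine into a final prompt, here is a minimal standalone sketch. Section, TemplateSketch, and render are hypothetical names, and the rendering rules (skip empty sections, fall back to an upper-cased header) are assumptions rather than documented Calute behaviour.

```python
from dataclasses import dataclass, field
from enum import Enum


class Section(Enum):
    """Hypothetical subset of PromptSection."""
    SYSTEM = "system"
    RULES = "rules"
    PROMPT = "prompt"


@dataclass
class TemplateSketch:
    """Hypothetical stand-in for PromptTemplate."""
    sections: dict[Section, str] = field(default_factory=dict)
    section_order: list[Section] = field(default_factory=lambda: list(Section))


def render(template: TemplateSketch, content: dict[Section, str]) -> str:
    """Emit 'HEADER\\nbody' for each ordered section that has content."""
    parts = []
    for section in template.section_order:
        body = content.get(section)
        if not body:
            continue  # sections without content are skipped
        header = template.sections.get(section, section.value.upper() + ":")
        parts.append(f"{header}\n{body}")
    return "\n\n".join(parts)


template = TemplateSketch(
    sections={Section.SYSTEM: "INSTRUCTIONS:"},
    section_order=[Section.SYSTEM, Section.PROMPT],
)
prompt_text = render(
    template,
    {Section.SYSTEM: "Be brief.", Section.PROMPT: "Hello!", Section.RULES: "unused"},
)
```

Because RULES is absent from section_order, its content is left out even though it was supplied.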