calute.calute

Core Calute module implementing the main agent orchestration framework.

This module contains the primary Calute class that provides sophisticated agent management capabilities including:

- Multi-agent orchestration and switching
- Function/tool execution with retry logic
- Memory system integration
- Streaming response handling
- Prompt template management
- Asynchronous and synchronous execution modes

The module also includes prompt templating utilities and helper functions for formatting and parsing agent responses.

Key components:

- Calute: Main orchestration class for managing AI agents
- PromptSection: Enumeration for structured prompt sections
- PromptTemplate: Configurable template for structuring agent prompts

Typical usage example:

from calute import Calute, Agent
from calute.llms import OpenAILLM

# Initialize Calute with an LLM
llm = OpenAILLM(api_key="your-api-key")
calute = Calute(llm=llm, enable_memory=True)

# Create and register an agent
agent = Agent(
    id="assistant",
    instructions="You are a helpful assistant.",
    model="gpt-4",
)
calute.register_agent(agent)

# Generate a response (streaming)
for chunk in calute.run(prompt="Hello!"):
    if chunk.content:
        print(chunk.content, end="")

# Generate a response (non-streaming)
result = calute.run(prompt="Hello!", stream=False)
print(result.content)

class calute.calute.Calute(llm: calute.llms.base.BaseLLM | None = None, template: calute.core.prompt_template.PromptTemplate | None = None, enable_memory: bool = False, memory_config: dict[str, Any] | None = None, auto_add_memory_tools: bool = True, runtime_features: calute.runtime.features.RuntimeFeaturesConfig | None = None)[source]#

Bases: object

Main Calute orchestration class for managing AI agents.

This is the primary interface for interacting with the Calute framework. It manages agent registration, prompt formatting, function execution, memory integration, and response generation with support for both streaming and non-streaming modes.

The Calute class provides:

- Agent registration and orchestration across multiple agents
- Automatic function/tool calling with retry logic
- Memory system integration for context persistence
- Both synchronous (run) and asynchronous (create_response) interfaces
- Streaming and non-streaming response modes
- Prompt template customization
- Agent switching based on capabilities or error recovery

SEP#

Class variable defining the separator used for indentation.

Type: ClassVar[str]

llm_client#: The BaseLLM instance for generating completions.

template#: PromptTemplate for structuring agent prompts.

orchestrator#: AgentOrchestrator for managing multi-agent workflows.

executor#: FunctionExecutor for handling tool/function calls.

enable_memory#: Whether the memory system is enabled.

auto_add_memory_tools#: Whether to auto-add memory tools to agents.

memory_store#: MemoryStore instance for persistent context (if enabled).

Example

Basic usage with streaming:

>>> from calute import Calute, Agent
>>> from calute.llms import OpenAILLM
>>> llm = OpenAILLM(api_key="your-key")
>>> calute = Calute(llm=llm, enable_memory=True)
>>> agent = Agent(id="helper", instructions="You are helpful.")
>>> calute.register_agent(agent)
>>> for chunk in calute.run(prompt="Hello!"):
...     print(chunk.content, end="")

Non-streaming usage:

>>> result = calute.run(prompt="Hello!", stream=False)
>>> print(result.content)

Async usage:

>>> async for chunk in await calute.create_response(prompt="Hi"):
...     print(chunk.content, end="")

REINVOKE_FOLLOWUP_INSTRUCTION: ClassVar[str] = "Use the function results above to continue the task. If the results already answer the user's request, respond to the user directly. Only call another function if the returned data is missing something necessary or the user explicitly asked for a fresh lookup. If a web/search tool already ran, do not claim you cannot browse or access current information. Treat search-result snippets as leads rather than verified facts; say that the search results indicate or suggest something unless you opened a source page and confirmed it."#

SEP: ClassVar[str] = ' '#

async athread_run(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None) → tuple[calute.core.streamer_buffer.StreamerBuffer, _asyncio.Task][source]#

Async version of thread_run that creates a task instead of thread.

Returns immediately with a StreamerBuffer and the task handle.

Parameters

Same as create_response, except that there is no stream parameter.

Returns

A tuple of (StreamerBuffer, Task), where:

StreamerBuffer – Buffer that will receive all streaming chunks.
Task – The asyncio task handle for monitoring/awaiting.

Return type

tuple[StreamerBuffer, asyncio.Task]

Example

>>> buffer, task = await calute.athread_run(prompt="Hello")
>>> async for chunk in buffer.astream():
...     print(chunk.content, end="")
>>> await task
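
The "return a buffer and a task immediately" pattern can be illustrated with the standard library alone. The following is a minimal sketch under stated assumptions, not Calute's implementation: `toy_athread_run`, the `_DONE` sentinel, and the use of a plain `asyncio.Queue` in place of a StreamerBuffer are all hypothetical.

```python
import asyncio

_DONE = object()  # hypothetical sentinel marking the end of the stream


async def toy_athread_run(words):
    """Start a producer task and return (queue, task) without waiting for it."""
    chunks = asyncio.Queue()

    async def producer():
        try:
            for word in words:
                await chunks.put(word)
                await asyncio.sleep(0)  # yield control, as a real LLM call would
        finally:
            await chunks.put(_DONE)  # always signal completion to the consumer

    task = asyncio.create_task(producer())
    return chunks, task


async def main():
    chunks, task = await toy_athread_run(["a", "b", "c"])
    received = []
    # Consume chunks while the producer task is still running.
    while (item := await chunks.get()) is not _DONE:
        received.append(item)
    await task  # surface any exception raised inside the producer
    return received


received = asyncio.run(main())
print(received)
```

Awaiting the task after draining the queue mirrors the `await task` step in the example above: it is where an error raised in the background work would surface.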

bootstrap(extra_context: str = '')[source]#

Run the bootstrap sequence for this Calute instance.

Performs environment detection, git info loading, CALUTE.md loading, tool registration, and system prompt building.

Parameters: extra_context – Additional context for the system prompt.
Returns: A BootstrapResult.

create_query_engine(model: str = '', system_prompt: str = '', **config_kwargs: Any)[source]#

Create a fully-wired QueryEngine from this Calute instance.

The QueryEngine provides a multi-turn conversation interface with budget control, auto-compaction, cost tracking, and history logging.

Parameters

model – Model name override. If empty, uses the current agent’s model.
system_prompt – System prompt override.
**config_kwargs – Additional QueryEngineConfig kwargs.

Returns

A QueryEngine instance.

Example

>>> engine = calute.create_query_engine(model="gpt-4o")
>>> result = engine.submit("Hello!")
>>> print(result.output)

async create_response(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, stream: bool = True, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None, _runtime_loop_detector: calute.runtime.loop_detection.LoopDetector | None = None, _runtime_turn_state: calute.calute._RuntimeTurnState | None = None) → calute.types.function_execution_types.ResponseResult | collections.abc.AsyncIterator[calute.types.function_execution_types.StreamChunk | calute.types.function_execution_types.FunctionDetection | calute.types.function_execution_types.FunctionCallsExtracted | calute.types.function_execution_types.FunctionExecutionStart | calute.types.function_execution_types.FunctionExecutionComplete | calute.types.function_execution_types.AgentSwitch | calute.types.function_execution_types.Completion | calute.types.function_execution_types.ReinvokeSignal][source]#

Create response with enhanced function calling and agent switching.

Main async method for generating agent responses with support for streaming, function execution, and multi-agent orchestration.

Parameters

prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
stream – Whether to stream the response.
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional buffer for streaming chunks.

Returns

ResponseResult if stream=False, AsyncIterator[StreamingResponseType] if stream=True.

Example

>>> response = await calute.create_response(
...     prompt="Calculate 5 + 3",
...     stream=False
... )
>>> print(response.content)

create_runtime_session(prompt: str = '') → RuntimeSession[source]#

Create a new RuntimeSession capturing current context.

Parameters: prompt – Initial prompt or session description.
Returns: A RuntimeSession instance.

create_subagent_manager(max_concurrent: int = 5, max_depth: int = 5)[source]#

Create a SubAgentManager with the streaming agent loop wired up.

The manager uses a thread pool for concurrent sub-agent execution, supports git worktree isolation, named agents, and inbox queues.

Parameters

max_concurrent – Maximum number of concurrent sub-agents.
max_depth – Maximum nesting depth.

Returns

A SubAgentManager instance.

Example

>>> mgr = calute.create_subagent_manager()
>>> from calute.agents import get_agent_definition
>>> task = mgr.spawn(
...     prompt="Review this code",
...     config={"model": "gpt-4o"},
...     system_prompt="You are helpful.",
...     agent_def=get_agent_definition("reviewer"),
...     name="code-review",
... )
>>> mgr.wait(task.id)
>>> print(task.result)

create_ui(target_agent: Agent = None)[source]#

Create and launch a user interface for interactive agent chat.

Launches a graphical user interface for interacting with the Calute agent system. The UI provides a chat-like interface for sending prompts and viewing streaming responses.

Parameters: target_agent – Optional specific Agent to use in the UI. If None, uses the current active agent from the orchestrator.
Returns: The launched application instance from the UI module.

Example

>>> calute = Calute(llm=my_llm)
>>> calute.register_agent(my_agent)
>>> calute.create_ui()  # Launches interactive UI

Note

Requires the UI module dependencies to be installed.

static extract_from_markdown(format: str, string: str) → str | None | dict[source]#

Extract content from a markdown code block with specific format.

Searches for a markdown code block with the specified format identifier and extracts its content. If the content is valid JSON, it is parsed and returned as a dictionary.

Parameters

format – The markdown format identifier to search for (e.g., ‘json’, ‘python’, ‘xml’). This is matched after the opening triple backticks.
string – The string containing the markdown block to search.

Returns

dict: If the block content is valid JSON.
str: If the block content is not valid JSON.
None: If no matching format block is found.

Return type

str | None | dict

Example

>>> content = '```json\n{"key": "value"}\n```'
>>> Calute.extract_from_markdown("json", content)
{'key': 'value'}

>>> content = '```python\nprint("hello")\n```'
>>> Calute.extract_from_markdown("python", content)
'print("hello")'

Note

Only the first matching block is extracted if multiple exist.
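
The behaviour described above can be approximated with the standard library. This is a minimal sketch, not Calute's actual implementation; the helper name `extract_block` and the regular expression are assumptions made for illustration.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to avoid nesting fences here


def extract_block(fmt: str, text: str):
    """Return the first fenced block tagged `fmt`, parsed as JSON when possible."""
    pattern = re.escape(FENCE + fmt) + r"\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        return None  # no block with this format identifier
    body = match.group(1).strip()
    try:
        return json.loads(body)  # valid JSON: return the parsed value
    except json.JSONDecodeError:
        return body  # anything else: return the raw string


sample = FENCE + 'json\n{"key": "value"}\n' + FENCE
print(extract_block("json", sample))
```

Because `re.search` stops at the first match, later blocks with the same format identifier are ignored, which matches the note above.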

static extract_md_block(input_string: str) → list[tuple[str, str]][source]#

Extract Markdown code blocks from a string.

This function finds all Markdown code blocks (delimited by triple backticks) in the input string and returns their content along with the optional language specifier (if present).

Parameters

input_string – The input string containing one or more Markdown code blocks.

Returns

List of tuples, where each tuple contains:

The language specifier (e.g., ‘xml’, ‘python’, or ‘’ if not specified).
The content of the code block.

Example

>>> text = '''```xml
... <web_research>
...   <arguments>
...     {"query": "quantum computing breakthroughs 2024"}
...   </arguments>
... </web_research>
... ```'''
>>> Calute.extract_md_block(text)
[('xml', '<web_research>\n  <arguments>\n    {"query": "quantum computing breakthroughs 2024"}\n  </arguments>\n</web_research>')]
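
Collecting every fenced block, rather than only the first, can be sketched with `re.findall`. The function name `find_blocks` and the pattern are assumptions for illustration; the real method may treat whitespace and unusual language tags differently.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to avoid nesting fences here


def find_blocks(text: str) -> list[tuple[str, str]]:
    """Return (language, content) for every fenced block in `text`."""
    pattern = re.escape(FENCE) + r"(\w*)\n(.*?)" + re.escape(FENCE)
    return [
        (lang, body.strip())
        for lang, body in re.findall(pattern, text, re.DOTALL)
    ]


doc = (
    "Intro\n" + FENCE + "xml\n<a>1</a>\n" + FENCE + "\n"
    "More\n" + FENCE + "\nplain\n" + FENCE
)
blocks = find_blocks(doc)
print(blocks)
```

A block without a language tag yields an empty string as its first tuple element, in line with the return description above.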

static filter_thoughts(response: str, tag: str = 'think') → str[source]#

Remove all thinking tags from the response.

Parameters

response – The response containing tagged thoughts.
tag – The XML tag name to remove (default: ‘think’).

Returns

The response with all tagged sections removed.

Example

>>> response = "Answer <think>reasoning</think> continues"
>>> Calute.filter_thoughts(response)
'Answer continues'

format_chat_history(messages: MessagesHistory) → str[source]#

Format chat messages with improved readability.

Parameters: messages – MessagesHistory object containing chat messages.
Returns: Formatted string representation of the chat history.

format_context_variables(variables: dict[str, Any]) → str[source]#

Format context variables with type information.

Parameters: variables – Dictionary of context variables to format.
Returns: Formatted string representation of variables with types and values.

format_function_parameters(parameters: dict) → str[source]#

Format function parameters in a clear, structured way.

Parameters: parameters – Dictionary of parameter definitions from function schema.
Returns: Formatted string representation of parameters with types, requirements, and descriptions.

format_prompt(prompt: str | None) → str[source]#

Format a prompt string.

Parameters: prompt – The prompt to format.
Returns: The formatted prompt or empty string if None.

generate_function_section(functions: list[Union[Callable[[], Union[str, calute.types.agent_types.Agent, dict]], calute.types.agent_types.AgentBaseFn]]) → str[source]#

Generate detailed function documentation for agent prompts.

Creates comprehensive documentation for available functions, organized by category if applicable, with full parameter schemas and examples.

Parameters: functions – List of AgentFunction objects to document.
Returns: Formatted string containing complete function documentation.

get_execution_registry()[source]#

Get a populated ExecutionRegistry with all Calute tools.

Returns: An ExecutionRegistry with all tools registered.

static get_thoughts(response: str, tag: str = 'think') → str | None[source]#

Extract thinking/reasoning content from tagged sections.

Parameters

response – The response containing tagged thoughts.
tag – The XML tag name to extract (default: ‘think’).

Returns

The content within the tags, or None if not found.

Example

>>> response = "Some text <think>Internal reasoning</think> more text"
>>> Calute.get_thoughts(response)
'Internal reasoning'
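
Removing and extracting tagged sections are two sides of the same regular expression. The sketch below is an assumption about how such helpers could be written, not the library's code; `strip_tagged` and `first_tagged` are hypothetical names, and the whitespace clean-up is a design choice of this sketch.

```python
import re


def strip_tagged(text: str, tag: str = "think") -> str:
    """Remove every <tag>...</tag> section and tidy the leftover spacing."""
    pattern = rf"<{re.escape(tag)}>.*?</{re.escape(tag)}>"
    cleaned = re.sub(pattern, "", text, flags=re.DOTALL)
    # Collapse the double space left behind where a section was removed.
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


def first_tagged(text: str, tag: str = "think"):
    """Return the content of the first <tag>...</tag> section, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


reply = "Answer <think>reasoning</think> continues"
print(strip_tagged(reply))
print(first_tagged(reply))
```

The non-greedy `.*?` keeps each match confined to a single tagged section when a response contains several.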

get_tool_executor()[source]#

Get a tool executor callable for the streaming agent loop.

Returns: A callable (tool_name: str, tool_input: dict) -> str.

manage_messages(agent: calute.types.agent_types.Agent | None, prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, include_memory: bool = True, use_instructed_prompt: bool = False, use_chain_of_thought: bool = False, require_reflection: bool = False) → MessagesHistory[source]#

Generate a structured list of ChatMessage objects for the LLM.

Constructs a properly formatted message history including system prompts, rules, functions, examples, context, and user messages based on the agent’s configuration and provided parameters.

Parameters

agent – The agent to generate messages for.
prompt – Optional user prompt to include.
context_variables – Optional context variables to include.
messages – Optional existing message history.
include_memory – Whether to include memory context.
use_instructed_prompt – Whether to use instructed prompt format.
use_chain_of_thought – Whether to add chain-of-thought instructions.
require_reflection – Whether to request reflection in response.

Returns

MessagesHistory containing the formatted messages.

Example

>>> messages = calute.manage_messages(
...     agent=my_agent,
...     prompt="Hello",
...     use_chain_of_thought=True
... )

register_agent(agent: Agent) → None[source]#

Register an agent with the orchestrator.

Registers the agent for multi-agent orchestration, optionally adding memory tools if memory is enabled and auto_add_memory_tools is True. The first registered agent becomes the default active agent.

Parameters: agent – The Agent instance to register for orchestration.
Returns: None

Side Effects:

Adds memory tools to agent if memory is enabled
Updates orchestrator’s agent registry
Sets agent as current if it’s the first registered

Example

>>> agent = Agent(id="helper", instructions="Be helpful")
>>> calute.register_agent(agent)
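
The three side effects listed above can be modelled with a small registry. This is a toy sketch under stated assumptions: `ToyAgent`, `ToyRegistry`, and the representation of memory tools as plain strings are hypothetical and do not reflect the real Agent or AgentOrchestrator classes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: an id plus a list of tools."""
    id: str
    functions: list = field(default_factory=list)


class ToyRegistry:
    """Hypothetical registry mirroring the documented side effects."""

    def __init__(self, memory_tools=None):
        self.agents = {}
        self.current_id = None
        self.memory_tools = list(memory_tools or [])

    def register(self, agent: ToyAgent) -> None:
        # 1. Add memory tools (if any are configured) without duplicating them.
        for tool in self.memory_tools:
            if tool not in agent.functions:
                agent.functions.append(tool)
        # 2. Update the registry.
        self.agents[agent.id] = agent
        # 3. The first registered agent becomes the current one.
        if self.current_id is None:
            self.current_id = agent.id


registry = ToyRegistry(memory_tools=["save_memory"])
registry.register(ToyAgent(id="helper"))
registry.register(ToyAgent(id="reviewer"))
print(registry.current_id, sorted(registry.agents))
```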

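run(prompt, context_variables, messages, agent_id, stream, apply_functions, print_formatted_prompt, use_instructed_prompt, conversation_name_holder, mention_last_turn, reinvoke_after_function, reinvoked_runtime, streamer_buffer)

Synchronous wrapper for create_response.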

Main synchronous interface for generating agent responses. Handles both streaming and non-streaming modes, with full support for function calling and agent orchestration.

Parameters

prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
stream – Whether to stream the response (True) or return complete (False).
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional buffer for streaming chunks.

Returns

Generator[StreamingResponseType] if stream=True, ResponseResult if stream=False.

Example

>>> # Streaming mode
>>> for chunk in calute.run(prompt="Hello", stream=True):
...     if chunk.content:
...         print(chunk.content, end="")
>>>
>>> # Non-streaming mode
>>> result = calute.run(prompt="Hello", stream=False)
>>> print(result.content)
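
A synchronous wrapper over an async generator can be built in several ways. The sketch below shows one common approach: drive the generator step by step on a private event loop. It is not taken from Calute's source, and `fake_stream` and `toy_run` are hypothetical names used only to show the stream/non-stream split.

```python
import asyncio


async def fake_stream(prompt: str):
    """Hypothetical async source that yields one word at a time."""
    for word in prompt.split():
        await asyncio.sleep(0)
        yield word


def toy_run(prompt: str, stream: bool = True):
    """Return a generator of chunks if stream=True, else the full text."""
    if not stream:
        async def collect():
            return " ".join([word async for word in fake_stream(prompt)])
        return asyncio.run(collect())

    def generator():
        loop = asyncio.new_event_loop()
        agen = fake_stream(prompt)
        try:
            while True:
                try:
                    # Advance the async generator by exactly one item.
                    yield loop.run_until_complete(agen.__anext__())
                except StopAsyncIteration:
                    break
        finally:
            # Clean up even if the caller stops iterating early.
            loop.run_until_complete(agen.aclose())
            loop.close()

    return generator()


streamed = list(toy_run("one two three"))
complete = toy_run("one two three", stream=False)
print(streamed, complete)
```

This sketch assumes it is called from code that is not already running inside an event loop.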

thread_run(prompt: str | None = None, context_variables: dict | None = None, messages: calute.types.messages.MessagesHistory | None = None, agent_id: str | None | calute.types.agent_types.Agent = None, apply_functions: bool = True, print_formatted_prompt: bool = False, use_instructed_prompt: bool = False, conversation_name_holder: str = 'Messages', mention_last_turn: bool = True, reinvoke_after_function: bool = True, reinvoked_runtime: bool = False, streamer_buffer: calute.core.streamer_buffer.StreamerBuffer | None = None) → tuple[calute.core.streamer_buffer.StreamerBuffer, threading.Thread][source]#

Run Calute in a background thread with automatic buffer creation.

Returns immediately with a StreamerBuffer and the thread handle. You can start consuming from the buffer while generation is happening. This is useful for non-blocking execution in synchronous contexts.

Parameters

prompt – Optional user prompt to process.
context_variables – Optional context variables for the agent.
messages – Optional message history.
agent_id – Optional specific agent ID or Agent instance to use.
apply_functions – Whether to execute detected function calls.
print_formatted_prompt – Whether to print the formatted prompt.
use_instructed_prompt – Whether to use instructed prompt format.
conversation_name_holder – Name for conversation in instructed format.
mention_last_turn – Whether to mention last turn in instructed format.
reinvoke_after_function – Whether to reinvoke after function execution.
reinvoked_runtime – Internal flag indicating this is a reinvocation.
streamer_buffer – Optional pre-created buffer (creates new if None).

Returns

A tuple of (StreamerBuffer, Thread), where:

StreamerBuffer – Buffer that will receive all streaming chunks.
Thread – The background thread handle for monitoring/joining.

Return type

tuple[StreamerBuffer, threading.Thread]

Example

>>> buffer, thread = calute.thread_run(prompt="Hello")
>>> for chunk in buffer.stream():
...     print(chunk.content, end="")
>>> thread.join()
>>>
>>> # Retrieve the final result from the buffer
>>> result = buffer.get_result(timeout=30)
>>> print(result.content)
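
The background-thread pattern described above is essentially a producer thread feeding a thread-safe queue. The following is a minimal sketch, assuming a much simpler buffer than the real StreamerBuffer; `ToyBuffer`, `toy_thread_run`, and the `_DONE` sentinel are hypothetical.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the stream


class ToyBuffer:
    """Hypothetical stand-in for a streaming buffer backed by queue.Queue."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, chunk):
        self._queue.put(chunk)

    def close(self):
        self._queue.put(_DONE)

    def stream(self):
        # Block until the next chunk arrives; stop at the sentinel.
        while True:
            item = self._queue.get()
            if item is _DONE:
                return
            yield item


def toy_thread_run(words):
    """Start generation in a background thread and return immediately."""
    buffer = ToyBuffer()

    def worker():
        try:
            for word in words:
                buffer.put(word)
        finally:
            buffer.close()  # always unblock the consumer

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return buffer, thread


buffer, thread = toy_thread_run(["Hello", ", ", "world"])
received = list(buffer.stream())
thread.join()
print("".join(received))
```

Closing the buffer in a `finally` block ensures the consumer loop terminates even if the worker raises.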

class calute.calute.PromptSection(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

Enumeration of different sections in a structured prompt.

This enum defines the standard sections that can be included in a structured prompt template, allowing for consistent prompt organization across different agents and use cases.

SYSTEM#: System-level instructions and configuration.

PERSONA#: Agent personality and role definition.

RULES#: Behavioral rules and constraints for the agent.

FUNCTIONS#: Available function/tool definitions.

TOOLS#: Tool usage instructions and format specifications.

EXAMPLES#: Example interactions for few-shot learning.

CONTEXT#: Contextual information and variables.

HISTORY#: Conversation history from previous turns.

PROMPT#: The actual user prompt/query.

Example

>>> template = PromptTemplate(
...     sections={PromptSection.SYSTEM: "INSTRUCTIONS:"},
...     section_order=[PromptSection.SYSTEM, PromptSection.PROMPT]
... )

CONTEXT = 'context'#

EXAMPLES = 'examples'#

FUNCTIONS = 'functions'#

HISTORY = 'history'#

PERSONA = 'persona'#

PROMPT = 'prompt'#

RULES = 'rules'#

SYSTEM = 'system'#

TOOLS = 'tools'#

class calute.calute.PromptTemplate(sections: dict[calute.core.prompt_template.PromptSection, str] | None = None, section_order: list[calute.core.prompt_template.PromptSection] | None = None)[source]#

Bases: object

Configurable template for structuring agent prompts.

This class provides a flexible way to structure prompts with different sections that can be customized or reordered based on requirements.

sections#

Dictionary mapping PromptSection enums to their header strings.

Type: dict[calute.core.prompt_template.PromptSection, str] | None

section_order#

List defining the order in which sections appear in the prompt.

Type: list[calute.core.prompt_template.PromptSection] | None

Example

>>> template = PromptTemplate(
...     sections={PromptSection.SYSTEM: "INSTRUCTIONS:"},
...     section_order=[PromptSection.SYSTEM, PromptSection.PROMPT]
... )
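
How section headers and section order could combine into a final prompt is sketched below. This is an illustration under stated assumptions, not the library's rendering logic: the `Section` enum, the `render` function, the default header rule, and the blank-line separator are all hypothetical.

```python
from enum import Enum


class Section(Enum):
    """Hypothetical subset of prompt sections."""
    SYSTEM = "system"
    RULES = "rules"
    PROMPT = "prompt"


def render(headers: dict, order: list, content: dict) -> str:
    """Join the sections named in `order`, each under its header."""
    parts = []
    for section in order:
        body = content.get(section)
        if not body:
            continue  # skip sections with nothing to say
        # Assumed fallback: upper-cased enum value when no header is given.
        header = headers.get(section, section.value.upper() + ":")
        parts.append(f"{header}\n{body}")
    return "\n\n".join(parts)


text = render(
    headers={Section.SYSTEM: "INSTRUCTIONS:"},
    order=[Section.SYSTEM, Section.PROMPT],
    content={
        Section.SYSTEM: "Be concise.",
        Section.PROMPT: "Hello!",
        Section.RULES: "Not rendered, because RULES is absent from the order.",
    },
)
print(text)
```

In this sketch, a section that is missing from the order list is dropped entirely, which is one way a template can be used to trim a prompt.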

section_order: list[calute.core.prompt_template.PromptSection] | None = None#

sections: dict[calute.core.prompt_template.PromptSection, str] | None = None#
