calute.context.compaction_strategies#

Context compaction strategies for managing conversation history.

This module provides various strategies for compacting conversation history when context length exceeds token limits. Each strategy implements a different approach to reducing context size while preserving relevant information.

The strategies range from simple truncation to intelligent summarization using LLM capabilities, allowing for flexible context management based on requirements and available resources.

Key Components:
  • BaseCompactionStrategy: Abstract base class defining the interface

  • SummarizationStrategy: LLM-based conversation summarization

  • SlidingWindowStrategy: Recent message retention with window

  • PriorityBasedStrategy: Importance-based message selection

  • SummarizationStrategy: LLM-based summarization

  • TruncateStrategy: Simple truncation for emergency cases

Example

>>> from calute.context import get_compaction_strategy
>>> from calute.types import CompactionStrategy
>>> strategy = get_compaction_strategy(
...     strategy=CompactionStrategy.SLIDING_WINDOW,
...     target_tokens=4000,
...     model="gpt-4"
... )
>>> compacted, stats = strategy.compact(messages)
class calute.context.compaction_strategies.BaseCompactionStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#

Bases: ABC

Base class for context compaction strategies.

Provides the foundational interface and common functionality for all compaction strategies. Subclasses must implement the compact() method to define their specific compaction logic.

target_tokens#

Target number of tokens after compaction.

model#

Model name for accurate token counting.

preserve_system#

Whether to preserve system messages during compaction.

preserve_recent#

Number of recent messages to always preserve.

token_counter#

SmartTokenCounter instance for token counting.

abstract compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#

Compact the message history.

Parameters
  • messages – List of message dictionaries

  • metadata – Optional metadata about messages

Returns

Tuple of (compacted_messages, compaction_stats)

class calute.context.compaction_strategies.PriorityBasedStrategy(priority_scorer: collections.abc.Callable | None = None, **kwargs)[source]#

Bases: BaseCompactionStrategy

Compaction strategy based on message priority and importance.

Scores messages based on their importance and retains high-priority messages while removing lower-priority ones. This allows for more intelligent compaction that preserves critical conversation elements.

priority_scorer#

Callable that scores message priority (0-1).

compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#

Compact messages based on priority.

Parameters
  • messages – List of message dictionaries

  • metadata – Optional metadata with priority info

Returns

Compacted messages and statistics

class calute.context.compaction_strategies.SlidingWindowStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#

Bases: BaseCompactionStrategy

Compaction strategy that keeps only recent messages.

Implements a sliding window approach where older messages are progressively removed to stay within token limits, while always preserving the most recent messages for context continuity.

This strategy is efficient and doesn’t require an LLM client, making it suitable for cost-sensitive applications.

compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#

Compact messages using sliding window.

Parameters
  • messages – List of message dictionaries

  • metadata – Optional metadata

Returns

Compacted messages and statistics

class calute.context.compaction_strategies.SummarizationStrategy(llm_client: Any | None = None, **kwargs)[source]#

Bases: BaseCompactionStrategy

Compaction strategy that uses LLM to summarize older messages.

This strategy leverages an LLM client to intelligently summarize older portions of conversation history, creating a condensed representation that preserves key information while reducing token count significantly.

llm_client#

LLM client instance for generating summaries.

compaction_agent#

Optional compaction agent for advanced summarization.

compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#

Compact messages using summarization.

Parameters
  • messages – List of message dictionaries

  • metadata – Optional metadata

Returns

Compacted messages and statistics

class calute.context.compaction_strategies.TruncateStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#

Bases: BaseCompactionStrategy

Simple truncation strategy for emergency compaction.

Provides a straightforward truncation approach that removes older messages and truncates long message content. This is the simplest and fastest strategy, suitable when more sophisticated approaches are not needed or available.

Does not require an LLM client and has minimal computational overhead, making it ideal for resource-constrained situations.

compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#

Compact messages using simple truncation.

Parameters
  • messages – List of message dictionaries

  • metadata – Optional metadata

Returns

Compacted messages and statistics

calute.context.compaction_strategies.get_compaction_strategy(strategy: CompactionStrategy, target_tokens: int, model: str = 'gpt-4', llm_client: Any | None = None, **kwargs) BaseCompactionStrategy[source]#

Factory function to get a compaction strategy.

Parameters
  • strategy – The compaction strategy enum

  • target_tokens – Target number of tokens

  • model – Model name for token counting

  • llm_client – Optional LLM client

  • **kwargs – Additional strategy-specific arguments

Returns

Compaction strategy instance