calute.context.compaction_strategies#
Context compaction strategies for managing conversation history.
This module provides various strategies for compacting conversation history when context length exceeds token limits. Each strategy implements a different approach to reducing context size while preserving relevant information.
The strategies range from simple truncation to intelligent summarization using LLM capabilities, allowing for flexible context management based on requirements and available resources.
- Key Components:
BaseCompactionStrategy: Abstract base class defining the interface
SummarizationStrategy: LLM-based conversation summarization
SlidingWindowStrategy: Recent message retention with window
PriorityBasedStrategy: Importance-based message selection
SummarizationStrategy: LLM-based summarization
TruncateStrategy: Simple truncation for emergency cases
Example
>>> from calute.context import get_compaction_strategy
>>> from calute.types import CompactionStrategy
>>> strategy = get_compaction_strategy(
... strategy=CompactionStrategy.SLIDING_WINDOW,
... target_tokens=4000,
... model="gpt-4"
... )
>>> compacted, stats = strategy.compact(messages)
- class calute.context.compaction_strategies.BaseCompactionStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#
Bases:
ABCBase class for context compaction strategies.
Provides the foundational interface and common functionality for all compaction strategies. Subclasses must implement the compact() method to define their specific compaction logic.
- target_tokens#
Target number of tokens after compaction.
- model#
Model name for accurate token counting.
- preserve_system#
Whether to preserve system messages during compaction.
- preserve_recent#
Number of recent messages to always preserve.
- token_counter#
SmartTokenCounter instance for token counting.
- abstract compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#
Compact the message history.
- Parameters
messages – List of message dictionaries
metadata – Optional metadata about messages
- Returns
Tuple of (compacted_messages, compaction_stats)
- class calute.context.compaction_strategies.PriorityBasedStrategy(priority_scorer: collections.abc.Callable | None = None, **kwargs)[source]#
Bases:
BaseCompactionStrategyCompaction strategy based on message priority and importance.
Scores messages based on their importance and retains high-priority messages while removing lower-priority ones. This allows for more intelligent compaction that preserves critical conversation elements.
- priority_scorer#
Callable that scores message priority (0-1).
- compact(messages: list[dict[str, str]], metadata: dict[str, Any] | None = None) tuple[list[dict[str, str]], dict[str, Any]][source]#
Compact messages based on priority.
- Parameters
messages – List of message dictionaries
metadata – Optional metadata with priority info
- Returns
Compacted messages and statistics
- class calute.context.compaction_strategies.SlidingWindowStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#
Bases:
BaseCompactionStrategyCompaction strategy that keeps only recent messages.
Implements a sliding window approach where older messages are progressively removed to stay within token limits, while always preserving the most recent messages for context continuity.
This strategy is efficient and doesn’t require an LLM client, making it suitable for cost-sensitive applications.
- class calute.context.compaction_strategies.SummarizationStrategy(llm_client: Any | None = None, **kwargs)[source]#
Bases:
BaseCompactionStrategyCompaction strategy that uses LLM to summarize older messages.
This strategy leverages an LLM client to intelligently summarize older portions of conversation history, creating a condensed representation that preserves key information while reducing token count significantly.
- llm_client#
LLM client instance for generating summaries.
- compaction_agent#
Optional compaction agent for advanced summarization.
- class calute.context.compaction_strategies.TruncateStrategy(target_tokens: int, model: str = 'gpt-4', preserve_system: bool = True, preserve_recent: int = 3)[source]#
Bases:
BaseCompactionStrategySimple truncation strategy for emergency compaction.
Provides a straightforward truncation approach that removes older messages and truncates long message content. This is the simplest and fastest strategy, suitable when more sophisticated approaches are not needed or available.
Does not require an LLM client and has minimal computational overhead, making it ideal for resource-constrained situations.
- calute.context.compaction_strategies.get_compaction_strategy(strategy: CompactionStrategy, target_tokens: int, model: str = 'gpt-4', llm_client: Any | None = None, **kwargs) BaseCompactionStrategy[source]#
Factory function to get a compaction strategy.
- Parameters
strategy – The compaction strategy enum
target_tokens – Target number of tokens
model – Model name for token counting
llm_client – Optional LLM client
**kwargs – Additional strategy-specific arguments
- Returns
Compaction strategy instance