calute.context.token_counter
Token counting utilities using provider-specific APIs.
This module provides accurate token counting for several LLM providers. It supports OpenAI (via tiktoken), Anthropic, and Google, and includes fallback mechanisms for when provider-specific libraries are unavailable.
The module includes:
ProviderTokenCounter: Low-level provider-specific token counting
SmartTokenCounter: High-level automatic provider detection and counting
Token counting is essential for managing context limits, estimating costs, and implementing effective compaction strategies.
Example
>>> from calute.context import SmartTokenCounter
>>> counter = SmartTokenCounter(model="gpt-4")
>>> token_count = counter.count_tokens("Hello, world!")
>>> remaining = counter.count_remaining_capacity("Hello", max_tokens=4096)
- class calute.context.token_counter.ProviderTokenCounter
Bases: object
Token counter that uses actual provider APIs.
Provides class-level methods for counting tokens using provider-specific implementations. Supports OpenAI (tiktoken), Anthropic, and Google, and falls back to character-based estimation when the provider libraries are unavailable.
This class is designed for internal use; prefer SmartTokenCounter for most use cases as it provides automatic provider detection.
- classmethod count_tokens_for_provider(text: str | list[dict[str, str]], provider: str | None = None, model: str | None = None, llm_client: Any | None = None) → int
Count tokens using the appropriate provider’s API.
- Parameters
text – Text string or messages list to count
provider – Provider name (openai, anthropic, etc.)
model – Model name
llm_client – Optional LLM client instance
- Returns
Token count
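The fallback behaviour described for this class can be sketched as follows. This is a minimal illustration, not the calute implementation: the helper names `estimate_tokens_by_chars` and `count_tokens_with_fallback`, the four-characters-per-token heuristic, and the way a messages list is flattened are all assumptions made for the example. Only the `tiktoken` calls (`encoding_for_model`, `encode`) are real library functions.

```python
from __future__ import annotations


def _flatten(text: str | list[dict[str, str]]) -> str:
    """Join the "content" fields of a messages list (assumed message shape)."""
    if isinstance(text, list):
        return "\n".join(message.get("content", "") for message in text)
    return text


def estimate_tokens_by_chars(
    text: str | list[dict[str, str]], chars_per_token: int = 4
) -> int:
    """Rough estimate; chars_per_token=4 is an assumed heuristic, not a calute constant."""
    flat = _flatten(text)
    if not flat:
        return 0
    # Ceiling division, so any non-empty input counts as at least one token.
    return -(-len(flat) // chars_per_token)


def count_tokens_with_fallback(
    text: str | list[dict[str, str]], model: str = "gpt-4"
) -> int:
    """Use tiktoken when it is importable and knows the model, else estimate."""
    flat = _flatten(text)
    try:
        import tiktoken  # optional dependency

        encoding = tiktoken.encoding_for_model(model)
    except (ImportError, KeyError):
        # ImportError: tiktoken is not installed.
        # KeyError: tiktoken has no tokenizer mapping for this model name.
        return estimate_tokens_by_chars(flat)
    # disallowed_special=() treats special-token strings as ordinary text.
    return len(encoding.encode(flat, disallowed_special=()))


print(estimate_tokens_by_chars("Hello, world!"))  # 13 characters -> 4
```

A character-based estimate is only a coarse approximation, so a real tokenizer is preferable whenever one is available for the model.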
- class calute.context.token_counter.SmartTokenCounter(provider: str | None = None, model: str | None = None, llm_client: Any | None = None)
Bases: object
Smart token counter that automatically selects the best counting method.
Provides a high-level interface for token counting with automatic provider detection based on model name. Supports all major LLM providers and includes utility methods for capacity calculation and compression ratio estimation.
- provider
Detected or specified provider name.
- model
Model name for token counting.
- llm_client
Optional LLM client for provider-specific counting.
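One plausible way to detect a provider from a model name is a prefix lookup. The sketch below is hypothetical: the function `detect_provider`, the prefix table, and the provider string "google" are assumptions for illustration, and the actual detection rules used by SmartTokenCounter may differ.

```python
from __future__ import annotations

# Hypothetical prefix table; the real mapping in calute is not documented here.
_MODEL_PREFIXES = (
    ("gpt-", "openai"),
    ("claude", "anthropic"),
    ("gemini", "google"),
)


def detect_provider(model: str | None) -> str | None:
    """Return a provider name for a model, or None when nothing matches."""
    if not model:
        return None
    name = model.lower()
    for prefix, provider in _MODEL_PREFIXES:
        if name.startswith(prefix):
            return provider
    return None


print(detect_provider("gpt-4"))          # openai
print(detect_provider("claude-3-opus"))  # anthropic
print(detect_provider("unknown-model"))  # None
```

Returning None for an unrecognised model leaves the caller free to fall back to a generic estimate instead of guessing a provider.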
- count_remaining_capacity(text: str | list[dict[str, str]], max_tokens: int) → int
Calculate remaining token capacity.
- Parameters
text – Current text or messages to count.
max_tokens – Maximum token limit.
- Returns
Number of tokens remaining before reaching the limit.
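The arithmetic behind a remaining-capacity check can be sketched as a subtraction against the limit. This is an illustration under stated assumptions: the helper `remaining_capacity` is hypothetical, the whitespace word counter stands in for a real tokenizer, and clamping the result at zero is a design choice of this sketch rather than documented behaviour of `count_remaining_capacity`.

```python
from typing import Callable


def remaining_capacity(
    text: str, max_tokens: int, count_tokens: Callable[[str], int]
) -> int:
    """Tokens left before max_tokens is reached (clamped at zero by assumption)."""
    used = count_tokens(text)
    return max(0, max_tokens - used)


def word_count(text: str) -> int:
    """Stand-in counter: whitespace-separated words, not real tokens."""
    return len(text.split())


print(remaining_capacity("Hello there world", 10, word_count))  # 10 - 3 = 7
print(remaining_capacity("one two three four", 3, word_count))  # clamped to 0
```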
- count_tokens(text: str | list[dict[str, str]]) → int
Count tokens using the best available method.
- Parameters
text – Text string or list of message dictionaries.
- Returns
Token count for the given input.
- estimate_compression_ratio(original_text: str, compressed_text: str) → float
Estimate the compression ratio achieved.
- Parameters
original_text – Original text before compression.
compressed_text – Text after compression.
- Returns
Compression ratio as a float between 0.0 and 1.0, where 1.0 means complete compression and 0.0 means no compression.
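A formula consistent with the documented range (0.0 for no compression, 1.0 for complete compression) is one minus the ratio of compressed to original token counts. The sketch below assumes that formula. The helper `compression_ratio`, its handling of an empty original, and the clamping to the 0.0 to 1.0 range are assumptions, not confirmed details of `estimate_compression_ratio`.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of tokens removed, assuming ratio = 1 - compressed / original."""
    if original_tokens <= 0:
        # Nothing to compress; treated as "no compression" in this sketch.
        return 0.0
    ratio = 1.0 - compressed_tokens / original_tokens
    # Clamp so that output longer than the original never yields a negative ratio.
    return min(1.0, max(0.0, ratio))


print(compression_ratio(1000, 250))   # 0.75
print(compression_ratio(1000, 1000))  # 0.0
```

Under this reading, a ratio of 0.75 means the compressed text uses a quarter of the original token count.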