calute.context.token_counter

calute.context.token_counter#

Token counting utilities using provider-specific APIs.

This module provides accurate token counting for various LLM providers. It supports OpenAI (via tiktoken), Anthropic, Google, and includes fallback mechanisms when provider-specific libraries are unavailable.

The module includes:

ProviderTokenCounter: Low-level provider-specific token counting
SmartTokenCounter: High-level automatic provider detection and counting

Token counting is essential for managing context limits, estimating costs, and implementing effective compaction strategies.

Example

>>> from calute.context import SmartTokenCounter
>>> counter = SmartTokenCounter(model="gpt-4")
>>> token_count = counter.count_tokens("Hello, world!")
>>> remaining = counter.count_remaining_capacity("Hello", max_tokens=4096)

class calute.context.token_counter.ProviderTokenCounter[source]#

Bases: object

Token counter that uses actual provider APIs.

Provides static methods for counting tokens using provider-specific implementations. Supports OpenAI (tiktoken), Anthropic, and Google, with fallback to character-based estimation when libraries are unavailable.

This class is designed for internal use; prefer SmartTokenCounter for most use cases as it provides automatic provider detection.

classmethod count_tokens_for_provider(text: str | list[dict[str, str]], provider: str | None = None, model: str | None = None, llm_client: Any | None = None) → int[source]#

Count tokens using the appropriate provider’s API.

Parameters

text – Text string or messages list to count
provider – Provider name (openai, anthropic, etc.)
model – Model name
llm_client – Optional LLM client instance

Returns

Token count

class calute.context.token_counter.SmartTokenCounter(provider: str | None = None, model: str | None = None, llm_client: Any | None = None)[source]#

Bases: object

Smart token counter that automatically selects the best counting method.

Provides a high-level interface for token counting with automatic provider detection based on model name. Supports all major LLM providers and includes utility methods for capacity calculation and compression ratio estimation.

provider#: Detected or specified provider name.

model#: Model name for token counting.

llm_client#: Optional LLM client for provider-specific counting.

count_remaining_capacity(text: str | list[dict[str, str]], max_tokens: int) → int[source]#

Calculate remaining token capacity.

Parameters

text – Current text or messages to count.
max_tokens – Maximum token limit.

Returns

Number of tokens remaining before reaching the limit.

count_tokens(text: str | list[dict[str, str]]) → int[source]#

Count tokens using the best available method.

Parameters: text – Text string or list of message dictionaries.
Returns: Token count for the given input.

estimate_compression_ratio(original_text: str, compressed_text: str) → float[source]#

Estimate the compression ratio achieved.

Parameters

original_text – Original text before compression.
compressed_text – Text after compression.

Returns

Compression ratio as a float between 0.0 and 1.0, where 1.0 means complete compression and 0.0 means no compression.

calute.context.token_counter

Contents

calute.context.token_counter#