calute.tools.data_tools

Data processing and manipulation tools for Calute agents.

This module provides a comprehensive set of data processing tools for the Calute framework. It includes:

- JSON data processing with load, save, query, and validation operations
- CSV file processing with read, write, analyze, and convert capabilities
- Advanced text processing with statistics, extraction, and formatting
- Data format conversion between JSON, YAML, Base64, Hex, and hashes
- Date and time processing with parsing, formatting, and delta calculations

Each tool is implemented as a class inheriting from AgentBaseFn, so it can be used directly as an agent tool for data manipulation tasks.

Example

>>> processor = JSONProcessor()
>>> result = processor(operation="load", file_path="data.json")
>>> print(result["data"])

class calute.tools.data_tools.CSVProcessor(name, bases, namespace, /, **kwargs)

Bases: AgentBaseFn

CSV data processing and manipulation tool.

Provides operations for reading, writing, analyzing, and converting CSV files. Supports custom delimiters, headers, and row limits.

Supported operations:

- read: Read CSV file into a list of dictionaries.
- write: Write list of dictionaries to a CSV file.
- analyze: Get statistics about a CSV file structure.
- convert: Convert CSV data to JSON format.

static static_call(operation: str, file_path: str | None = None, data: list[dict] | None = None, delimiter: str = ',', headers: list[str] | None = None, has_header: bool = True, max_rows: int | None = None, **context_variables) → dict[str, Any]

Process CSV data with various operations.

Performs read, write, analyze, or convert operations on CSV files. Supports custom delimiters, header configuration, and row limits.

Parameters

operation – The operation to perform. Options:
- “read”: Read a CSV file into a list of dictionaries. Requires file_path.
- “write”: Write a list of dictionaries to a CSV file. Requires file_path and data.
- “analyze”: Get structural statistics about a CSV file including row/column counts, headers, sample data, and empty cell count. Requires file_path.
- “convert”: Convert a CSV file to a list of JSON-like dictionaries. Requires file_path.
file_path – Path to the CSV file for read/write/analyze/convert.
data – List of dictionaries to write. Each dict represents a row with column names as keys. Required for the “write” operation.
delimiter – Column delimiter character. Defaults to “,”.
headers – Explicit column headers. For “write”, used as fieldnames; if not provided, inferred from the first data dict. For “read” with has_header=False, used as the column names.
has_header – Whether the CSV file’s first row is a header row. If False and no headers are provided, columns are auto-named as “col_0”, “col_1”, etc. Defaults to True.
max_rows – Maximum number of rows to read. None reads all rows.
**context_variables – Runtime context from the agent (unused).

Returns

A dictionary containing operation-specific results:
- For “read”: data (list[dict]), count (int), columns (list[str]).
- For “write”: success (bool), rows_written (int), file_path (str).
- For “analyze”: total_rows, total_columns, headers, sample_data, empty_cells.
- For “convert”: json (list[dict]), count (int).
- error (str): Error message if the operation failed.

Return type

dict[str, Any]

Example

>>> result = CSVProcessor.static_call("read", file_path="data.csv", max_rows=5)
>>> print(result["count"])
5
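The interaction between has_header, headers, and max_rows described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the CSVProcessor implementation: read_csv_rows is a hypothetical helper, and it takes CSV text in memory instead of a file_path so that it runs without any files on disk.

```python
import csv
import io


def read_csv_rows(text, delimiter=",", headers=None, has_header=True, max_rows=None):
    """Hypothetical helper that mimics the documented "read" semantics."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header and rows:
        # The first row supplies the column names.
        columns, body = rows[0], rows[1:]
    else:
        # No header row: use explicit headers, else auto-name col_0, col_1, ...
        width = len(rows[0]) if rows else 0
        columns = headers or [f"col_{i}" for i in range(width)]
        body = rows
    if max_rows is not None:
        body = body[:max_rows]
    data = [dict(zip(columns, row)) for row in body]
    return {"data": data, "count": len(data), "columns": list(columns)}


sample = "1;alice\n2;bob\n3;carol\n"
result = read_csv_rows(sample, delimiter=";", has_header=False, max_rows=2)
print(result["columns"])  # ['col_0', 'col_1']
print(result["count"])    # 2
```

The result mirrors the documented “read” keys (data, count, columns); whether the real tool applies max_rows before or after header handling is not stated in this page.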

class calute.tools.data_tools.DataConverter(name, bases, namespace, /, **kwargs)

Bases: AgentBaseFn

Convert data between different formats.

Provides conversion between various data formats including JSON, YAML, Base64, hexadecimal, and cryptographic hashes. Supports bidirectional conversion where applicable.

Supported formats:

- json: JSON string format.
- yaml: YAML format (requires PyYAML).
- base64: Base64 encoded string.
- hex: Hexadecimal string representation.
- hash: Generate MD5, SHA1, SHA256, and SHA512 hashes (output only).

static static_call(data: Any, from_format: str, to_format: str, encoding: str = 'utf-8', **context_variables) → dict[str, Any]

Convert data between different formats.

First parses the input data from the source format into an intermediate Python object, then serializes it to the target format.

Parameters

data – Input data to convert. Can be a string (for json, yaml, base64, hex source formats) or a Python object (dict, list).
from_format – Source format of the data. Options:
- “json”: JSON string or Python dict/list.
- “yaml”: YAML string or Python object. Requires PyYAML.
- “base64”: Base64-encoded string.
- “hex”: Hexadecimal-encoded string.
to_format – Target format to convert to. Options:
- “json”: Pretty-printed JSON string.
- “yaml”: YAML string. Requires PyYAML.
- “base64”: Base64-encoded string.
- “hex”: Hexadecimal string.
- “hash”: Dictionary of cryptographic hashes (MD5, SHA1, SHA256, SHA512). Output only.
encoding – Character encoding for encoding/decoding operations. Defaults to “utf-8”.
**context_variables – Runtime context from the agent (unused).

Returns

A dictionary containing:
- output: The converted data in the target format. For “hash” target, this is a dict with md5, sha1, sha256, and sha512 hex digest strings.
- success (bool): True if conversion succeeded.
- error (str): Error message if the conversion failed.

Return type

dict[str, Any]

Example

>>> result = DataConverter.static_call(
...     '{"key": "value"}', from_format="json", to_format="base64"
... )
>>> print(result["success"])
True
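The two-step flow described above (parse the source into an intermediate Python object, then serialize it to the target) can be sketched as follows. This is an illustrative stand-in, not the DataConverter source: convert is a hypothetical function, YAML is left out to avoid the PyYAML dependency, and serializing non-string objects with json.dumps before encoding is an assumption.

```python
import base64
import hashlib
import json


def convert(data, from_format, to_format, encoding="utf-8"):
    """Hypothetical two-step converter: parse first, then serialize."""
    # Step 1: parse the input into an intermediate Python object.
    if from_format == "json":
        obj = json.loads(data) if isinstance(data, str) else data
    elif from_format == "base64":
        obj = base64.b64decode(data).decode(encoding)
    elif from_format == "hex":
        obj = bytes.fromhex(data).decode(encoding)
    else:
        return {"error": f"Unsupported source format: {from_format}"}

    # Step 2: serialize the intermediate object to the target format.
    # Assumption: non-string objects are turned into JSON text before encoding.
    text = obj if isinstance(obj, str) else json.dumps(obj)
    raw = text.encode(encoding)
    if to_format == "json":
        output = json.dumps(obj, indent=2)
    elif to_format == "base64":
        output = base64.b64encode(raw).decode("ascii")
    elif to_format == "hex":
        output = raw.hex()
    elif to_format == "hash":
        names = ("md5", "sha1", "sha256", "sha512")
        output = {name: hashlib.new(name, raw).hexdigest() for name in names}
    else:
        return {"error": f"Unsupported target format: {to_format}"}
    return {"output": output, "success": True}


encoded = convert("68656c6c6f", from_format="hex", to_format="base64")
print(encoded["output"])  # aGVsbG8=
digests = convert('{"key": "value"}', from_format="json", to_format="hash")
print(sorted(digests["output"]))  # ['md5', 'sha1', 'sha256', 'sha512']
```

Because “hash” appears only in the serialization step, it is naturally output-only, which matches the note in the format list.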

class calute.tools.data_tools.DateTimeProcessor(name, bases, namespace, /, **kwargs)

Bases: AgentBaseFn

Date and time processing utilities.

Provides operations for parsing, formatting, and manipulating dates and times. Supports multiple date formats and time delta calculations.

Supported operations:

- now: Get current date and time in various formats.
- parse: Parse a date string into components.
- delta: Add or subtract time from a date.
- format: Format a date in various output styles.

static static_call(operation: str, date_string: str | None = None, format: str | None = None, timezone: str | None = None, delta_days: int = 0, delta_hours: int = 0, delta_minutes: int = 0, **context_variables) → dict[str, Any]

Process dates and times with various operations.

Provides operations for getting the current time, parsing date strings, computing time deltas, and formatting dates in various output styles.

Parameters

operation – The operation to perform. Options:
- “now”: Get current date and time in multiple formats.
- “parse”: Parse a date string into components. Tries common formats automatically; use format for a specific strptime format. Falls back to dateutil if available.
- “delta”: Add or subtract time from a date. Uses date_string as the base (defaults to now).
- “format”: Format a date in various output styles. Uses date_string as input (defaults to now). If format is provided, uses it as a strftime pattern; otherwise returns all common formats.
date_string – Date string to parse, to use as the base for a delta, or to format. Expected to be in ISO format for delta/format operations. For parse, accepts many common formats.
format – Explicit strftime/strptime format pattern. For “parse”, used as the preferred parsing format. For “format”, used as the output format pattern.
timezone – Timezone name. Currently reserved for future use.
delta_days – Number of days to add (positive) or subtract (negative) from the base date. Defaults to 0.
delta_hours – Number of hours to add or subtract. Defaults to 0.
delta_minutes – Number of minutes to add or subtract. Defaults to 0.
**context_variables – Runtime context from the agent (unused).

Returns

A dictionary containing operation-specific results:
- For “now”: datetime (ISO), timestamp, formatted (dict with date, time, datetime, iso, human keys).
- For “parse”: parsed (ISO), timestamp, components (dict with year, month, day, hour, minute, second, weekday).
- For “delta”: original (ISO), new (ISO), delta (dict with days, hours, minutes, total_seconds).
- For “format”: formats (dict of format name to value), or formatted (str) when a specific format is provided.
- error (str): Error message if the operation failed.

Return type

dict[str, Any]

Example

>>> result = DateTimeProcessor.static_call("parse", date_string="2024-01-15")
>>> print(result["components"]["weekday"])
Monday
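The “delta” operation can be pictured as plain datetime arithmetic on an ISO-formatted base date. The sketch below is an assumption-labeled illustration using only the standard library; apply_delta is a hypothetical helper, and the exact shape of the real tool's delta dictionary may differ.

```python
from datetime import datetime, timedelta


def apply_delta(date_string, delta_days=0, delta_hours=0, delta_minutes=0):
    """Hypothetical helper mirroring the documented "delta" result keys."""
    base = datetime.fromisoformat(date_string)
    # Negative values subtract time, positive values add it.
    delta = timedelta(days=delta_days, hours=delta_hours, minutes=delta_minutes)
    return {
        "original": base.isoformat(),
        "new": (base + delta).isoformat(),
        "delta": {
            "days": delta_days,
            "hours": delta_hours,
            "minutes": delta_minutes,
            "total_seconds": delta.total_seconds(),
        },
    }


result = apply_delta("2024-01-15T09:30:00", delta_days=1, delta_hours=-2, delta_minutes=45)
print(result["new"])                     # 2024-01-16T08:15:00
print(result["delta"]["total_seconds"])  # 81900.0
```

Mixed signs are combined into a single offset, so one day minus two hours plus 45 minutes moves the base forward by 81,900 seconds.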

class calute.tools.data_tools.JSONProcessor(name, bases, namespace, /, **kwargs)

Bases: AgentBaseFn

JSON data processing and manipulation tool.

Provides operations for loading, saving, validating, querying, and transforming JSON data. Supports both file-based and in-memory JSON operations with simple dot-notation queries.

Supported operations:

- load: Load JSON data from a file.
- save: Save JSON data to a file.
- validate: Check if data is valid JSON.
- query: Extract data using dot-notation paths (e.g., “user.name”).
- transform: Get metadata and formatted output of JSON data.

static static_call(operation: str, data: Any = None, file_path: str | None = None, query: str | None = None, pretty: bool = True, **context_variables) → dict[str, Any]

Process JSON data with various operations.

Performs load, save, validate, query, or transform operations on JSON data. Supports both file-based and in-memory JSON manipulation.

Parameters

operation – The operation to perform. Options:
- “load”: Load JSON from a file. Requires file_path.
- “save”: Save data to a JSON file. Requires file_path and data.
- “validate”: Check if data is valid JSON (accepts both string and object inputs).
- “query”: Extract a value from data using dot-notation paths (e.g., “user.name”, “items[0].id”). Requires query and data.
- “transform”: Get metadata about data including type, keys, length, and optionally pretty-printed output.
data – The JSON data to process. Can be a Python dict/list or a JSON string (for validate). Required for save, validate, query, and transform operations.
file_path – Path to the JSON file for load/save operations.
query – Dot-notation query path for data extraction. Supports bracket notation for array indexing (e.g., “items[0]”).
pretty – Whether to use indented formatting when saving or transforming JSON. Defaults to True.
**context_variables – Runtime context from the agent (unused).

Returns

A dictionary containing operation-specific results:
- For “load”: data, success.
- For “save”: success, file_path.
- For “validate”: valid (bool), error (if invalid).
- For “query”: result (extracted value).
- For “transform”: keys, type, length, formatted (if pretty).
- error (str): Error message if the operation failed.

Return type

dict[str, Any]

Example

>>> result = JSONProcessor.static_call("validate", data='{"key": 1}')
>>> print(result["valid"])
True
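To make the dot-notation query syntax concrete, here is a small resolver for paths such as “user.name” and “items[0].id”. It is a sketch of one possible approach, not the JSONProcessor implementation: query_path and the tokenizing regular expression are hypothetical, and the wording of the error message is invented for illustration.

```python
import re

# A token is either a key (no dots or brackets) or a bracketed integer index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def query_path(data, path):
    """Hypothetical resolver for paths like "user.name" or "items[0].id"."""
    current = data
    for key, index in _TOKEN.findall(path):
        try:
            # Exactly one of the two groups is non-empty for each token.
            current = current[int(index)] if index else current[key]
        except (KeyError, IndexError, TypeError):
            return {"error": f"Path not found: {path}"}
    return {"result": current}


doc = {"user": {"name": "Ada"}, "items": [{"id": 7}, {"id": 9}]}
print(query_path(doc, "user.name"))    # {'result': 'Ada'}
print(query_path(doc, "items[1].id"))  # {'result': 9}
print(query_path(doc, "items[5].id"))  # {'error': 'Path not found: items[5].id'}
```

Returning an error entry instead of raising keeps the result shape consistent with the dictionary-based returns documented above.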

class calute.tools.data_tools.TextProcessor(name, bases, namespace, /, **kwargs)

Bases: AgentBaseFn

Advanced text processing and manipulation tool.

Provides operations for analyzing, cleaning, extracting patterns, replacing content, and formatting text. Supports regular expressions for pattern matching and extraction.

Supported operations:

- stats: Get text statistics (length, word count, character frequency).
- clean: Remove extra whitespace and optionally matched patterns.
- extract: Extract patterns like emails, URLs, phone numbers, or custom regex.
- replace: Replace patterns in text using regex.
- split: Split text by pattern or whitespace.
- format: Apply formatting (title, upper, lower, sentence case).

static static_call(text: str, operation: str, pattern: str | None = None, replacement: str | None = None, case_sensitive: bool = True, **context_variables) → dict[str, Any]

Process text with various operations.

Applies the specified text processing operation, ranging from statistical analysis to pattern-based extraction and formatting.

Parameters

text – The input text to process.
operation – The operation to perform. Options:
- “stats”: Compute text statistics including length, word count, line count, character frequency, and word frequency.
- “clean”: Remove extra whitespace and optionally remove content matching pattern.
- “extract”: Extract patterns from text. pattern can be a named shortcut (“emails”, “urls”, “phones”, “numbers”) or a custom regular expression.
- “replace”: Replace occurrences of pattern in text with replacement. Uses regex matching.
- “split”: Split text by pattern (regex) or by whitespace if no pattern is given.
- “format”: Apply text formatting. pattern specifies the format: “title”, “upper”, “lower”, “sentence”, or “no_punctuation”.
pattern – Regex pattern or named shortcut for extract/replace/split/format operations. Required for “extract” and “replace”.
replacement – Replacement string for the “replace” operation. Defaults to empty string if None.
case_sensitive – Whether pattern matching is case-sensitive. Defaults to True.
**context_variables – Runtime context from the agent (unused).

Returns

A dictionary containing operation-specific results:
- For “stats”: length, words, lines, characters_no_spaces, most_common_chars, most_common_words.
- For “clean”: cleaned_text, original_length, cleaned_length.
- For “extract”: matches (list[str]), count (int).
- For “replace”: replaced_text, replacements_made (int).
- For “split”: parts (list[str]), count (int).
- For “format”: formatted_text (str).
- error (str): Error message if the operation failed.

Return type

dict[str, Any]

Example

>>> result = TextProcessor.static_call("Hello World!", "stats")
>>> print(result["words"])
2
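The statistics returned by “stats” can be approximated with a few standard-library calls. The sketch below is only an illustration of what each key might measure: text_stats and its top_n parameter are hypothetical, and details such as whitespace-based word counting and lowercasing for word frequency are assumptions, not documented behavior.

```python
import re
from collections import Counter


def text_stats(text, top_n=3):
    """Hypothetical helper producing the documented "stats" keys."""
    visible = [ch for ch in text if not ch.isspace()]
    # Assumption: word frequency ignores case and punctuation.
    normalized = [word.lower() for word in re.findall(r"\w+", text)]
    return {
        "length": len(text),
        "words": len(text.split()),
        "lines": len(text.splitlines()),
        "characters_no_spaces": len(visible),
        "most_common_chars": Counter(visible).most_common(top_n),
        "most_common_words": Counter(normalized).most_common(top_n),
    }


stats = text_stats("Hello World!\nHello again.")
print(stats["words"])                 # 4
print(stats["lines"])                 # 2
print(stats["most_common_words"][0])  # ('hello', 2)
```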
