calute.operators.browser

Playwright-backed browser state manager for operator tooling.

Provides BrowserManager, which lazily initialises a Chromium browser via Playwright and manages a pool of tracked pages. Each page is represented by a lightweight BrowserPageState dataclass that records the page reference ID, current URL, title, and extracted links.

class calute.operators.browser.BrowserManager(*, headless: bool = True, screenshot_dir: str | None = None)

Bases: object

Manage a shared Playwright browser and tracked pages.

The manager lazily starts a Chromium browser on the first call that requires a live page. All pages opened through the manager are tracked by a generated ref_id so that subsequent operator tool calls (click, find, screenshot) can address them without re-opening.

_headless: Whether the browser runs in headless mode.

_screenshot_dir: Optional directory for screenshot output.

_playwright: Playwright instance, created lazily.

_browser: Chromium browser instance, created lazily.

_context: Default browser context.

_pages: Mapping of ref_id to live Playwright page objects.

_page_state: Mapping of ref_id to BrowserPageState.
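
The lazy-start behaviour described above can be illustrated with a small stand-in. This is a minimal sketch, not calute's implementation: `LazyPageManager`, `fake_launch`, and the `page_N` ref_id format are hypothetical names chosen for illustration, and the real manager launches Chromium through Playwright rather than calling a fake coroutine.

```python
import asyncio
import itertools
from typing import Any, Awaitable, Callable


class LazyPageManager:
    """Illustrative stand-in for the lazy-start pattern; not calute's code."""

    def __init__(self, launch: Callable[[], Awaitable[Any]]) -> None:
        self._launch = launch        # hypothetical coroutine that starts the browser
        self._browser: Any = None    # stays None until a live page is first needed
        self._pages: dict[str, str] = {}
        self._counter = itertools.count(1)

    async def _ensure_browser(self) -> Any:
        # Start the expensive resource only once, on first use.
        if self._browser is None:
            self._browser = await self._launch()
        return self._browser

    async def open(self, url: str) -> str:
        await self._ensure_browser()
        ref_id = f"page_{next(self._counter)}"  # hypothetical ref_id format
        self._pages[ref_id] = url               # track the page under its ref_id
        return ref_id


launch_calls: list[str] = []


async def fake_launch() -> str:
    # Stand-in for launching Chromium; records each call.
    launch_calls.append("launched")
    return "fake-browser"


async def demo() -> list[str]:
    manager = LazyPageManager(fake_launch)
    first = await manager.open("https://example.com/a")
    second = await manager.open("https://example.com/b")
    return [first, second]
```

Constructing the manager does not call `fake_launch`; only the first `open` does, and the second `open` reuses the same browser object. The sketch does not guard against two concurrent first calls, which a production manager might need to handle.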

async click(ref_id: str, *, link_id: int | None = None, selector: str | None = None, text: str | None = None, wait_ms: int = 500) → dict[str, Any]

Click an element on a tracked page.

Exactly one of link_id, selector, or text must be provided to identify the target element.

Parameters

ref_id – Reference identifier of the tracked page.
link_id – Numeric link identifier from the page’s BrowserPageState.link_map. When provided, the browser navigates to the corresponding href.
selector – CSS selector of the element to click.
text – Visible text used to locate the element via Playwright’s get_by_text.
wait_ms – Milliseconds to wait after the click before refreshing the page metadata. Defaults to 500.

Returns

The refreshed page metadata dictionary (same shape as open()).

Raises

ValueError – If the ref_id is unknown, the link_id is not found, or none of the three target parameters is provided.
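
The target-selection rules above can be sketched as a pure function. This is an illustration under stated assumptions, not the library's code: `resolve_click_target` and the action labels are hypothetical, and the precedence used when more than one target is supplied (link_id, then selector, then text) is an assumption, since the documentation only specifies the error raised when none is given.

```python
from typing import Optional


def resolve_click_target(
    link_map: dict[int, str],
    *,
    link_id: Optional[int] = None,
    selector: Optional[str] = None,
    text: Optional[str] = None,
) -> tuple[str, str]:
    """Return an (action, value) pair describing how the click would be performed."""
    if link_id is not None:
        # A link_id is resolved through the page's link_map and navigated to.
        if link_id not in link_map:
            raise ValueError(f"Unknown link_id: {link_id}")
        return ("goto", link_map[link_id])
    if selector is not None:
        return ("click_selector", selector)  # CSS selector click
    if text is not None:
        return ("click_text", text)          # visible-text lookup
    raise ValueError("Provide one of link_id, selector, or text")


links = {1: "https://example.com/docs", 2: "https://example.com/blog"}
by_link = resolve_click_target(links, link_id=2)
by_selector = resolve_click_target(links, selector="button.submit")
```

Keeping the resolution step separate from the browser call makes the validation testable without a live page.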

async find(ref_id: str, pattern: str) → dict[str, Any]

Find text matches on a tracked page.

Performs a case-insensitive regular expression search across the visible body text of the referenced page.

Parameters

ref_id – Reference identifier of the tracked page to search.
pattern – Regular expression pattern to match against the page’s visible text content.

Returns

A dictionary with the ref_id, the pattern used, the total match_count, and up to 20 matching strings.

Raises

ValueError – If the ref_id does not correspond to a tracked page.
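
The search semantics can be approximated with the standard `re` module. This is a minimal sketch operating on a plain string rather than a live page; `find_in_text` is a hypothetical helper, and the `matches` key name is an assumption, as the documentation names only ref_id, pattern, and match_count.

```python
import re
from typing import Any


def find_in_text(ref_id: str, body_text: str, pattern: str, limit: int = 20) -> dict[str, Any]:
    """Case-insensitive regex search returning the total count and capped matches."""
    all_matches = [m.group(0) for m in re.finditer(pattern, body_text, flags=re.IGNORECASE)]
    return {
        "ref_id": ref_id,
        "pattern": pattern,
        "match_count": len(all_matches),  # total number of matches found
        "matches": all_matches[:limit],   # at most `limit` strings; key name assumed
    }


result = find_in_text(
    "page_1",
    "Error: disk full. error: retry. ERROR: abort.",
    r"error: \w+",
)
```

Here `result["match_count"]` is 3 and the matched strings keep their original casing, because the flag affects matching only.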

list_pages() → list[dict[str, str]]

Return summaries for tracked pages.

Returns

A list of dictionaries, each containing the ref_id, url, and title of a tracked page, sorted by ref_id.
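
The shape of that return value can be sketched with plain dictionaries. `summarize_pages` is a hypothetical helper, and plain string ordering of ref_id is assumed.

```python
def summarize_pages(page_state: dict[str, dict[str, str]]) -> list[dict[str, str]]:
    """Build one summary per tracked page, ordered by ref_id."""
    return [
        {"ref_id": ref_id, "url": state["url"], "title": state["title"]}
        for ref_id, state in sorted(page_state.items())  # sorts on the ref_id key
    ]


tracked = {
    "page_2": {"url": "https://example.org", "title": "Org"},
    "page_1": {"url": "https://example.com", "title": "Com"},
}
summaries = summarize_pages(tracked)
```

With string ordering, a ref_id such as `page_10` would sort before `page_2`.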

async open(*, url: str | None = None, ref_id: str | None = None, wait_ms: int = 500) → dict[str, Any]

Open a URL or inspect an existing tracked page.

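At least one of url or ref_id must be provided. When url is given alone, a new tracked page is created and navigated to it; when url is given together with ref_id, that existing page is navigated to the URL. In both cases the page's metadata is returned. When only ref_id is given, the currently loaded page is re-inspected without navigating.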

Parameters

url – URL to navigate to. A new tracked page is created when no ref_id is supplied alongside the URL.
ref_id – Reference identifier of a previously opened page to re-inspect without navigating.
wait_ms – Milliseconds to wait after navigation before extracting page metadata. Defaults to 500.

Returns

A dictionary containing the page ref_id, current URL, title, a truncated content preview (first 2000 characters), and a list of extracted links with numeric IDs.

Raises

ValueError – If neither url nor ref_id is provided, or if the given ref_id does not match any tracked page.
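
The returned metadata can be pictured with a small builder. This is a sketch only: `build_page_metadata` is hypothetical, the key names `content`, `links`, `id`, and `href` are assumptions, and numbering links from 1 is also assumed. Only the 2000-character preview length comes from the description above.

```python
from typing import Any

PREVIEW_CHARS = 2000  # preview length stated in the description


def build_page_metadata(
    ref_id: str, url: str, title: str, body_text: str, hrefs: list[str]
) -> dict[str, Any]:
    """Assemble a metadata dictionary shaped like the documented return value."""
    # Numeric IDs let a later click address a link by number (start at 1: assumed).
    link_map = {i: href for i, href in enumerate(hrefs, start=1)}
    return {
        "ref_id": ref_id,
        "url": url,
        "title": title,
        "content": body_text[:PREVIEW_CHARS],  # truncated preview
        "links": [{"id": i, "href": href} for i, href in link_map.items()],
    }


meta = build_page_metadata(
    "page_1",
    "https://example.com",
    "Example",
    "x" * 5000,
    ["https://example.com/a", "https://example.com/b"],
)
```

The numeric IDs in `links` correspond to the link_id values accepted by click().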

async screenshot(ref_id: str, *, path: str | None = None, full_page: bool = True) → dict[str, Any]

Capture a screenshot of a tracked page.

Parameters

ref_id – Reference identifier of the tracked page to capture.
path – Optional file path for the screenshot. If omitted, a default path inside the configured screenshot directory (or a temporary directory) is used.
full_page – When True, capture the entire scrollable page instead of just the visible viewport. Defaults to True.

Returns

A dictionary containing the ref_id, saved file path, and the full_page flag.

Raises

ValueError – If the ref_id is not tracked.
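
The fallback order for the output path can be sketched as follows. `resolve_screenshot_path` is a hypothetical helper, and the `<ref_id>.png` file name is an assumption; the documentation states only that the default lives in the configured screenshot directory or a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_screenshot_path(
    ref_id: str,
    path: Optional[str] = None,
    screenshot_dir: Optional[str] = None,
) -> Path:
    """Pick the explicit path, else the configured directory, else a temp directory."""
    if path is not None:
        return Path(path)  # an explicit path always wins
    if screenshot_dir is not None:
        base = Path(screenshot_dir)
    else:
        base = Path(tempfile.gettempdir())  # system temporary directory
    return base / f"{ref_id}.png"           # file naming is an assumption


explicit = resolve_screenshot_path("page_1", path="out/shot.png")
configured = resolve_screenshot_path("page_1", screenshot_dir="shots")
fallback = resolve_screenshot_path("page_1")
```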

class calute.operators.browser.BrowserPageState(ref_id: str, url: str, title: str = '', link_map: dict[int, str] = <factory>)

Bases: object

Tracked state for an opened browser page.

ref_id

Unique reference identifier used to address this page across operator tool calls.

Type: str

url

The last-known URL loaded by this page.

Type: str

title

The page title extracted after navigation.

Type: str

link_map

Mapping of numeric link IDs to their href values, populated after each page load or refresh.

Type: dict[int, str]
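
A dataclass with the same fields might look like the sketch below. `PageStateSketch` is a hypothetical stand-in, and using `dict` as the factory for link_map is an assumption inferred from the `<factory>` default in the signature.

```python
from dataclasses import dataclass, field


@dataclass
class PageStateSketch:
    """Stand-in mirroring the documented fields of BrowserPageState."""

    ref_id: str
    url: str
    title: str = ""
    # default_factory gives each instance its own dict (factory assumed to be dict)
    link_map: dict[int, str] = field(default_factory=dict)


first = PageStateSketch(ref_id="page_1", url="https://example.com")
second = PageStateSketch(ref_id="page_2", url="https://example.org", title="Example")
first.link_map[1] = "https://example.com/about"
```

Because the default comes from a factory, mutating `first.link_map` leaves `second.link_map` empty.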
