calute.llms.ollama

Ollama local LLM provider implementation.

This module provides integration with Ollama for running large language models locally. Ollama is an open-source tool for running LLMs on your own machine, supporting models like Llama, Mistral, CodeLlama, and many others.

The module handles:

- HTTP communication with the Ollama server via async httpx
- Both chat-style (/api/chat) and generate-style (/api/generate) endpoints
- Streaming response processing with callback support
- Automatic model metadata fetching (context length, parameters)
- Configurable timeout for long-running generations
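As a rough illustration of the two endpoint styles and of streaming with a callback, the sketch below builds request bodies in the shape the Ollama REST API documents for /api/generate and /api/chat, then joins the text carried by a newline-delimited JSON stream. The helper names (build_generate_payload, build_chat_payload, consume_stream) are hypothetical and are not part of calute; this page does not show how OllamaLLM implements these steps internally.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional


def build_generate_payload(model: str, prompt: str, temperature: float = 0.7,
                           max_tokens: int = 2048, stream: bool = True) -> dict:
    """Body for POST /api/generate, which takes a single prompt string."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": stream,
        # Ollama calls the output-token limit "num_predict".
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }


def build_chat_payload(model: str, messages: List[Dict[str, str]],
                       temperature: float = 0.7, max_tokens: int = 2048,
                       stream: bool = True) -> dict:
    """Body for POST /api/chat, which takes a list of role/content messages."""
    return {
        "model": model,
        "messages": messages,
        "stream": stream,
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }


def consume_stream(lines: Iterable[str],
                   on_chunk: Optional[Callable[[str], None]] = None) -> str:
    """Concatenate the text from a stream of one-JSON-object-per-line events."""
    parts = []
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        # /api/chat carries text in message.content, /api/generate in response.
        text = event.get("message", {}).get("content") or event.get("response", "")
        if text:
            parts.append(text)
            if on_chunk is not None:
                on_chunk(text)
        if event.get("done"):
            break
    return "".join(parts)
```

With httpx, the lines would typically come from response.aiter_lines() inside an AsyncClient.stream("POST", ...) context, in which case the loop becomes an async for; the synchronous form is used here only to keep the sketch short.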

Supported models include:

- llama2, llama3 (default: llama2)
- mistral, mixtral
- codellama
- phi, phi3
- Any model available in your local Ollama installation

Typical usage example:

    import asyncio

    from calute.llms.ollama import OllamaLLM
    from calute.llms.base import LLMConfig

    # Ensure Ollama is running locally (ollama serve)
    config = LLMConfig(
        model="llama3",
        base_url="http://localhost:11434",
        temperature=0.7,
        max_tokens=2048,
    )


    async def main():
        # "async with" is only valid inside a coroutine
        async with OllamaLLM(config) as llm:
            response = await llm.generate_completion("Explain recursion")
            content = llm.extract_content(response)
            print(content)


    asyncio.run(main())

Note

Requires the httpx package to be installed: pip install httpx

Also requires Ollama to be installed and running: https://ollama.ai
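To confirm that the server is reachable and to see which models are installed before constructing a client, one option is to query Ollama's GET /api/tags endpoint directly. The following is a minimal sketch using only the standard library; the function names (parse_model_names, list_local_models) are hypothetical and not part of calute, and the default address assumes a stock local installation.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional


def parse_model_names(tags_body: str) -> List[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_body)
    return [entry["name"] for entry in data.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> Optional[List[str]]:
    """Return installed model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: treat as "not running".
        return None
```

A None result suggests starting the server with ollama serve; an empty list means the server is up but no model has been pulled yet.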

class calute.llms.ollama.LocalLLM(config: calute.llms.base.LLMConfig | None = None, **kwargs)

Bases: OllamaLLM

Alias for OllamaLLM for backward compatibility.

LocalLLM subclasses OllamaLLM without changing its behaviour, so code written before the class was renamed to the more specific OllamaLLM keeps working.

All functionality is identical to OllamaLLM. New code should prefer using OllamaLLM directly.

Example

    # These are equivalent:
    llm1 = LocalLLM(model="llama3")
    llm2 = OllamaLLM(model="llama3")
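Because the entry above lists OllamaLLM as the base class, the alias is a subclass rather than a second name for the same class object. The sketch below uses simplified stand-in classes (not the real calute implementations) to show what that implies for type checks in calling code.

```python
class OllamaLLM:
    """Stand-in for the real class; it only records the model name."""

    def __init__(self, config=None, **kwargs):
        # "llama2" mirrors the default model named in the module description.
        self.model = kwargs.get("model", "llama2")


class LocalLLM(OllamaLLM):
    """Backward-compatible name that adds no behaviour of its own."""


legacy = LocalLLM(model="llama3")
current = OllamaLLM(model="llama3")

# Both objects behave the same and both pass isinstance checks against OllamaLLM.
same_model = legacy.model == current.model
legacy_is_ollama = isinstance(legacy, OllamaLLM)

# Exact type comparisons differ, so prefer isinstance over "type(x) is OllamaLLM".
exact_type_match = type(legacy) is OllamaLLM
```

Under these assumptions, code that checks isinstance(llm, OllamaLLM) accepts instances created under either name, while an exact type comparison would reject the legacy one.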
