Tool Calling in ollama-python
Ollama supports tool calling, a feature that allows Large Language Models (LLMs) to call external tools or functions. In this article, we will examine how functions are converted and passed to Ollama, and explore alternative methods for unsupported models.
Basic Usage
The following is a basic example of tool calling, lightly adapted from examples/tools.py.
from ollama import chat
MODEL = "qwen3:4b"
PROMPT = "There are 3 apples and 5 oranges. How many fruits are there in total?"
def add_two_numbers(a: int, b: int) -> int:
"""
Add two numbers
Args:
a (int): The first number
b (int): The second number
Returns:
int: The sum of the two numbers
"""
return int(a) + int(b)
print(PROMPT)
messages = [{"role": "user", "content": PROMPT}]
tools = [add_two_numbers]
tools_dict = {f.__name__: f for f in tools}
# Pass the function objects directly to the tools argument
response = chat(MODEL, messages=messages, tools=tools)
# If the model determines that a tool should be used
for tool_call in response.message.tool_calls or []:
name = tool_call.function.name
args = tool_call.function.arguments
print("Calling function:", name)
print("Arguments:", args)
# Execute the actual function
output = tools_dict[name](**args)
print("Output:", output)
There are 3 apples and 5 oranges. How many fruits are there in total?
Calling function: add_two_numbers
Arguments: {'a': 3, 'b': 5}
Output: 8
In this example, a simple addition function named add_two_numbers is defined and passed to the tools argument of the chat function. In response to the user's question, the model determines that add_two_numbers should be called with the arguments {"a": 3, "b": 5}, and returns that information in response.message.tool_calls.
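In a full exchange you would normally feed the tool's output back to the model so it can phrase a final answer. Below is a minimal sketch of that follow-up turn. The `role: "tool"` message shape follows ollama-python's examples/tools.py (recent versions use a `tool_name` field; older ones used `name`), and the final `chat()` call is commented out because it needs a running Ollama server:

```python
def add_two_numbers(a: int, b: int) -> int:
    return int(a) + int(b)

messages = [
    {"role": "user", "content": "There are 3 apples and 5 oranges. How many fruits are there in total?"}
]

# Suppose the first chat() response asked for this tool call
# (in real code, also append the assistant message carrying the tool_calls first):
name, args = "add_two_numbers", {"a": 3, "b": 5}
output = add_two_numbers(**args)

# Append the result as a `tool` message so the model can use it
messages.append({"role": "tool", "content": str(output), "tool_name": name})
print(messages[-1]["content"])  # 8

# from ollama import chat
# final = chat("qwen3:4b", messages=messages)  # the model now answers in prose
```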
Comparison
For calculations of this level, it is possible for the model itself to directly provide the answer.
from ollama import chat
MODEL = "qwen3:4b"
PROMPT = "There are 3 apples and 5 oranges. How many fruits are there in total?"
print(PROMPT)
messages = [{"role": "user", "content": PROMPT}]
response = chat(MODEL, messages=messages)
print(response.message.content)
There are 3 apples and 5 oranges. How many fruits are there in total?
Adding the apples (3) and oranges (5) together gives
**3 + 5 = 8** fruits.
Answer: **8**.
Tool calling is different in that it explicitly delegates the calculation to an external source. More complex calculations, data processing, and calling external APIs can be implemented using the same mechanism.
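The dispatch pattern shown above scales to any number of tools: register each callable in `tools_dict` and the loop stays the same. Here is a small sketch with a second, hypothetical `multiply_two_numbers` tool; the simulated calls stand in for what the model would return in `response.message.tool_calls`:

```python
def add_two_numbers(a: int, b: int) -> int:
    return int(a) + int(b)

def multiply_two_numbers(a: int, b: int) -> int:
    """A second, hypothetical tool; any callable works the same way."""
    return int(a) * int(b)

tools = [add_two_numbers, multiply_two_numbers]
tools_dict = {f.__name__: f for f in tools}

# Simulate tool calls the model might return (real code would read
# response.message.tool_calls after chat(MODEL, messages=messages, tools=tools))
simulated_calls = [
    {"name": "add_two_numbers", "arguments": {"a": 3, "b": 5}},
    {"name": "multiply_two_numbers", "arguments": {"a": 3, "b": 5}},
]
results = [tools_dict[c["name"]](**c["arguments"]) for c in simulated_calls]
print(results)  # [8, 15]
```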
Serialization Process of Tools (Functions) in the ollama-python Library
Python functions passed to the tools argument are automatically converted (serialized) inside the library into the JSON schema format required by the Ollama API.
The conversion from a function to a JSON schema is mainly performed in the following three steps:
1. **Structural analysis of the function**: Python's `inspect` module extracts information such as the function name, docstring, arguments, and type hints.
2. **JSON Schema generation**: a `pydantic.BaseModel` is dynamically generated from the extracted information, and its `.model_json_schema()` method is called to produce the JSON schema.
3. **Integration of information**: descriptions parsed from the docstring are merged into the generated schema, and the final `Tool` object is constructed.
Entry Point: ollama._client.Client.chat
The chat method passes the received tools argument to the _copy_tools helper function.
def chat(self, ..., tools, ...):
return self._request(
# ...
json=ChatRequest(
# ...
tools=list(_copy_tools(tools)),
# ...
).model_dump(exclude_none=True),
stream=stream,
)
Function Dispatching: ollama._client._copy_tools
The _copy_tools function iterates through each element in the tools list. If an element is callable (i.e., a function), it calls convert_function_to_tool to perform the conversion. In the case of a dictionary format, it validates it using Tool.model_validate.
def _copy_tools(tools: ...):
for unprocessed_tool in tools or []:
yield convert_function_to_tool(unprocessed_tool) if callable(unprocessed_tool) else Tool.model_validate(unprocessed_tool)
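As the dict branch of `_copy_tools` shows, a tool does not have to be a function: you can also pass a plain dictionary in the API's JSON-schema format, which `Tool.model_validate` then checks. A sketch of the equivalent dict form of `add_two_numbers` (the layout mirrors the serialized output shown later in this article):

```python
# Equivalent dict form of the add_two_numbers tool; _copy_tools validates
# dicts like this with Tool.model_validate instead of converting a function
add_tool = {
    "type": "function",
    "function": {
        "name": "add_two_numbers",
        "description": "Add two numbers",
        "parameters": {
            "type": "object",
            "required": ["a", "b"],
            "properties": {
                "a": {"type": "integer", "description": "The first number"},
                "b": {"type": "integer", "description": "The second number"},
            },
        },
    },
}

# Functions and dicts can be mixed in the same tools list:
# response = chat(MODEL, messages=messages, tools=[add_two_numbers, add_tool])
```

This is useful when a tool schema is loaded from a file or defined by hand rather than derived from a Python function.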
The Core of Serialization: ollama._utils.convert_function_to_tool
convert_function_to_tool leverages the powerful features of Pydantic to generate the parameters object (JSON Schema) that defines the function's arguments.
The following is an excerpt highlighting only the essential points for explanation, with details omitted using ....
def convert_function_to_tool(func: Callable) -> Tool:
# Analyze the docstring and extract the description
parsed_docstring = _parse_docstring(inspect.getdoc(func))
# Dynamically generate a Pydantic model from the function signature and get the JSON Schema
schema = type(...).model_json_schema()
# Remove Optional(T | None) etc. from required and fill in description/type
for name, prop in schema.get('properties', {}).items():
...
schema['properties'][name] = {'description': parsed_docstring[name], 'type': ...}
# Repack into Tool(function) format (parameters unpacks the schema as is)
tool = Tool(... parameters=Tool.Function.Parameters(**schema), ...)
return Tool.model_validate(tool)
The process is as follows:
1. **Dynamic class generation**: using `type()`, a one-off `pydantic.BaseModel` subclass is created from the function's signature (arguments and type hints) and docstring.
   - `__annotations__`: the most important attribute, used by Pydantic to define the model's fields (= the function arguments) and their types.
   - `__doc__`: the class docstring becomes the top-level `description` in the generated JSON schema.
2. **Calling `model_json_schema()`**: `.model_json_schema()` is called on the dynamically generated class, producing a complete schema dictionary conforming to the JSON Schema specification in one step. This dictionary contains all the information needed for `parameters`, such as `type`, `properties`, and `required`.
3. **Applying to `parameters`**: finally, when the `Tool` object is constructed, the schema dictionary generated in step 2 is unpacked (e.g., `Tool.Function.Parameters(**schema)`) and set as the `parameters` field.
In other words, model_json_schema() does not directly generate the parameters object itself, but rather plays the role of generating the complete schema dictionary that serves as its source.
This implementation is a prime example of the powerful metaprogramming capabilities of dynamic languages like Python. Users of ollama-python can simply define Python functions as they normally would, and the library automatically handles the conversion to the format required by the API behind the scenes.
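The dynamic-class trick can be reproduced in a few lines. The sketch below is a simplified imitation of the library's approach, not its actual code (the real implementation also merges per-argument docstring descriptions into the schema):

```python
import inspect

import pydantic

def add_two_numbers(a: int, b: int) -> int:
    """Add two numbers"""
    return int(a) + int(b)

# Step 1: build a one-off BaseModel subclass with type(), mapping each
# function argument to a model field via __annotations__
annotations = {
    name: param.annotation
    for name, param in inspect.signature(add_two_numbers).parameters.items()
}
Model = type(
    add_two_numbers.__name__,
    (pydantic.BaseModel,),
    {"__annotations__": annotations, "__doc__": inspect.getdoc(add_two_numbers)},
)

# Step 2: one call yields the complete JSON Schema dictionary
schema = Model.model_json_schema()
print(schema["required"])     # ['a', 'b']
print(schema["description"])  # Add two numbers
```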
Conversion Example
Let's convert the add_two_numbers function we used earlier using convert_function_to_tool and output the result in JSON.
from ollama._utils import convert_function_to_tool
def add_two_numbers(a: int, b: int) -> int:
"""
Add two numbers
Args:
a (int): The first number
b (int): The second number
Returns:
int: The sum of the two numbers
"""
return int(a) + int(b)
converted_tool = convert_function_to_tool(add_two_numbers)
print(converted_tool.model_dump_json(indent=2))
{
"type": "function",
"function": {
"name": "add_two_numbers",
"description": "Add two numbers",
"parameters": {
"type": "object",
"defs": null,
"items": null,
"required": [
"a",
"b"
],
"properties": {
"a": {
"type": "integer",
"items": null,
"description": "The first number",
"enum": null
},
"b": {
"type": "integer",
"items": null,
"description": "The second number",
"enum": null
}
}
}
}
}
You can see that the function name, docstring, arguments, and type hints have been correctly converted into a JSON schema.
Building the API Request
The entire ChatRequest object is converted into a Python dictionary via the .model_dump() method. During this process, .model_dump() is also called recursively on any nested Tool objects. Ultimately, this large dictionary is serialized into a JSON string by the httpx library and sent as the request body to the Ollama server.
ChatRequest.model_dump() returns a Python dictionary similar to the one shown below. Note that the value of the tools field is a list of dictionaries (list[dict]).
{
"model": "qwen3:4b",
"messages": [...],
"tools": [
{
"type": "function",
"function": {
"name": "add_two_numbers",
"description": "Add two numbers",
"parameters": { ... }
}
}
],
...
}
In this manner, although JSON is what's actually being exchanged with the Ollama server, the ollama-python library handles the conversion automatically, allowing you to use it without needing to worry about the underlying details.
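The recursive dumping behavior is plain Pydantic and can be seen in miniature with a toy nested model (an illustration only; `ToyTool` and `ToyRequest` are stand-ins, not the actual ChatRequest classes):

```python
from typing import Optional

import pydantic

class ToyTool(pydantic.BaseModel):
    name: str
    description: Optional[str] = None

class ToyRequest(pydantic.BaseModel):
    model: str
    tools: list[ToyTool]

req = ToyRequest(model="qwen3:4b", tools=[ToyTool(name="add_two_numbers")])

# model_dump() recurses into nested models; exclude_none drops unset
# fields, which keeps the request body compact
dumped = req.model_dump(exclude_none=True)
print(dumped)  # {'model': 'qwen3:4b', 'tools': [{'name': 'add_two_numbers'}]}
```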
Substitution by Code Generation
Tool calling can only be used with models that support it. For unsupported models, however, it can be approximated by having the model generate code (or function-call information) instead.
From a security perspective, it is safer to have the model return only the function name and arguments in JSON, rather than having it generate Python code.
[{"name": "...", "arguments": {...}}]
This format is very similar to the tool_calls returned by actual tool calling, making it easy to migrate to real tool calling later.
import re
import json
from ollama import chat
MODEL = "gemma3:4b"
def add_two_numbers(a: int, b: int) -> int:
return int(a) + int(b)
tools_dict = {"add_two_numbers": add_two_numbers}
SYSTEM = """
The following tools are available:
- name: add_two_numbers, arguments: {"a": int, "b": int}
You must output only the following JSON (no additional text):
[{"name": "...", "arguments": {...}}, ...]
""".strip()
PROMPT = "There are 3 apples and 5 oranges. How many fruits are there in total?"
print(PROMPT)
messages = [
{"role": "system", "content": SYSTEM},
{"role": "user", "content": PROMPT},
]
response = chat(MODEL, messages=messages)
content = response.message.content
if m := re.match(r"```json\n(.*?)\n```", content, re.DOTALL):
content = m.group(1)
print("Response:", content)
for tool_call in json.loads(content):
name = tool_call["name"]
args = tool_call["arguments"]
print("Calling function:", name)
print("Arguments:", args)
# Execute the actual function
output = tools_dict[name](**args)
print("Output:", output)
There are 3 apples and 5 oranges. How many fruits are there in total?
Response: [{"name": "add_two_numbers", "arguments": {"a": 3, "b": 5}}]
Calling function: add_two_numbers
Arguments: {'a': 3, 'b': 5}
Output: 8
This code mimics tool calling with an ordinary chat request, so there is no complex machinery behind the scenes, which makes it a good way to understand the concept.
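Note that the `re.match` above only finds a fence at the very start of the reply. If the model adds leading text or skips the fence entirely, slightly more defensive extraction helps; `extract_tool_calls` below is a hypothetical helper sketched for that purpose, not part of the article's code:

```python
import json
import re

def extract_tool_calls(content: str) -> list:
    """Pull the first JSON array out of a model reply, fenced or not."""
    # Prefer a fenced ```json block anywhere in the text, not just at the start
    if m := re.search(r"```(?:json)?\s*\n(.*?)\n```", content, re.DOTALL):
        content = m.group(1)
    # Otherwise fall back to the first [...] span
    elif m := re.search(r"\[.*\]", content, re.DOTALL):
        content = m.group(0)
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return []

calls = extract_tool_calls(
    'Sure! ```json\n[{"name": "add_two_numbers", "arguments": {"a": 3, "b": 5}}]\n```'
)
print(calls)  # [{'name': 'add_two_numbers', 'arguments': {'a': 3, 'b': 5}}]
```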
Substitution by Structured Output
The example above simply instructs the model to return JSON, so the model might include unnecessary text or slightly change key names, causing parsing failures. In such cases, using structured output (output constraints via JSON Schema) can improve the stability of the output.
Ollama allows the use of structured output by passing a Pydantic type definition to the format parameter.
from typing import Any
from pydantic import BaseModel
from ollama import chat
MODEL = "gemma3:4b"
def add_two_numbers(a: int, b: int) -> int:
return int(a) + int(b)
tools_dict = {"add_two_numbers": add_two_numbers}
class ToolCall(BaseModel):
name: str
arguments: dict[str, Any]
class ToolCalls(BaseModel):
calls: list[ToolCall]
format = ToolCalls.model_json_schema()
SYSTEM = """
The following tools are available:
- name: add_two_numbers, arguments: {"a": int, "b": int}
""".strip()
PROMPT = "There are 3 apples and 5 oranges. How many fruits are there in total?"
print(PROMPT)
messages = [
{"role": "system", "content": SYSTEM},
{"role": "user", "content": PROMPT},
]
# Constrain the output by passing a JSON Schema to format
response = chat(MODEL, messages=messages, format=format)
content = response.message.content.strip()
print("Response:", content)
tool_calls = ToolCalls.model_validate_json(content)
for tool_call in tool_calls.calls:
name = tool_call.name
args = tool_call.arguments
print("Calling function:", name)
print("Arguments:", args)
# Execute the actual function
output = tools_dict[name](**args)
print("Output:", output)
There are 3 apples and 5 oranges. How many fruits are there in total?
Response: [{"name": "add_two_numbers", "arguments": {"a": 3, "b": 5}}]
Calling function: add_two_numbers
Arguments: {'a': 3, 'b': 5}
Output: 8
Tool calling and structured output are alike in that both ultimately return JSON, so here we mimic the tool calling mechanism by instructing the model to return JSON with equivalent content.
Summary
Tool calling in ollama-python works by analyzing Python functions with inspect, converting them into a JSON Schema via Pydantic, and then passing that to Ollama.
Even for models that do not support tool calling, substitution is possible by having the model return a JSON object consisting of the "function name + arguments."
Related Articles
I investigated how Ollama implements structured output.
MCP can be described as a more organized version of tool calling; this article compares them. (Function calling is another name for tool calling and refers to the same thing.)
References
While we mimicked tool calling with structured output here, the reverse pattern (mimicking structured output with tool calling) also exists.
MCP can sometimes lead to issues with bloated context consumption, and code generation may be more efficient in some cases.