Local Models vs Cloud: A Tool-Calling Reality Check
February 2, 2026
After getting OpenClaw running in a VM (see Part 1), I set out to use local LLMs via Ollama. What followed was an educational journey through the current limitations of local model tool-calling.
The Memory Search API Key Problem
The first issue: OpenClaw's memory_search tool requires embeddings, which by default need an OpenAI API key. I tried several approaches:
Option 1: Disable the memory plugin entirely
```json
{
  "plugins": {
    "slots": {
      "memory": "none"
    }
  }
}
```
Option 2: Use local embeddings
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local"
      }
    }
  }
}
```
I also tried configuring Ollama for embeddings, but memorySearch.provider only accepts "openai" or "local", not custom endpoints.
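For reference, Ollama does expose its own embeddings endpoint (POST /api/embeddings), so if OpenClaw ever accepted custom endpoints this would be the target. A minimal sketch, assuming a locally pulled nomic-embed-text model and the default Ollama port (the model name and host are my assumptions, not OpenClaw configuration):

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default port

def build_embedding_request(text, model="nomic-embed-text"):
    """Build the JSON payload for Ollama's /api/embeddings endpoint."""
    return {"model": model, "prompt": text}

def embed(text, model="nomic-embed-text"):
    """POST the payload to a locally running Ollama instance and
    return the embedding vector. Requires a running server and
    `ollama pull nomic-embed-text`; this is a sketch, not how
    OpenClaw actually wires up memory search."""
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/embeddings",
        data=json.dumps(build_embedding_request(text, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]
```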
Model Tool-Calling Issues
With memory sorted, I tested multiple local models. All had issues with OpenClaw's complex tool schemas:
| Model | Result |
|---|---|
| qwen2.5:32b | Called tools but with wrong parameters |
| qwen3:30b-a3b | Reasoning model, tool format issues |
| gemma3:27b-it-qat | "does not support tools" error |
| qwen2.5-coder:14b | Still had parameter issues |
The Gemma3 error was explicit:
"errorMessage": "400 registry.ollama.ai/library/gemma3:27b-it-qat does not support tools"
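Rather than waiting for this 400, you can check up front: recent Ollama releases report a capabilities list in the POST /api/show response. A hedged sketch of the check (the capabilities field exists in newer Ollama versions; the example responses below are illustrative, not captured output):

```python
def supports_tools(show_response: dict) -> bool:
    """Given a parsed response from Ollama's POST /api/show,
    report whether the model advertises tool calling.

    Newer Ollama releases include a top-level "capabilities" list;
    if it's absent, conservatively assume no tool support."""
    return "tools" in show_response.get("capabilities", [])

# Illustrative response shapes:
print(supports_tools({"capabilities": ["completion", "tools"]}))   # True
print(supports_tools({"capabilities": ["completion", "vision"]}))  # False
```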
The Qwen models had a subtler problem: they would recognize they needed to call
a tool but send empty {} for complex nested parameters like cron job
specifications. They'd get stuck in loops trying repeatedly with the same empty parameters.
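One mitigation I'd consider is detecting that stuck-loop signature and bailing out instead of burning tokens. This is a hypothetical guard sketched outside OpenClaw; the function name and repeat threshold are my own, not anything OpenClaw ships:

```python
def should_abort(call_history, max_repeats=3):
    """Return True once the tail of call_history contains max_repeats
    identical (tool, args) pairs with empty args -- the stuck-loop
    signature the Qwen models exhibited."""
    if len(call_history) < max_repeats:
        return False
    tail = call_history[-max_repeats:]
    first = tail[0]
    return all(c == first and c[1] == {} for c in tail)

# Three identical empty calls trip the guard; real arguments don't.
print(should_abort([("cron_add", {})] * 3))                          # True
print(should_abort([("cron_add", {"schedule": "0 * * * *"})] * 3))   # False
```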
Trying Cloud Models
I switched to OpenCode Zen as the primary provider:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "opencode/kimi-k2.5",
        "fallbacks": ["ollama/qwen2.5-coder:14b"]
      }
    }
  }
}
```
Rate Limiting with Free Tier
The free tier (opencode/kimi-k2.5-free) hit rate limits quickly:
"errorMessage": "429 Request didn't generate first token before the given deadline, the service is overloaded"
Switching to the paid tier opencode/kimi-k2.5 resolved this.
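If you stay on the free tier, retrying with exponential backoff plus jitter softens the 429s; it won't fix a genuinely overloaded service, but it avoids hammering it. A sketch of the delay schedule only, with parameters I chose myself; wiring it into OpenClaw's provider layer is left out:

```python
import random

def backoff_delays(retries=5, base=1.0, cap=30.0):
    """Yield exponentially growing delays with full jitter,
    capped at `cap` seconds: a standard schedule for HTTP 429."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

# A caller would time.sleep(delay) between retries on each 429.
for delay in backoff_delays():
    print(f"waiting up to {delay:.1f}s before retrying")
```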
Kimi Still Struggled with Complex Tools
Even the cloud Kimi model had issues with complex tool schemas. When trying to add cron jobs, it repeatedly sent empty job objects despite its "thinking" showing it knew what parameters were needed.
The Solution: Claude Haiku
I finally switched to opencode/claude-haiku-4-5 for reliable tool calling:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "opencode/claude-haiku-4-5",
        "fallbacks": ["ollama/qwen2.5-coder:14b"]
      }
    }
  }
}
```
This worked reliably. The local Ollama model serves as a fallback for simpler tasks or when the cloud is unavailable.
Lessons Learned
- Local model tool-calling is immature. Even capable models like Qwen 2.5 32B struggle with complex nested tool schemas. Simple chat works; complex agentic workflows don't.
- Ollama tool support varies by model. Some models (Gemma3) explicitly don't support tools. Check before assuming.
- Memory search requires embeddings. Either disable it, use local embeddings, or accept the OpenAI API key requirement.
- Claude models handle tools reliably. When tool-calling reliability matters, Claude (even Haiku) is a safer choice than experimental alternatives.
Final Model Configuration
| Component | Value |
|---|---|
| Primary Model | opencode/claude-haiku-4-5 |
| Fallback Model | ollama/qwen2.5-coder:14b |
| Memory Search | Disabled |
Part 2: Local Models vs Cloud: A Tool-Calling Reality Check (this post)
Part 3: Running OpenClaw: Security, Automation & Maintenance